// SPDX-License-Identifier: GPL-2.0-only
/*
 * NFC Digital Protocol stack
 * Copyright (c) 2013, Intel Corporation.
*/ #define pr_fmt(fmt) "digital: %s: " fmt, __func__ #include <linux/module.h> #include "digital.h" #define DIGITAL_PROTO_NFCA_RF_TECH \ (NFC_PROTO_JEWEL_MASK | NFC_PROTO_MIFARE_MASK | \ NFC_PROTO_NFC_DEP_MASK | NFC_PROTO_ISO14443_MASK) #define DIGITAL_PROTO_NFCB_RF_TECH NFC_PROTO_ISO14443_B_MASK #define DIGITAL_PROTO_NFCF_RF_TECH \ (NFC_PROTO_FELICA_MASK | NFC_PROTO_NFC_DEP_MASK) #define DIGITAL_PROTO_ISO15693_RF_TECH NFC_PROTO_ISO15693_MASK /* Delay between each poll frame (ms) */ #define DIGITAL_POLL_INTERVAL 10 struct digital_cmd { struct list_head queue; u8 type; u8 pending; u16 timeout; struct sk_buff *req; struct sk_buff *resp; struct digital_tg_mdaa_params *mdaa_params; nfc_digital_cmd_complete_t cmd_cb; void *cb_context; }; struct sk_buff *digital_skb_alloc(struct nfc_digital_dev *ddev, unsigned int len) { struct sk_buff *skb; skb = alloc_skb(len + ddev->tx_headroom + ddev->tx_tailroom, GFP_KERNEL); if (skb) skb_reserve(skb, ddev->tx_headroom); return skb; } void digital_skb_add_crc(struct sk_buff *skb, crc_func_t crc_func, u16 init, u8 bitwise_inv, u8 msb_first) { u16 crc; crc = crc_func(init, skb->data, skb->len); if (bitwise_inv) crc = ~crc; if (msb_first) crc = __fswab16(crc); skb_put_u8(skb, crc & 0xFF); skb_put_u8(skb, (crc >> 8) & 0xFF); } int digital_skb_check_crc(struct sk_buff *skb, crc_func_t crc_func, u16 crc_init, u8 bitwise_inv, u8 msb_first) { int rc; u16 crc; if (skb->len <= 2) return -EIO; crc = crc_func(crc_init, skb->data, skb->len - 2); if (bitwise_inv) crc = ~crc; if (msb_first) crc = __swab16(crc); rc = (skb->data[skb->len - 2] - (crc & 0xFF)) + (skb->data[skb->len - 1] - ((crc >> 8) & 0xFF)); if (rc) return -EIO; skb_trim(skb, skb->len - 2); return 0; } static inline void digital_switch_rf(struct nfc_digital_dev *ddev, bool on) { ddev->ops->switch_rf(ddev, on); } static inline void digital_abort_cmd(struct nfc_digital_dev *ddev) { ddev->ops->abort_cmd(ddev); } static void digital_wq_cmd_complete(struct work_struct *work) { struct digital_cmd *cmd; struct nfc_digital_dev *ddev = container_of(work, struct nfc_digital_dev, cmd_complete_work); mutex_lock(&ddev->cmd_lock); cmd = list_first_entry_or_null(&ddev->cmd_queue, struct digital_cmd, queue); if (!cmd) { mutex_unlock(&ddev->cmd_lock); return; } list_del(&cmd->queue); mutex_unlock(&ddev->cmd_lock); if (!IS_ERR(cmd->resp)) print_hex_dump_debug("DIGITAL RX: ", DUMP_PREFIX_NONE, 16, 1, cmd->resp->data, cmd->resp->len, false); cmd->cmd_cb(ddev, cmd->cb_context, cmd->resp); kfree(cmd->mdaa_params); kfree(cmd); schedule_work(&ddev->cmd_work); } static void digital_send_cmd_complete(struct nfc_digital_dev *ddev, void *arg, struct sk_buff *resp) { struct digital_cmd *cmd = arg; cmd->resp = resp; schedule_work(&ddev->cmd_complete_work); } static void digital_wq_cmd(struct work_struct *work) { int rc; struct digital_cmd *cmd; struct digital_tg_mdaa_params *params; struct nfc_digital_dev *ddev = container_of(work, struct nfc_digital_dev, cmd_work); mutex_lock(&ddev->cmd_lock); cmd = list_first_entry_or_null(&ddev->cmd_queue, struct digital_cmd, queue); if (!cmd || cmd->pending) { mutex_unlock(&ddev->cmd_lock); return; } cmd->pending = 1; mutex_unlock(&ddev->cmd_lock); if (cmd->req) print_hex_dump_debug("DIGITAL TX: ", DUMP_PREFIX_NONE, 16, 1, cmd->req->data, cmd->req->len, false); switch (cmd->type) { case DIGITAL_CMD_IN_SEND: rc = ddev->ops->in_send_cmd(ddev, cmd->req, cmd->timeout, digital_send_cmd_complete, cmd); break; case DIGITAL_CMD_TG_SEND: rc = ddev->ops->tg_send_cmd(ddev, cmd->req, cmd->timeout, 
digital_send_cmd_complete, cmd); break; case DIGITAL_CMD_TG_LISTEN: rc = ddev->ops->tg_listen(ddev, cmd->timeout, digital_send_cmd_complete, cmd); break; case DIGITAL_CMD_TG_LISTEN_MDAA: params = cmd->mdaa_params; rc = ddev->ops->tg_listen_mdaa(ddev, params, cmd->timeout, digital_send_cmd_complete, cmd); break; case DIGITAL_CMD_TG_LISTEN_MD: rc = ddev->ops->tg_listen_md(ddev, cmd->timeout, digital_send_cmd_complete, cmd); break; default: pr_err("Unknown cmd type %d\n", cmd->type); return; } if (!rc) return; pr_err("in_send_command returned err %d\n", rc); mutex_lock(&ddev->cmd_lock); list_del(&cmd->queue); mutex_unlock(&ddev->cmd_lock); kfree_skb(cmd->req); kfree(cmd->mdaa_params); kfree(cmd); schedule_work(&ddev->cmd_work); } int digital_send_cmd(struct nfc_digital_dev *ddev, u8 cmd_type, struct sk_buff *skb, struct digital_tg_mdaa_params *params, u16 timeout, nfc_digital_cmd_complete_t cmd_cb, void *cb_context) { struct digital_cmd *cmd; cmd = kzalloc(sizeof(*cmd), GFP_KERNEL); if (!cmd) return -ENOMEM; cmd->type = cmd_type; cmd->timeout = timeout; cmd->req = skb; cmd->mdaa_params = params; cmd->cmd_cb = cmd_cb; cmd->cb_context = cb_context; INIT_LIST_HEAD(&cmd->queue); mutex_lock(&ddev->cmd_lock); list_add_tail(&cmd->queue, &ddev->cmd_queue); mutex_unlock(&ddev->cmd_lock); schedule_work(&ddev->cmd_work); return 0; } int digital_in_configure_hw(struct nfc_digital_dev *ddev, int type, int param) { int rc; rc = ddev->ops->in_configure_hw(ddev, type, param); if (rc) pr_err("in_configure_hw failed: %d\n", rc); return rc; } int digital_tg_configure_hw(struct nfc_digital_dev *ddev, int type, int param) { int rc; rc = ddev->ops->tg_configure_hw(ddev, type, param); if (rc) pr_err("tg_configure_hw failed: %d\n", rc); return rc; } static int digital_tg_listen_mdaa(struct nfc_digital_dev *ddev, u8 rf_tech) { struct digital_tg_mdaa_params *params; int rc; params = kzalloc(sizeof(*params), GFP_KERNEL); if (!params) return -ENOMEM; params->sens_res = DIGITAL_SENS_RES_NFC_DEP; get_random_bytes(params->nfcid1, sizeof(params->nfcid1)); params->sel_res = DIGITAL_SEL_RES_NFC_DEP; params->nfcid2[0] = DIGITAL_SENSF_NFCID2_NFC_DEP_B1; params->nfcid2[1] = DIGITAL_SENSF_NFCID2_NFC_DEP_B2; get_random_bytes(params->nfcid2 + 2, NFC_NFCID2_MAXSIZE - 2); params->sc = DIGITAL_SENSF_FELICA_SC; rc = digital_send_cmd(ddev, DIGITAL_CMD_TG_LISTEN_MDAA, NULL, params, 500, digital_tg_recv_atr_req, NULL); if (rc) kfree(params); return rc; } static int digital_tg_listen_md(struct nfc_digital_dev *ddev, u8 rf_tech) { return digital_send_cmd(ddev, DIGITAL_CMD_TG_LISTEN_MD, NULL, NULL, 500, digital_tg_recv_md_req, NULL); } int digital_target_found(struct nfc_digital_dev *ddev, struct nfc_target *target, u8 protocol) { int rc; u8 framing; u8 rf_tech; u8 poll_tech_count; int (*check_crc)(struct sk_buff *skb); void (*add_crc)(struct sk_buff *skb); rf_tech = ddev->poll_techs[ddev->poll_tech_index].rf_tech; switch (protocol) { case NFC_PROTO_JEWEL: framing = NFC_DIGITAL_FRAMING_NFCA_T1T; check_crc = digital_skb_check_crc_b; add_crc = digital_skb_add_crc_b; break; case NFC_PROTO_MIFARE: framing = NFC_DIGITAL_FRAMING_NFCA_T2T; check_crc = digital_skb_check_crc_a; add_crc = digital_skb_add_crc_a; break; case NFC_PROTO_FELICA: framing = NFC_DIGITAL_FRAMING_NFCF_T3T; check_crc = digital_skb_check_crc_f; add_crc = digital_skb_add_crc_f; break; case NFC_PROTO_NFC_DEP: if (rf_tech == NFC_DIGITAL_RF_TECH_106A) { framing = NFC_DIGITAL_FRAMING_NFCA_NFC_DEP; check_crc = digital_skb_check_crc_a; add_crc = digital_skb_add_crc_a; } else { framing = 
NFC_DIGITAL_FRAMING_NFCF_NFC_DEP; check_crc = digital_skb_check_crc_f; add_crc = digital_skb_add_crc_f; } break; case NFC_PROTO_ISO15693: framing = NFC_DIGITAL_FRAMING_ISO15693_T5T; check_crc = digital_skb_check_crc_b; add_crc = digital_skb_add_crc_b; break; case NFC_PROTO_ISO14443: framing = NFC_DIGITAL_FRAMING_NFCA_T4T; check_crc = digital_skb_check_crc_a; add_crc = digital_skb_add_crc_a; break; case NFC_PROTO_ISO14443_B: framing = NFC_DIGITAL_FRAMING_NFCB_T4T; check_crc = digital_skb_check_crc_b; add_crc = digital_skb_add_crc_b; break; default: pr_err("Invalid protocol %d\n", protocol); return -EINVAL; } pr_debug("rf_tech=%d, protocol=%d\n", rf_tech, protocol); ddev->curr_rf_tech = rf_tech; if (DIGITAL_DRV_CAPS_IN_CRC(ddev)) { ddev->skb_add_crc = digital_skb_add_crc_none; ddev->skb_check_crc = digital_skb_check_crc_none; } else { ddev->skb_add_crc = add_crc; ddev->skb_check_crc = check_crc; } rc = digital_in_configure_hw(ddev, NFC_DIGITAL_CONFIG_FRAMING, framing); if (rc) return rc; target->supported_protocols = (1 << protocol); poll_tech_count = ddev->poll_tech_count; ddev->poll_tech_count = 0; rc = nfc_targets_found(ddev->nfc_dev, target, 1); if (rc) { ddev->poll_tech_count = poll_tech_count; return rc; } return 0; } void digital_poll_next_tech(struct nfc_digital_dev *ddev) { u8 rand_mod; digital_switch_rf(ddev, 0); mutex_lock(&ddev->poll_lock); if (!ddev->poll_tech_count) { mutex_unlock(&ddev->poll_lock); return; } get_random_bytes(&rand_mod, sizeof(rand_mod)); ddev->poll_tech_index = rand_mod % ddev->poll_tech_count; mutex_unlock(&ddev->poll_lock); schedule_delayed_work(&ddev->poll_work, msecs_to_jiffies(DIGITAL_POLL_INTERVAL)); } static void digital_wq_poll(struct work_struct *work) { int rc; struct digital_poll_tech *poll_tech; struct nfc_digital_dev *ddev = container_of(work, struct nfc_digital_dev, poll_work.work); mutex_lock(&ddev->poll_lock); if (!ddev->poll_tech_count) { mutex_unlock(&ddev->poll_lock); return; } poll_tech = &ddev->poll_techs[ddev->poll_tech_index]; mutex_unlock(&ddev->poll_lock); rc = poll_tech->poll_func(ddev, poll_tech->rf_tech); if (rc) digital_poll_next_tech(ddev); } static void digital_add_poll_tech(struct nfc_digital_dev *ddev, u8 rf_tech, digital_poll_t poll_func) { struct digital_poll_tech *poll_tech; if (ddev->poll_tech_count >= NFC_DIGITAL_POLL_MODE_COUNT_MAX) return; poll_tech = &ddev->poll_techs[ddev->poll_tech_count++]; poll_tech->rf_tech = rf_tech; poll_tech->poll_func = poll_func; } /** * digital_start_poll - start_poll operation * @nfc_dev: device to be polled * @im_protocols: bitset of nfc initiator protocols to be used for polling * @tm_protocols: bitset of nfc target protocols to be used for polling * * For every supported protocol, the corresponding polling function is added * to the table of polling technologies (ddev->poll_techs[]) using * digital_add_poll_tech(). * When a polling function fails (by timeout or protocol error) the next one is * scheduled by digital_poll_next_tech() on the poll workqueue (ddev->poll_work).
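 * digital_poll_next_tech() switches the RF field off, picks the next
 * technology index at random (modulo the number of registered technologies)
 * and re-schedules ddev->poll_work after DIGITAL_POLL_INTERVAL (10 ms).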
*/ static int digital_start_poll(struct nfc_dev *nfc_dev, __u32 im_protocols, __u32 tm_protocols) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); u32 matching_im_protocols, matching_tm_protocols; pr_debug("protocols: im 0x%x, tm 0x%x, supported 0x%x\n", im_protocols, tm_protocols, ddev->protocols); matching_im_protocols = ddev->protocols & im_protocols; matching_tm_protocols = ddev->protocols & tm_protocols; if (!matching_im_protocols && !matching_tm_protocols) { pr_err("Unknown protocol\n"); return -EINVAL; } if (ddev->poll_tech_count) { pr_err("Already polling\n"); return -EBUSY; } if (ddev->curr_protocol) { pr_err("A target is already active\n"); return -EBUSY; } ddev->poll_tech_count = 0; ddev->poll_tech_index = 0; if (matching_im_protocols & DIGITAL_PROTO_NFCA_RF_TECH) digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_106A, digital_in_send_sens_req); if (matching_im_protocols & DIGITAL_PROTO_NFCB_RF_TECH) digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_106B, digital_in_send_sensb_req); if (matching_im_protocols & DIGITAL_PROTO_NFCF_RF_TECH) { digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_212F, digital_in_send_sensf_req); digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_424F, digital_in_send_sensf_req); } if (matching_im_protocols & DIGITAL_PROTO_ISO15693_RF_TECH) digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_ISO15693, digital_in_send_iso15693_inv_req); if (matching_tm_protocols & NFC_PROTO_NFC_DEP_MASK) { if (ddev->ops->tg_listen_mdaa) { digital_add_poll_tech(ddev, 0, digital_tg_listen_mdaa); } else if (ddev->ops->tg_listen_md) { digital_add_poll_tech(ddev, 0, digital_tg_listen_md); } else { digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_106A, digital_tg_listen_nfca); digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_212F, digital_tg_listen_nfcf); digital_add_poll_tech(ddev, NFC_DIGITAL_RF_TECH_424F, digital_tg_listen_nfcf); } } if (!ddev->poll_tech_count) { pr_err("Unsupported protocols: im=0x%x, tm=0x%x\n", matching_im_protocols, matching_tm_protocols); return -EINVAL; } schedule_delayed_work(&ddev->poll_work, 0); return 0; } static void digital_stop_poll(struct nfc_dev *nfc_dev) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); mutex_lock(&ddev->poll_lock); if (!ddev->poll_tech_count) { pr_err("Polling operation was not running\n"); mutex_unlock(&ddev->poll_lock); return; } ddev->poll_tech_count = 0; mutex_unlock(&ddev->poll_lock); cancel_delayed_work_sync(&ddev->poll_work); digital_abort_cmd(ddev); } static int digital_dev_up(struct nfc_dev *nfc_dev) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); digital_switch_rf(ddev, 1); return 0; } static int digital_dev_down(struct nfc_dev *nfc_dev) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); digital_switch_rf(ddev, 0); return 0; } static int digital_dep_link_up(struct nfc_dev *nfc_dev, struct nfc_target *target, __u8 comm_mode, __u8 *gb, size_t gb_len) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); int rc; rc = digital_in_send_atr_req(ddev, target, comm_mode, gb, gb_len); if (!rc) ddev->curr_protocol = NFC_PROTO_NFC_DEP; return rc; } static int digital_dep_link_down(struct nfc_dev *nfc_dev) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); digital_abort_cmd(ddev); ddev->curr_protocol = 0; return 0; } static int digital_activate_target(struct nfc_dev *nfc_dev, struct nfc_target *target, __u32 protocol) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); if (ddev->poll_tech_count) { pr_err("Can't activate a target while polling\n"); return -EBUSY; } if 
(ddev->curr_protocol) { pr_err("A target is already active\n"); return -EBUSY; } ddev->curr_protocol = protocol; return 0; } static void digital_deactivate_target(struct nfc_dev *nfc_dev, struct nfc_target *target, u8 mode) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); if (!ddev->curr_protocol) { pr_err("No active target\n"); return; } digital_abort_cmd(ddev); ddev->curr_protocol = 0; } static int digital_tg_send(struct nfc_dev *dev, struct sk_buff *skb) { struct nfc_digital_dev *ddev = nfc_get_drvdata(dev); return digital_tg_send_dep_res(ddev, skb); } static void digital_in_send_complete(struct nfc_digital_dev *ddev, void *arg, struct sk_buff *resp) { struct digital_data_exch *data_exch = arg; int rc; if (IS_ERR(resp)) { rc = PTR_ERR(resp); resp = NULL; goto done; } if (ddev->curr_protocol == NFC_PROTO_MIFARE) { rc = digital_in_recv_mifare_res(resp); /* crc check is done in digital_in_recv_mifare_res() */ goto done; } if ((ddev->curr_protocol == NFC_PROTO_ISO14443) || (ddev->curr_protocol == NFC_PROTO_ISO14443_B)) { rc = digital_in_iso_dep_pull_sod(ddev, resp); if (rc) goto done; } rc = ddev->skb_check_crc(resp); done: if (rc) { kfree_skb(resp); resp = NULL; } data_exch->cb(data_exch->cb_context, resp, rc); kfree(data_exch); } static int digital_in_send(struct nfc_dev *nfc_dev, struct nfc_target *target, struct sk_buff *skb, data_exchange_cb_t cb, void *cb_context) { struct nfc_digital_dev *ddev = nfc_get_drvdata(nfc_dev); struct digital_data_exch *data_exch; int rc; data_exch = kzalloc(sizeof(*data_exch), GFP_KERNEL); if (!data_exch) return -ENOMEM; data_exch->cb = cb; data_exch->cb_context = cb_context; if (ddev->curr_protocol == NFC_PROTO_NFC_DEP) { rc = digital_in_send_dep_req(ddev, target, skb, data_exch); goto exit; } if ((ddev->curr_protocol == NFC_PROTO_ISO14443) || (ddev->curr_protocol == NFC_PROTO_ISO14443_B)) { rc = digital_in_iso_dep_push_sod(ddev, skb); if (rc) goto exit; } ddev->skb_add_crc(skb); rc = digital_in_send_cmd(ddev, skb, 500, digital_in_send_complete, data_exch); exit: if (rc) kfree(data_exch); return rc; } static const struct nfc_ops digital_nfc_ops = { .dev_up = digital_dev_up, .dev_down = digital_dev_down, .start_poll = digital_start_poll, .stop_poll = digital_stop_poll, .dep_link_up = digital_dep_link_up, .dep_link_down = digital_dep_link_down, .activate_target = digital_activate_target, .deactivate_target = digital_deactivate_target, .tm_send = digital_tg_send, .im_transceive = digital_in_send, }; struct nfc_digital_dev *nfc_digital_allocate_device(const struct nfc_digital_ops *ops, __u32 supported_protocols, __u32 driver_capabilities, int tx_headroom, int tx_tailroom) { struct nfc_digital_dev *ddev; if (!ops->in_configure_hw || !ops->in_send_cmd || !ops->tg_listen || !ops->tg_configure_hw || !ops->tg_send_cmd || !ops->abort_cmd || !ops->switch_rf || (ops->tg_listen_md && !ops->tg_get_rf_tech)) return NULL; ddev = kzalloc(sizeof(*ddev), GFP_KERNEL); if (!ddev) return NULL; ddev->driver_capabilities = driver_capabilities; ddev->ops = ops; mutex_init(&ddev->cmd_lock); INIT_LIST_HEAD(&ddev->cmd_queue); INIT_WORK(&ddev->cmd_work, digital_wq_cmd); INIT_WORK(&ddev->cmd_complete_work, digital_wq_cmd_complete); mutex_init(&ddev->poll_lock); INIT_DELAYED_WORK(&ddev->poll_work, digital_wq_poll); if (supported_protocols & NFC_PROTO_JEWEL_MASK) ddev->protocols |= NFC_PROTO_JEWEL_MASK; if (supported_protocols & NFC_PROTO_MIFARE_MASK) ddev->protocols |= NFC_PROTO_MIFARE_MASK; if (supported_protocols & NFC_PROTO_FELICA_MASK) ddev->protocols |= 
NFC_PROTO_FELICA_MASK; if (supported_protocols & NFC_PROTO_NFC_DEP_MASK) ddev->protocols |= NFC_PROTO_NFC_DEP_MASK; if (supported_protocols & NFC_PROTO_ISO15693_MASK) ddev->protocols |= NFC_PROTO_ISO15693_MASK; if (supported_protocols & NFC_PROTO_ISO14443_MASK) ddev->protocols |= NFC_PROTO_ISO14443_MASK; if (supported_protocols & NFC_PROTO_ISO14443_B_MASK) ddev->protocols |= NFC_PROTO_ISO14443_B_MASK; ddev->tx_headroom = tx_headroom + DIGITAL_MAX_HEADER_LEN; ddev->tx_tailroom = tx_tailroom + DIGITAL_CRC_LEN; ddev->nfc_dev = nfc_allocate_device(&digital_nfc_ops, ddev->protocols, ddev->tx_headroom, ddev->tx_tailroom); if (!ddev->nfc_dev) { pr_err("nfc_allocate_device failed\n"); goto free_dev; } nfc_set_drvdata(ddev->nfc_dev, ddev); return ddev; free_dev: kfree(ddev); return NULL; } EXPORT_SYMBOL(nfc_digital_allocate_device); void nfc_digital_free_device(struct nfc_digital_dev *ddev) { nfc_free_device(ddev->nfc_dev); kfree(ddev); } EXPORT_SYMBOL(nfc_digital_free_device); int nfc_digital_register_device(struct nfc_digital_dev *ddev) { return nfc_register_device(ddev->nfc_dev); } EXPORT_SYMBOL(nfc_digital_register_device); void nfc_digital_unregister_device(struct nfc_digital_dev *ddev) { struct digital_cmd *cmd, *n; nfc_unregister_device(ddev->nfc_dev); mutex_lock(&ddev->poll_lock); ddev->poll_tech_count = 0; mutex_unlock(&ddev->poll_lock); cancel_delayed_work_sync(&ddev->poll_work); cancel_work_sync(&ddev->cmd_work); cancel_work_sync(&ddev->cmd_complete_work); list_for_each_entry_safe(cmd, n, &ddev->cmd_queue, queue) { list_del(&cmd->queue); /* Call the command callback if any and pass it an ENODEV error. * This gives the command issuer a chance to free any * allocated buffer. */ if (cmd->cmd_cb) cmd->cmd_cb(ddev, cmd->cb_context, ERR_PTR(-ENODEV)); kfree(cmd->mdaa_params); kfree(cmd); } } EXPORT_SYMBOL(nfc_digital_unregister_device); MODULE_DESCRIPTION("NFC Digital protocol stack"); MODULE_LICENSE("GPL");
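
/*
 * Illustrative sketch: how a transceiver driver would plug into this stack
 * using the symbols exported above. The "foo" callbacks and FOO_* constants
 * are hypothetical placeholders, not part of any real driver; the ops listed
 * are the ones that nfc_digital_allocate_device() requires to be non-NULL.
 *
 *	static const struct nfc_digital_ops foo_digital_ops = {
 *		.in_configure_hw	= foo_in_configure_hw,
 *		.in_send_cmd		= foo_in_send_cmd,
 *		.tg_configure_hw	= foo_tg_configure_hw,
 *		.tg_send_cmd		= foo_tg_send_cmd,
 *		.tg_listen		= foo_tg_listen,
 *		.switch_rf		= foo_switch_rf,
 *		.abort_cmd		= foo_abort_cmd,
 *	};
 *
 *	ddev = nfc_digital_allocate_device(&foo_digital_ops, protocols, 0,
 *					   FOO_TX_HEADROOM, FOO_TX_TAILROOM);
 *	if (!ddev)
 *		return -ENOMEM;
 *	rc = nfc_digital_register_device(ddev);
 *	if (rc)
 *		nfc_digital_free_device(ddev);
 */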
// SPDX-License-Identifier: GPL-2.0-or-later
/*
 * Support for AES-NI and VAES instructions. This file contains glue code.
 * The real AES implementations are in aesni-intel_asm.S and other .S files.
 *
 * Copyright (C) 2008, Intel Corp.
 *    Author: Huang Ying <ying.huang@intel.com>
 *
 * Added RFC4106 AES-GCM support for 128-bit keys under the AEAD
 * interface for 64-bit kernels.
 *    Authors: Adrian Hoban <adrian.hoban@intel.com>
 *             Gabriele Paoloni <gabriele.paoloni@intel.com>
 *             Tadeusz Struk (tadeusz.struk@intel.com)
 *             Aidan O'Mahony (aidan.o.mahony@intel.com)
 *    Copyright (c) 2010, Intel Corporation.
 *
 * Copyright 2024 Google LLC
 */

#include <linux/hardirq.h>
#include <linux/types.h>
#include <linux/module.h>
#include <linux/err.h>
#include <crypto/algapi.h>
#include <crypto/aes.h>
#include <crypto/b128ops.h>
#include <crypto/gcm.h>
#include <crypto/xts.h>
#include <asm/cpu_device_id.h>
#include <asm/simd.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <linux/jump_label.h>
#include <linux/workqueue.h>
#include <linux/spinlock.h>
#include <linux/static_call.h>

#define AESNI_ALIGN	16
#define AESNI_ALIGN_ATTR __attribute__ ((__aligned__(AESNI_ALIGN)))
#define AES_BLOCK_MASK	(~(AES_BLOCK_SIZE - 1))
#define AESNI_ALIGN_EXTRA ((AESNI_ALIGN - 1) & ~(CRYPTO_MINALIGN - 1))
#define CRYPTO_AES_CTX_SIZE (sizeof(struct crypto_aes_ctx) + AESNI_ALIGN_EXTRA)
#define XTS_AES_CTX_SIZE (sizeof(struct aesni_xts_ctx) + AESNI_ALIGN_EXTRA)

struct aesni_xts_ctx {
	struct crypto_aes_ctx tweak_ctx AESNI_ALIGN_ATTR;
	struct crypto_aes_ctx crypt_ctx AESNI_ALIGN_ATTR;
};

static inline void *aes_align_addr(void *addr)
{
	if (crypto_tfm_ctx_alignment() >= AESNI_ALIGN)
		return addr;
	return PTR_ALIGN(addr, AESNI_ALIGN);
}

asmlinkage void aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key, unsigned int key_len);
asmlinkage void aesni_enc(const void *ctx, u8 *out, const u8 *in);
asmlinkage void aesni_dec(const void *ctx, u8 *out, const u8 *in);
asmlinkage void aesni_ecb_enc(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len);
asmlinkage void aesni_ecb_dec(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len);
asmlinkage void aesni_cbc_enc(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_cbc_dec(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_cts_cbc_enc(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_cts_cbc_dec(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_xts_enc(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len, u8 *iv);
asmlinkage void aesni_xts_dec(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in, unsigned int len, u8 *iv);

#ifdef CONFIG_X86_64
asmlinkage void aesni_ctr_enc(struct crypto_aes_ctx *ctx, u8
*out, const u8 *in, unsigned int len, u8 *iv); #endif static inline struct crypto_aes_ctx *aes_ctx(void *raw_ctx) { return aes_align_addr(raw_ctx); } static inline struct aesni_xts_ctx *aes_xts_ctx(struct crypto_skcipher *tfm) { return aes_align_addr(crypto_skcipher_ctx(tfm)); } static int aes_set_key_common(struct crypto_aes_ctx *ctx, const u8 *in_key, unsigned int key_len) { int err; if (!crypto_simd_usable()) return aes_expandkey(ctx, in_key, key_len); err = aes_check_keylen(key_len); if (err) return err; kernel_fpu_begin(); aesni_set_key(ctx, in_key, key_len); kernel_fpu_end(); return 0; } static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key, unsigned int key_len) { return aes_set_key_common(aes_ctx(crypto_tfm_ctx(tfm)), in_key, key_len); } static void aesni_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src) { struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm)); if (!crypto_simd_usable()) { aes_encrypt(ctx, dst, src); } else { kernel_fpu_begin(); aesni_enc(ctx, dst, src); kernel_fpu_end(); } } static void aesni_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src) { struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm)); if (!crypto_simd_usable()) { aes_decrypt(ctx, dst, src); } else { kernel_fpu_begin(); aesni_dec(ctx, dst, src); kernel_fpu_end(); } } static int aesni_skcipher_setkey(struct crypto_skcipher *tfm, const u8 *key, unsigned int len) { return aes_set_key_common(aes_ctx(crypto_skcipher_ctx(tfm)), key, len); } static int ecb_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); struct skcipher_walk walk; unsigned int nbytes; int err; err = skcipher_walk_virt(&walk, req, false); while ((nbytes = walk.nbytes)) { kernel_fpu_begin(); aesni_ecb_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, nbytes & AES_BLOCK_MASK); kernel_fpu_end(); nbytes &= AES_BLOCK_SIZE - 1; err = skcipher_walk_done(&walk, nbytes); } return err; } static int ecb_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); struct skcipher_walk walk; unsigned int nbytes; int err; err = skcipher_walk_virt(&walk, req, false); while ((nbytes = walk.nbytes)) { kernel_fpu_begin(); aesni_ecb_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr, nbytes & AES_BLOCK_MASK); kernel_fpu_end(); nbytes &= AES_BLOCK_SIZE - 1; err = skcipher_walk_done(&walk, nbytes); } return err; } static int cbc_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); struct skcipher_walk walk; unsigned int nbytes; int err; err = skcipher_walk_virt(&walk, req, false); while ((nbytes = walk.nbytes)) { kernel_fpu_begin(); aesni_cbc_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, nbytes & AES_BLOCK_MASK, walk.iv); kernel_fpu_end(); nbytes &= AES_BLOCK_SIZE - 1; err = skcipher_walk_done(&walk, nbytes); } return err; } static int cbc_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); struct skcipher_walk walk; unsigned int nbytes; int err; err = skcipher_walk_virt(&walk, req, false); while ((nbytes = walk.nbytes)) { kernel_fpu_begin(); aesni_cbc_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr, nbytes & AES_BLOCK_MASK, walk.iv); kernel_fpu_end(); nbytes &= AES_BLOCK_SIZE - 1; err = 
skcipher_walk_done(&walk, nbytes); } return err; } static int cts_cbc_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2; struct scatterlist *src = req->src, *dst = req->dst; struct scatterlist sg_src[2], sg_dst[2]; struct skcipher_request subreq; struct skcipher_walk walk; int err; skcipher_request_set_tfm(&subreq, tfm); skcipher_request_set_callback(&subreq, skcipher_request_flags(req), NULL, NULL); if (req->cryptlen <= AES_BLOCK_SIZE) { if (req->cryptlen < AES_BLOCK_SIZE) return -EINVAL; cbc_blocks = 1; } if (cbc_blocks > 0) { skcipher_request_set_crypt(&subreq, req->src, req->dst, cbc_blocks * AES_BLOCK_SIZE, req->iv); err = cbc_encrypt(&subreq); if (err) return err; if (req->cryptlen == AES_BLOCK_SIZE) return 0; dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen); if (req->dst != req->src) dst = scatterwalk_ffwd(sg_dst, req->dst, subreq.cryptlen); } /* handle ciphertext stealing */ skcipher_request_set_crypt(&subreq, src, dst, req->cryptlen - cbc_blocks * AES_BLOCK_SIZE, req->iv); err = skcipher_walk_virt(&walk, &subreq, false); if (err) return err; kernel_fpu_begin(); aesni_cts_cbc_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, walk.nbytes, walk.iv); kernel_fpu_end(); return skcipher_walk_done(&walk, 0); } static int cts_cbc_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); int cbc_blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2; struct scatterlist *src = req->src, *dst = req->dst; struct scatterlist sg_src[2], sg_dst[2]; struct skcipher_request subreq; struct skcipher_walk walk; int err; skcipher_request_set_tfm(&subreq, tfm); skcipher_request_set_callback(&subreq, skcipher_request_flags(req), NULL, NULL); if (req->cryptlen <= AES_BLOCK_SIZE) { if (req->cryptlen < AES_BLOCK_SIZE) return -EINVAL; cbc_blocks = 1; } if (cbc_blocks > 0) { skcipher_request_set_crypt(&subreq, req->src, req->dst, cbc_blocks * AES_BLOCK_SIZE, req->iv); err = cbc_decrypt(&subreq); if (err) return err; if (req->cryptlen == AES_BLOCK_SIZE) return 0; dst = src = scatterwalk_ffwd(sg_src, req->src, subreq.cryptlen); if (req->dst != req->src) dst = scatterwalk_ffwd(sg_dst, req->dst, subreq.cryptlen); } /* handle ciphertext stealing */ skcipher_request_set_crypt(&subreq, src, dst, req->cryptlen - cbc_blocks * AES_BLOCK_SIZE, req->iv); err = skcipher_walk_virt(&walk, &subreq, false); if (err) return err; kernel_fpu_begin(); aesni_cts_cbc_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr, walk.nbytes, walk.iv); kernel_fpu_end(); return skcipher_walk_done(&walk, 0); } #ifdef CONFIG_X86_64 /* This is the non-AVX version. 
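 * It backs the lowest-priority ctr(aes) entry registered below; the AVX and
 * VAES variants are generated further down via DEFINE_AVX_SKCIPHER_ALGS().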
*/ static int ctr_crypt_aesni(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); u8 keystream[AES_BLOCK_SIZE]; struct skcipher_walk walk; unsigned int nbytes; int err; err = skcipher_walk_virt(&walk, req, false); while ((nbytes = walk.nbytes) > 0) { kernel_fpu_begin(); if (nbytes & AES_BLOCK_MASK) aesni_ctr_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, nbytes & AES_BLOCK_MASK, walk.iv); nbytes &= ~AES_BLOCK_MASK; if (walk.nbytes == walk.total && nbytes > 0) { aesni_enc(ctx, keystream, walk.iv); crypto_xor_cpy(walk.dst.virt.addr + walk.nbytes - nbytes, walk.src.virt.addr + walk.nbytes - nbytes, keystream, nbytes); crypto_inc(walk.iv, AES_BLOCK_SIZE); nbytes = 0; } kernel_fpu_end(); err = skcipher_walk_done(&walk, nbytes); } return err; } #endif static int xts_setkey_aesni(struct crypto_skcipher *tfm, const u8 *key, unsigned int keylen) { struct aesni_xts_ctx *ctx = aes_xts_ctx(tfm); int err; err = xts_verify_key(tfm, key, keylen); if (err) return err; keylen /= 2; /* first half of xts-key is for crypt */ err = aes_set_key_common(&ctx->crypt_ctx, key, keylen); if (err) return err; /* second half of xts-key is for tweak */ return aes_set_key_common(&ctx->tweak_ctx, key + keylen, keylen); } typedef void (*xts_encrypt_iv_func)(const struct crypto_aes_ctx *tweak_key, u8 iv[AES_BLOCK_SIZE]); typedef void (*xts_crypt_func)(const struct crypto_aes_ctx *key, const u8 *src, u8 *dst, int len, u8 tweak[AES_BLOCK_SIZE]); /* This handles cases where the source and/or destination span pages. */ static noinline int xts_crypt_slowpath(struct skcipher_request *req, xts_crypt_func crypt_func) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); const struct aesni_xts_ctx *ctx = aes_xts_ctx(tfm); int tail = req->cryptlen % AES_BLOCK_SIZE; struct scatterlist sg_src[2], sg_dst[2]; struct skcipher_request subreq; struct skcipher_walk walk; struct scatterlist *src, *dst; int err; /* * If the message length isn't divisible by the AES block size, then * separate off the last full block and the partial block. This ensures * that they are processed in the same call to the assembly function, * which is required for ciphertext stealing. */ if (tail) { skcipher_request_set_tfm(&subreq, tfm); skcipher_request_set_callback(&subreq, skcipher_request_flags(req), NULL, NULL); skcipher_request_set_crypt(&subreq, req->src, req->dst, req->cryptlen - tail - AES_BLOCK_SIZE, req->iv); req = &subreq; } err = skcipher_walk_virt(&walk, req, false); while (walk.nbytes) { kernel_fpu_begin(); (*crypt_func)(&ctx->crypt_ctx, walk.src.virt.addr, walk.dst.virt.addr, walk.nbytes & ~(AES_BLOCK_SIZE - 1), req->iv); kernel_fpu_end(); err = skcipher_walk_done(&walk, walk.nbytes & (AES_BLOCK_SIZE - 1)); } if (err || !tail) return err; /* Do ciphertext stealing with the last full block and partial block. 
*/ dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen); if (req->dst != req->src) dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen); skcipher_request_set_crypt(req, src, dst, AES_BLOCK_SIZE + tail, req->iv); err = skcipher_walk_virt(&walk, req, false); if (err) return err; kernel_fpu_begin(); (*crypt_func)(&ctx->crypt_ctx, walk.src.virt.addr, walk.dst.virt.addr, walk.nbytes, req->iv); kernel_fpu_end(); return skcipher_walk_done(&walk, 0); } /* __always_inline to avoid indirect call in fastpath */ static __always_inline int xts_crypt(struct skcipher_request *req, xts_encrypt_iv_func encrypt_iv, xts_crypt_func crypt_func) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); const struct aesni_xts_ctx *ctx = aes_xts_ctx(tfm); if (unlikely(req->cryptlen < AES_BLOCK_SIZE)) return -EINVAL; kernel_fpu_begin(); (*encrypt_iv)(&ctx->tweak_ctx, req->iv); /* * In practice, virtually all XTS plaintexts and ciphertexts are either * 512 or 4096 bytes and do not use multiple scatterlist elements. To * optimize the performance of these cases, the below fast-path handles * single-scatterlist-element messages as efficiently as possible. The * code is 64-bit specific, as it assumes no page mapping is needed. */ if (IS_ENABLED(CONFIG_X86_64) && likely(req->src->length >= req->cryptlen && req->dst->length >= req->cryptlen)) { (*crypt_func)(&ctx->crypt_ctx, sg_virt(req->src), sg_virt(req->dst), req->cryptlen, req->iv); kernel_fpu_end(); return 0; } kernel_fpu_end(); return xts_crypt_slowpath(req, crypt_func); } static void aesni_xts_encrypt_iv(const struct crypto_aes_ctx *tweak_key, u8 iv[AES_BLOCK_SIZE]) { aesni_enc(tweak_key, iv, iv); } static void aesni_xts_encrypt(const struct crypto_aes_ctx *key, const u8 *src, u8 *dst, int len, u8 tweak[AES_BLOCK_SIZE]) { aesni_xts_enc(key, dst, src, len, tweak); } static void aesni_xts_decrypt(const struct crypto_aes_ctx *key, const u8 *src, u8 *dst, int len, u8 tweak[AES_BLOCK_SIZE]) { aesni_xts_dec(key, dst, src, len, tweak); } static int xts_encrypt_aesni(struct skcipher_request *req) { return xts_crypt(req, aesni_xts_encrypt_iv, aesni_xts_encrypt); } static int xts_decrypt_aesni(struct skcipher_request *req) { return xts_crypt(req, aesni_xts_encrypt_iv, aesni_xts_decrypt); } static struct crypto_alg aesni_cipher_alg = { .cra_name = "aes", .cra_driver_name = "aes-aesni", .cra_priority = 300, .cra_flags = CRYPTO_ALG_TYPE_CIPHER, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = CRYPTO_AES_CTX_SIZE, .cra_module = THIS_MODULE, .cra_u = { .cipher = { .cia_min_keysize = AES_MIN_KEY_SIZE, .cia_max_keysize = AES_MAX_KEY_SIZE, .cia_setkey = aes_set_key, .cia_encrypt = aesni_encrypt, .cia_decrypt = aesni_decrypt } } }; static struct skcipher_alg aesni_skciphers[] = { { .base = { .cra_name = "ecb(aes)", .cra_driver_name = "ecb-aes-aesni", .cra_priority = 400, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = CRYPTO_AES_CTX_SIZE, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, .max_keysize = AES_MAX_KEY_SIZE, .setkey = aesni_skcipher_setkey, .encrypt = ecb_encrypt, .decrypt = ecb_decrypt, }, { .base = { .cra_name = "cbc(aes)", .cra_driver_name = "cbc-aes-aesni", .cra_priority = 400, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = CRYPTO_AES_CTX_SIZE, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, .max_keysize = AES_MAX_KEY_SIZE, .ivsize = AES_BLOCK_SIZE, .setkey = aesni_skcipher_setkey, .encrypt = cbc_encrypt, .decrypt = cbc_decrypt, }, { .base = { .cra_name = "cts(cbc(aes))", .cra_driver_name = "cts-cbc-aes-aesni", .cra_priority 
= 400, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = CRYPTO_AES_CTX_SIZE, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, .max_keysize = AES_MAX_KEY_SIZE, .ivsize = AES_BLOCK_SIZE, .walksize = 2 * AES_BLOCK_SIZE, .setkey = aesni_skcipher_setkey, .encrypt = cts_cbc_encrypt, .decrypt = cts_cbc_decrypt, #ifdef CONFIG_X86_64 }, { .base = { .cra_name = "ctr(aes)", .cra_driver_name = "ctr-aes-aesni", .cra_priority = 400, .cra_blocksize = 1, .cra_ctxsize = CRYPTO_AES_CTX_SIZE, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, .max_keysize = AES_MAX_KEY_SIZE, .ivsize = AES_BLOCK_SIZE, .chunksize = AES_BLOCK_SIZE, .setkey = aesni_skcipher_setkey, .encrypt = ctr_crypt_aesni, .decrypt = ctr_crypt_aesni, #endif }, { .base = { .cra_name = "xts(aes)", .cra_driver_name = "xts-aes-aesni", .cra_priority = 401, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = XTS_AES_CTX_SIZE, .cra_module = THIS_MODULE, }, .min_keysize = 2 * AES_MIN_KEY_SIZE, .max_keysize = 2 * AES_MAX_KEY_SIZE, .ivsize = AES_BLOCK_SIZE, .walksize = 2 * AES_BLOCK_SIZE, .setkey = xts_setkey_aesni, .encrypt = xts_encrypt_aesni, .decrypt = xts_decrypt_aesni, } }; #ifdef CONFIG_X86_64 asmlinkage void aes_xts_encrypt_iv(const struct crypto_aes_ctx *tweak_key, u8 iv[AES_BLOCK_SIZE]); /* __always_inline to avoid indirect call */ static __always_inline int ctr_crypt(struct skcipher_request *req, void (*ctr64_func)(const struct crypto_aes_ctx *key, const u8 *src, u8 *dst, int len, const u64 le_ctr[2])) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); const struct crypto_aes_ctx *key = aes_ctx(crypto_skcipher_ctx(tfm)); unsigned int nbytes, p1_nbytes, nblocks; struct skcipher_walk walk; u64 le_ctr[2]; u64 ctr64; int err; ctr64 = le_ctr[0] = get_unaligned_be64(&req->iv[8]); le_ctr[1] = get_unaligned_be64(&req->iv[0]); err = skcipher_walk_virt(&walk, req, false); while ((nbytes = walk.nbytes) != 0) { if (nbytes < walk.total) { /* Not the end yet, so keep the length block-aligned. */ nbytes = round_down(nbytes, AES_BLOCK_SIZE); nblocks = nbytes / AES_BLOCK_SIZE; } else { /* It's the end, so include any final partial block. */ nblocks = DIV_ROUND_UP(nbytes, AES_BLOCK_SIZE); } ctr64 += nblocks; kernel_fpu_begin(); if (likely(ctr64 >= nblocks)) { /* The low 64 bits of the counter won't overflow. */ (*ctr64_func)(key, walk.src.virt.addr, walk.dst.virt.addr, nbytes, le_ctr); } else { /* * The low 64 bits of the counter will overflow. The * assembly doesn't handle this case, so split the * operation into two at the point where the overflow * will occur. After the first part, add the carry bit. 
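 * For example, if the low 64 bits of the counter were 0xFFFFFFFFFFFFFFFE
 * and nblocks is 4, ctr64 wraps around to 2: only (nblocks - ctr64) = 2
 * blocks (32 bytes) can use the old upper-half value, so p1_nbytes is at
 * most 32; le_ctr[1] is then incremented and the remaining bytes continue
 * from a low counter value of 0.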
*/ p1_nbytes = min_t(unsigned int, nbytes, (nblocks - ctr64) * AES_BLOCK_SIZE); (*ctr64_func)(key, walk.src.virt.addr, walk.dst.virt.addr, p1_nbytes, le_ctr); le_ctr[0] = 0; le_ctr[1]++; (*ctr64_func)(key, walk.src.virt.addr + p1_nbytes, walk.dst.virt.addr + p1_nbytes, nbytes - p1_nbytes, le_ctr); } kernel_fpu_end(); le_ctr[0] = ctr64; err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } put_unaligned_be64(ctr64, &req->iv[8]); put_unaligned_be64(le_ctr[1], &req->iv[0]); return err; } /* __always_inline to avoid indirect call */ static __always_inline int xctr_crypt(struct skcipher_request *req, void (*xctr_func)(const struct crypto_aes_ctx *key, const u8 *src, u8 *dst, int len, const u8 iv[AES_BLOCK_SIZE], u64 ctr)) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); const struct crypto_aes_ctx *key = aes_ctx(crypto_skcipher_ctx(tfm)); struct skcipher_walk walk; unsigned int nbytes; u64 ctr = 1; int err; err = skcipher_walk_virt(&walk, req, false); while ((nbytes = walk.nbytes) != 0) { if (nbytes < walk.total) nbytes = round_down(nbytes, AES_BLOCK_SIZE); kernel_fpu_begin(); (*xctr_func)(key, walk.src.virt.addr, walk.dst.virt.addr, nbytes, req->iv, ctr); kernel_fpu_end(); ctr += DIV_ROUND_UP(nbytes, AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } return err; } #define DEFINE_AVX_SKCIPHER_ALGS(suffix, driver_name_suffix, priority) \ \ asmlinkage void \ aes_xts_encrypt_##suffix(const struct crypto_aes_ctx *key, const u8 *src, \ u8 *dst, int len, u8 tweak[AES_BLOCK_SIZE]); \ asmlinkage void \ aes_xts_decrypt_##suffix(const struct crypto_aes_ctx *key, const u8 *src, \ u8 *dst, int len, u8 tweak[AES_BLOCK_SIZE]); \ \ static int xts_encrypt_##suffix(struct skcipher_request *req) \ { \ return xts_crypt(req, aes_xts_encrypt_iv, aes_xts_encrypt_##suffix); \ } \ \ static int xts_decrypt_##suffix(struct skcipher_request *req) \ { \ return xts_crypt(req, aes_xts_encrypt_iv, aes_xts_decrypt_##suffix); \ } \ \ asmlinkage void \ aes_ctr64_crypt_##suffix(const struct crypto_aes_ctx *key, \ const u8 *src, u8 *dst, int len, const u64 le_ctr[2]);\ \ static int ctr_crypt_##suffix(struct skcipher_request *req) \ { \ return ctr_crypt(req, aes_ctr64_crypt_##suffix); \ } \ \ asmlinkage void \ aes_xctr_crypt_##suffix(const struct crypto_aes_ctx *key, \ const u8 *src, u8 *dst, int len, \ const u8 iv[AES_BLOCK_SIZE], u64 ctr); \ \ static int xctr_crypt_##suffix(struct skcipher_request *req) \ { \ return xctr_crypt(req, aes_xctr_crypt_##suffix); \ } \ \ static struct skcipher_alg skcipher_algs_##suffix[] = {{ \ .base.cra_name = "xts(aes)", \ .base.cra_driver_name = "xts-aes-" driver_name_suffix, \ .base.cra_priority = priority, \ .base.cra_blocksize = AES_BLOCK_SIZE, \ .base.cra_ctxsize = XTS_AES_CTX_SIZE, \ .base.cra_module = THIS_MODULE, \ .min_keysize = 2 * AES_MIN_KEY_SIZE, \ .max_keysize = 2 * AES_MAX_KEY_SIZE, \ .ivsize = AES_BLOCK_SIZE, \ .walksize = 2 * AES_BLOCK_SIZE, \ .setkey = xts_setkey_aesni, \ .encrypt = xts_encrypt_##suffix, \ .decrypt = xts_decrypt_##suffix, \ }, { \ .base.cra_name = "ctr(aes)", \ .base.cra_driver_name = "ctr-aes-" driver_name_suffix, \ .base.cra_priority = priority, \ .base.cra_blocksize = 1, \ .base.cra_ctxsize = CRYPTO_AES_CTX_SIZE, \ .base.cra_module = THIS_MODULE, \ .min_keysize = AES_MIN_KEY_SIZE, \ .max_keysize = AES_MAX_KEY_SIZE, \ .ivsize = AES_BLOCK_SIZE, \ .chunksize = AES_BLOCK_SIZE, \ .setkey = aesni_skcipher_setkey, \ .encrypt = ctr_crypt_##suffix, \ .decrypt = ctr_crypt_##suffix, \ }, { \ .base.cra_name = "xctr(aes)", \ 
.base.cra_driver_name = "xctr-aes-" driver_name_suffix, \ .base.cra_priority = priority, \ .base.cra_blocksize = 1, \ .base.cra_ctxsize = CRYPTO_AES_CTX_SIZE, \ .base.cra_module = THIS_MODULE, \ .min_keysize = AES_MIN_KEY_SIZE, \ .max_keysize = AES_MAX_KEY_SIZE, \ .ivsize = AES_BLOCK_SIZE, \ .chunksize = AES_BLOCK_SIZE, \ .setkey = aesni_skcipher_setkey, \ .encrypt = xctr_crypt_##suffix, \ .decrypt = xctr_crypt_##suffix, \ }} DEFINE_AVX_SKCIPHER_ALGS(aesni_avx, "aesni-avx", 500); #if defined(CONFIG_AS_VAES) && defined(CONFIG_AS_VPCLMULQDQ) DEFINE_AVX_SKCIPHER_ALGS(vaes_avx2, "vaes-avx2", 600); DEFINE_AVX_SKCIPHER_ALGS(vaes_avx512, "vaes-avx512", 800); #endif /* The common part of the x86_64 AES-GCM key struct */ struct aes_gcm_key { /* Expanded AES key and the AES key length in bytes */ struct crypto_aes_ctx aes_key; /* RFC4106 nonce (used only by the rfc4106 algorithms) */ u32 rfc4106_nonce; }; /* Key struct used by the AES-NI implementations of AES-GCM */ struct aes_gcm_key_aesni { /* * Common part of the key. The assembly code requires 16-byte alignment * for the round keys; we get this by them being located at the start of * the struct and the whole struct being 16-byte aligned. */ struct aes_gcm_key base; /* * Powers of the hash key H^8 through H^1. These are 128-bit values. * They all have an extra factor of x^-1 and are byte-reversed. 16-byte * alignment is required by the assembly code. */ u64 h_powers[8][2] __aligned(16); /* * h_powers_xored[i] contains the two 64-bit halves of h_powers[i] XOR'd * together. It's used for Karatsuba multiplication. 16-byte alignment * is required by the assembly code. */ u64 h_powers_xored[8] __aligned(16); /* * H^1 times x^64 (and also the usual extra factor of x^-1). 16-byte * alignment is required by the assembly code. */ u64 h_times_x64[2] __aligned(16); }; #define AES_GCM_KEY_AESNI(key) \ container_of((key), struct aes_gcm_key_aesni, base) #define AES_GCM_KEY_AESNI_SIZE \ (sizeof(struct aes_gcm_key_aesni) + (15 & ~(CRYPTO_MINALIGN - 1))) /* Key struct used by the VAES + AVX10 implementations of AES-GCM */ struct aes_gcm_key_avx10 { /* * Common part of the key. The assembly code prefers 16-byte alignment * for the round keys; we get this by them being located at the start of * the struct and the whole struct being 64-byte aligned. */ struct aes_gcm_key base; /* * Powers of the hash key H^16 through H^1. These are 128-bit values. * They all have an extra factor of x^-1 and are byte-reversed. This * array is aligned to a 64-byte boundary to make it naturally aligned * for 512-bit loads, which can improve performance. (The assembly code * doesn't *need* the alignment; this is just an optimization.) */ u64 h_powers[16][2] __aligned(64); /* Three padding blocks required by the assembly code */ u64 padding[3][2]; }; #define AES_GCM_KEY_AVX10(key) \ container_of((key), struct aes_gcm_key_avx10, base) #define AES_GCM_KEY_AVX10_SIZE \ (sizeof(struct aes_gcm_key_avx10) + (63 & ~(CRYPTO_MINALIGN - 1))) /* * These flags are passed to the AES-GCM helper functions to specify the * specific version of AES-GCM (RFC4106 or not), whether it's encryption or * decryption, and which assembly functions should be called. Assembly * functions are selected using flags instead of function pointers to avoid * indirect calls (which are very expensive on x86) regardless of inlining. 
*/ #define FLAG_RFC4106 BIT(0) #define FLAG_ENC BIT(1) #define FLAG_AVX BIT(2) #if defined(CONFIG_AS_VAES) && defined(CONFIG_AS_VPCLMULQDQ) # define FLAG_AVX10_256 BIT(3) # define FLAG_AVX10_512 BIT(4) #else /* * This should cause all calls to the AVX10 assembly functions to be * optimized out, avoiding the need to ifdef each call individually. */ # define FLAG_AVX10_256 0 # define FLAG_AVX10_512 0 #endif static inline struct aes_gcm_key * aes_gcm_key_get(struct crypto_aead *tfm, int flags) { if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512)) return PTR_ALIGN(crypto_aead_ctx(tfm), 64); else return PTR_ALIGN(crypto_aead_ctx(tfm), 16); } asmlinkage void aes_gcm_precompute_aesni(struct aes_gcm_key_aesni *key); asmlinkage void aes_gcm_precompute_aesni_avx(struct aes_gcm_key_aesni *key); asmlinkage void aes_gcm_precompute_vaes_avx10_256(struct aes_gcm_key_avx10 *key); asmlinkage void aes_gcm_precompute_vaes_avx10_512(struct aes_gcm_key_avx10 *key); static void aes_gcm_precompute(struct aes_gcm_key *key, int flags) { /* * To make things a bit easier on the assembly side, the AVX10 * implementations use the same key format. Therefore, a single * function using 256-bit vectors would suffice here. However, it's * straightforward to provide a 512-bit one because of how the assembly * code is structured, and it works nicely because the total size of the * key powers is a multiple of 512 bits. So we take advantage of that. * * A similar situation applies to the AES-NI implementations. */ if (flags & FLAG_AVX10_512) aes_gcm_precompute_vaes_avx10_512(AES_GCM_KEY_AVX10(key)); else if (flags & FLAG_AVX10_256) aes_gcm_precompute_vaes_avx10_256(AES_GCM_KEY_AVX10(key)); else if (flags & FLAG_AVX) aes_gcm_precompute_aesni_avx(AES_GCM_KEY_AESNI(key)); else aes_gcm_precompute_aesni(AES_GCM_KEY_AESNI(key)); } asmlinkage void aes_gcm_aad_update_aesni(const struct aes_gcm_key_aesni *key, u8 ghash_acc[16], const u8 *aad, int aadlen); asmlinkage void aes_gcm_aad_update_aesni_avx(const struct aes_gcm_key_aesni *key, u8 ghash_acc[16], const u8 *aad, int aadlen); asmlinkage void aes_gcm_aad_update_vaes_avx10(const struct aes_gcm_key_avx10 *key, u8 ghash_acc[16], const u8 *aad, int aadlen); static void aes_gcm_aad_update(const struct aes_gcm_key *key, u8 ghash_acc[16], const u8 *aad, int aadlen, int flags) { if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512)) aes_gcm_aad_update_vaes_avx10(AES_GCM_KEY_AVX10(key), ghash_acc, aad, aadlen); else if (flags & FLAG_AVX) aes_gcm_aad_update_aesni_avx(AES_GCM_KEY_AESNI(key), ghash_acc, aad, aadlen); else aes_gcm_aad_update_aesni(AES_GCM_KEY_AESNI(key), ghash_acc, aad, aadlen); } asmlinkage void aes_gcm_enc_update_aesni(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen); asmlinkage void aes_gcm_enc_update_aesni_avx(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen); asmlinkage void aes_gcm_enc_update_vaes_avx10_256(const struct aes_gcm_key_avx10 *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen); asmlinkage void aes_gcm_enc_update_vaes_avx10_512(const struct aes_gcm_key_avx10 *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen); asmlinkage void aes_gcm_dec_update_aesni(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen); asmlinkage void aes_gcm_dec_update_aesni_avx(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 
*src, u8 *dst, int datalen); asmlinkage void aes_gcm_dec_update_vaes_avx10_256(const struct aes_gcm_key_avx10 *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen); asmlinkage void aes_gcm_dec_update_vaes_avx10_512(const struct aes_gcm_key_avx10 *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen); /* __always_inline to optimize out the branches based on @flags */ static __always_inline void aes_gcm_update(const struct aes_gcm_key *key, const u32 le_ctr[4], u8 ghash_acc[16], const u8 *src, u8 *dst, int datalen, int flags) { if (flags & FLAG_ENC) { if (flags & FLAG_AVX10_512) aes_gcm_enc_update_vaes_avx10_512(AES_GCM_KEY_AVX10(key), le_ctr, ghash_acc, src, dst, datalen); else if (flags & FLAG_AVX10_256) aes_gcm_enc_update_vaes_avx10_256(AES_GCM_KEY_AVX10(key), le_ctr, ghash_acc, src, dst, datalen); else if (flags & FLAG_AVX) aes_gcm_enc_update_aesni_avx(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, src, dst, datalen); else aes_gcm_enc_update_aesni(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, src, dst, datalen); } else { if (flags & FLAG_AVX10_512) aes_gcm_dec_update_vaes_avx10_512(AES_GCM_KEY_AVX10(key), le_ctr, ghash_acc, src, dst, datalen); else if (flags & FLAG_AVX10_256) aes_gcm_dec_update_vaes_avx10_256(AES_GCM_KEY_AVX10(key), le_ctr, ghash_acc, src, dst, datalen); else if (flags & FLAG_AVX) aes_gcm_dec_update_aesni_avx(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, src, dst, datalen); else aes_gcm_dec_update_aesni(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, src, dst, datalen); } } asmlinkage void aes_gcm_enc_final_aesni(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen); asmlinkage void aes_gcm_enc_final_aesni_avx(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen); asmlinkage void aes_gcm_enc_final_vaes_avx10(const struct aes_gcm_key_avx10 *key, const u32 le_ctr[4], u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen); /* __always_inline to optimize out the branches based on @flags */ static __always_inline void aes_gcm_enc_final(const struct aes_gcm_key *key, const u32 le_ctr[4], u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen, int flags) { if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512)) aes_gcm_enc_final_vaes_avx10(AES_GCM_KEY_AVX10(key), le_ctr, ghash_acc, total_aadlen, total_datalen); else if (flags & FLAG_AVX) aes_gcm_enc_final_aesni_avx(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, total_aadlen, total_datalen); else aes_gcm_enc_final_aesni(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, total_aadlen, total_datalen); } asmlinkage bool __must_check aes_gcm_dec_final_aesni(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], const u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen, const u8 tag[16], int taglen); asmlinkage bool __must_check aes_gcm_dec_final_aesni_avx(const struct aes_gcm_key_aesni *key, const u32 le_ctr[4], const u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen, const u8 tag[16], int taglen); asmlinkage bool __must_check aes_gcm_dec_final_vaes_avx10(const struct aes_gcm_key_avx10 *key, const u32 le_ctr[4], const u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen, const u8 tag[16], int taglen); /* __always_inline to optimize out the branches based on @flags */ static __always_inline bool __must_check aes_gcm_dec_final(const struct aes_gcm_key *key, const u32 le_ctr[4], u8 ghash_acc[16], u64 total_aadlen, u64 total_datalen, u8 tag[16], int taglen, int flags) { if (flags & 
(FLAG_AVX10_256 | FLAG_AVX10_512)) return aes_gcm_dec_final_vaes_avx10(AES_GCM_KEY_AVX10(key), le_ctr, ghash_acc, total_aadlen, total_datalen, tag, taglen); else if (flags & FLAG_AVX) return aes_gcm_dec_final_aesni_avx(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, total_aadlen, total_datalen, tag, taglen); else return aes_gcm_dec_final_aesni(AES_GCM_KEY_AESNI(key), le_ctr, ghash_acc, total_aadlen, total_datalen, tag, taglen); } /* * This is the Integrity Check Value (aka the authentication tag) length and can * be 8, 12 or 16 bytes long. */ static int common_rfc4106_set_authsize(struct crypto_aead *aead, unsigned int authsize) { switch (authsize) { case 8: case 12: case 16: break; default: return -EINVAL; } return 0; } static int generic_gcmaes_set_authsize(struct crypto_aead *tfm, unsigned int authsize) { switch (authsize) { case 4: case 8: case 12: case 13: case 14: case 15: case 16: break; default: return -EINVAL; } return 0; } /* * This is the setkey function for the x86_64 implementations of AES-GCM. It * saves the RFC4106 nonce if applicable, expands the AES key, and precomputes * powers of the hash key. * * To comply with the crypto_aead API, this has to be usable in no-SIMD context. * For that reason, this function includes a portable C implementation of the * needed logic. However, the portable C implementation is very slow, taking * about the same time as encrypting 37 KB of data. To be ready for users that * may set a key even somewhat frequently, we therefore also include a SIMD * assembly implementation, expanding the AES key using AES-NI and precomputing * the hash key powers using PCLMULQDQ or VPCLMULQDQ. */ static int gcm_setkey(struct crypto_aead *tfm, const u8 *raw_key, unsigned int keylen, int flags) { struct aes_gcm_key *key = aes_gcm_key_get(tfm, flags); int err; if (flags & FLAG_RFC4106) { if (keylen < 4) return -EINVAL; keylen -= 4; key->rfc4106_nonce = get_unaligned_be32(raw_key + keylen); } /* The assembly code assumes the following offsets. 
*/ BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, base.aes_key.key_enc) != 0); BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, base.aes_key.key_length) != 480); BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_powers) != 496); BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_powers_xored) != 624); BUILD_BUG_ON(offsetof(struct aes_gcm_key_aesni, h_times_x64) != 688); BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, base.aes_key.key_enc) != 0); BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, base.aes_key.key_length) != 480); BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, h_powers) != 512); BUILD_BUG_ON(offsetof(struct aes_gcm_key_avx10, padding) != 768); if (likely(crypto_simd_usable())) { err = aes_check_keylen(keylen); if (err) return err; kernel_fpu_begin(); aesni_set_key(&key->aes_key, raw_key, keylen); aes_gcm_precompute(key, flags); kernel_fpu_end(); } else { static const u8 x_to_the_minus1[16] __aligned(__alignof__(be128)) = { [0] = 0xc2, [15] = 1 }; static const u8 x_to_the_63[16] __aligned(__alignof__(be128)) = { [7] = 1, }; be128 h1 = {}; be128 h; int i; err = aes_expandkey(&key->aes_key, raw_key, keylen); if (err) return err; /* Encrypt the all-zeroes block to get the hash key H^1 */ aes_encrypt(&key->aes_key, (u8 *)&h1, (u8 *)&h1); /* Compute H^1 * x^-1 */ h = h1; gf128mul_lle(&h, (const be128 *)x_to_the_minus1); /* Compute the needed key powers */ if (flags & (FLAG_AVX10_256 | FLAG_AVX10_512)) { struct aes_gcm_key_avx10 *k = AES_GCM_KEY_AVX10(key); for (i = ARRAY_SIZE(k->h_powers) - 1; i >= 0; i--) { k->h_powers[i][0] = be64_to_cpu(h.b); k->h_powers[i][1] = be64_to_cpu(h.a); gf128mul_lle(&h, &h1); } memset(k->padding, 0, sizeof(k->padding)); } else { struct aes_gcm_key_aesni *k = AES_GCM_KEY_AESNI(key); for (i = ARRAY_SIZE(k->h_powers) - 1; i >= 0; i--) { k->h_powers[i][0] = be64_to_cpu(h.b); k->h_powers[i][1] = be64_to_cpu(h.a); k->h_powers_xored[i] = k->h_powers[i][0] ^ k->h_powers[i][1]; gf128mul_lle(&h, &h1); } gf128mul_lle(&h1, (const be128 *)x_to_the_63); k->h_times_x64[0] = be64_to_cpu(h1.b); k->h_times_x64[1] = be64_to_cpu(h1.a); } } return 0; } /* * Initialize @ghash_acc, then pass all @assoclen bytes of associated data * (a.k.a. additional authenticated data) from @sg_src through the GHASH update * assembly function. kernel_fpu_begin() must have already been called. */ static void gcm_process_assoc(const struct aes_gcm_key *key, u8 ghash_acc[16], struct scatterlist *sg_src, unsigned int assoclen, int flags) { struct scatter_walk walk; /* * The assembly function requires that the length of any non-last * segment of associated data be a multiple of 16 bytes, so this * function does the buffering needed to achieve that. */ unsigned int pos = 0; u8 buf[16]; memset(ghash_acc, 0, 16); scatterwalk_start(&walk, sg_src); while (assoclen) { unsigned int orig_len_this_step = scatterwalk_next( &walk, assoclen); unsigned int len_this_step = orig_len_this_step; unsigned int len; const u8 *src = walk.addr; if (unlikely(pos)) { len = min(len_this_step, 16 - pos); memcpy(&buf[pos], src, len); pos += len; src += len; len_this_step -= len; if (pos < 16) goto next; aes_gcm_aad_update(key, ghash_acc, buf, 16, flags); pos = 0; } len = len_this_step; if (unlikely(assoclen)) /* Not the last segment yet? 
*/ len = round_down(len, 16); aes_gcm_aad_update(key, ghash_acc, src, len, flags); src += len; len_this_step -= len; if (unlikely(len_this_step)) { memcpy(buf, src, len_this_step); pos = len_this_step; } next: scatterwalk_done_src(&walk, orig_len_this_step); if (need_resched()) { kernel_fpu_end(); kernel_fpu_begin(); } assoclen -= orig_len_this_step; } if (unlikely(pos)) aes_gcm_aad_update(key, ghash_acc, buf, pos, flags); } /* __always_inline to optimize out the branches based on @flags */ static __always_inline int gcm_crypt(struct aead_request *req, int flags) { struct crypto_aead *tfm = crypto_aead_reqtfm(req); const struct aes_gcm_key *key = aes_gcm_key_get(tfm, flags); unsigned int assoclen = req->assoclen; struct skcipher_walk walk; unsigned int nbytes; u8 ghash_acc[16]; /* GHASH accumulator */ u32 le_ctr[4]; /* Counter in little-endian format */ int taglen; int err; /* Initialize the counter and determine the associated data length. */ le_ctr[0] = 2; if (flags & FLAG_RFC4106) { if (unlikely(assoclen != 16 && assoclen != 20)) return -EINVAL; assoclen -= 8; le_ctr[1] = get_unaligned_be32(req->iv + 4); le_ctr[2] = get_unaligned_be32(req->iv + 0); le_ctr[3] = key->rfc4106_nonce; /* already byte-swapped */ } else { le_ctr[1] = get_unaligned_be32(req->iv + 8); le_ctr[2] = get_unaligned_be32(req->iv + 4); le_ctr[3] = get_unaligned_be32(req->iv + 0); } /* Begin walking through the plaintext or ciphertext. */ if (flags & FLAG_ENC) err = skcipher_walk_aead_encrypt(&walk, req, false); else err = skcipher_walk_aead_decrypt(&walk, req, false); if (err) return err; /* * Since the AES-GCM assembly code requires that at least three assembly * functions be called to process any message (this is needed to support * incremental updates cleanly), to reduce overhead we try to do all * three calls in the same kernel FPU section if possible. We close the * section and start a new one if there are multiple data segments or if * rescheduling is needed while processing the associated data. */ kernel_fpu_begin(); /* Pass the associated data through GHASH. */ gcm_process_assoc(key, ghash_acc, req->src, assoclen, flags); /* En/decrypt the data and pass the ciphertext through GHASH. */ while (unlikely((nbytes = walk.nbytes) < walk.total)) { /* * Non-last segment. In this case, the assembly function * requires that the length be a multiple of 16 (AES_BLOCK_SIZE) * bytes. The needed buffering of up to 16 bytes is handled by * the skcipher_walk. Here we just need to round down to a * multiple of 16. */ nbytes = round_down(nbytes, AES_BLOCK_SIZE); aes_gcm_update(key, le_ctr, ghash_acc, walk.src.virt.addr, walk.dst.virt.addr, nbytes, flags); le_ctr[0] += nbytes / AES_BLOCK_SIZE; kernel_fpu_end(); err = skcipher_walk_done(&walk, walk.nbytes - nbytes); if (err) return err; kernel_fpu_begin(); } /* Last segment: process all remaining data. */ aes_gcm_update(key, le_ctr, ghash_acc, walk.src.virt.addr, walk.dst.virt.addr, nbytes, flags); /* * The low word of the counter isn't used by the finalize, so there's no * need to increment it here. */ /* Finalize */ taglen = crypto_aead_authsize(tfm); if (flags & FLAG_ENC) { /* Finish computing the auth tag. */ aes_gcm_enc_final(key, le_ctr, ghash_acc, assoclen, req->cryptlen, flags); /* Store the computed auth tag in the dst scatterlist. */ scatterwalk_map_and_copy(ghash_acc, req->dst, req->assoclen + req->cryptlen, taglen, 1); } else { unsigned int datalen = req->cryptlen - taglen; u8 tag[16]; /* Get the transmitted auth tag from the src scatterlist. 
*/ scatterwalk_map_and_copy(tag, req->src, req->assoclen + datalen, taglen, 0); /* * Finish computing the auth tag and compare it to the * transmitted one. The assembly function does the actual tag * comparison. Here, just check the boolean result. */ if (!aes_gcm_dec_final(key, le_ctr, ghash_acc, assoclen, datalen, tag, taglen, flags)) err = -EBADMSG; } kernel_fpu_end(); if (nbytes) skcipher_walk_done(&walk, 0); return err; } #define DEFINE_GCM_ALGS(suffix, flags, generic_driver_name, rfc_driver_name, \ ctxsize, priority) \ \ static int gcm_setkey_##suffix(struct crypto_aead *tfm, const u8 *raw_key, \ unsigned int keylen) \ { \ return gcm_setkey(tfm, raw_key, keylen, (flags)); \ } \ \ static int gcm_encrypt_##suffix(struct aead_request *req) \ { \ return gcm_crypt(req, (flags) | FLAG_ENC); \ } \ \ static int gcm_decrypt_##suffix(struct aead_request *req) \ { \ return gcm_crypt(req, (flags)); \ } \ \ static int rfc4106_setkey_##suffix(struct crypto_aead *tfm, const u8 *raw_key, \ unsigned int keylen) \ { \ return gcm_setkey(tfm, raw_key, keylen, (flags) | FLAG_RFC4106); \ } \ \ static int rfc4106_encrypt_##suffix(struct aead_request *req) \ { \ return gcm_crypt(req, (flags) | FLAG_RFC4106 | FLAG_ENC); \ } \ \ static int rfc4106_decrypt_##suffix(struct aead_request *req) \ { \ return gcm_crypt(req, (flags) | FLAG_RFC4106); \ } \ \ static struct aead_alg aes_gcm_algs_##suffix[] = { { \ .setkey = gcm_setkey_##suffix, \ .setauthsize = generic_gcmaes_set_authsize, \ .encrypt = gcm_encrypt_##suffix, \ .decrypt = gcm_decrypt_##suffix, \ .ivsize = GCM_AES_IV_SIZE, \ .chunksize = AES_BLOCK_SIZE, \ .maxauthsize = 16, \ .base = { \ .cra_name = "gcm(aes)", \ .cra_driver_name = generic_driver_name, \ .cra_priority = (priority), \ .cra_blocksize = 1, \ .cra_ctxsize = (ctxsize), \ .cra_module = THIS_MODULE, \ }, \ }, { \ .setkey = rfc4106_setkey_##suffix, \ .setauthsize = common_rfc4106_set_authsize, \ .encrypt = rfc4106_encrypt_##suffix, \ .decrypt = rfc4106_decrypt_##suffix, \ .ivsize = GCM_RFC4106_IV_SIZE, \ .chunksize = AES_BLOCK_SIZE, \ .maxauthsize = 16, \ .base = { \ .cra_name = "rfc4106(gcm(aes))", \ .cra_driver_name = rfc_driver_name, \ .cra_priority = (priority), \ .cra_blocksize = 1, \ .cra_ctxsize = (ctxsize), \ .cra_module = THIS_MODULE, \ }, \ } } /* aes_gcm_algs_aesni */ DEFINE_GCM_ALGS(aesni, /* no flags */ 0, "generic-gcm-aesni", "rfc4106-gcm-aesni", AES_GCM_KEY_AESNI_SIZE, 400); /* aes_gcm_algs_aesni_avx */ DEFINE_GCM_ALGS(aesni_avx, FLAG_AVX, "generic-gcm-aesni-avx", "rfc4106-gcm-aesni-avx", AES_GCM_KEY_AESNI_SIZE, 500); #if defined(CONFIG_AS_VAES) && defined(CONFIG_AS_VPCLMULQDQ) /* aes_gcm_algs_vaes_avx10_256 */ DEFINE_GCM_ALGS(vaes_avx10_256, FLAG_AVX10_256, "generic-gcm-vaes-avx10_256", "rfc4106-gcm-vaes-avx10_256", AES_GCM_KEY_AVX10_SIZE, 700); /* aes_gcm_algs_vaes_avx10_512 */ DEFINE_GCM_ALGS(vaes_avx10_512, FLAG_AVX10_512, "generic-gcm-vaes-avx10_512", "rfc4106-gcm-vaes-avx10_512", AES_GCM_KEY_AVX10_SIZE, 800); #endif /* CONFIG_AS_VAES && CONFIG_AS_VPCLMULQDQ */ static int __init register_avx_algs(void) { int err; if (!boot_cpu_has(X86_FEATURE_AVX)) return 0; err = crypto_register_skciphers(skcipher_algs_aesni_avx, ARRAY_SIZE(skcipher_algs_aesni_avx)); if (err) return err; err = crypto_register_aeads(aes_gcm_algs_aesni_avx, ARRAY_SIZE(aes_gcm_algs_aesni_avx)); if (err) return err; /* * Note: not all the algorithms registered below actually require * VPCLMULQDQ. But in practice every CPU with VAES also has VPCLMULQDQ. 
* Similarly, the assembler support was added at about the same time. * For simplicity, just always check for VAES and VPCLMULQDQ together. */ #if defined(CONFIG_AS_VAES) && defined(CONFIG_AS_VPCLMULQDQ) if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_VAES) || !boot_cpu_has(X86_FEATURE_VPCLMULQDQ) || !boot_cpu_has(X86_FEATURE_PCLMULQDQ) || !cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) return 0; err = crypto_register_skciphers(skcipher_algs_vaes_avx2, ARRAY_SIZE(skcipher_algs_vaes_avx2)); if (err) return err; if (!boot_cpu_has(X86_FEATURE_AVX512BW) || !boot_cpu_has(X86_FEATURE_AVX512VL) || !boot_cpu_has(X86_FEATURE_BMI2) || !cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM | XFEATURE_MASK_AVX512, NULL)) return 0; err = crypto_register_aeads(aes_gcm_algs_vaes_avx10_256, ARRAY_SIZE(aes_gcm_algs_vaes_avx10_256)); if (err) return err; if (boot_cpu_has(X86_FEATURE_PREFER_YMM)) { int i; for (i = 0; i < ARRAY_SIZE(skcipher_algs_vaes_avx512); i++) skcipher_algs_vaes_avx512[i].base.cra_priority = 1; for (i = 0; i < ARRAY_SIZE(aes_gcm_algs_vaes_avx10_512); i++) aes_gcm_algs_vaes_avx10_512[i].base.cra_priority = 1; } err = crypto_register_skciphers(skcipher_algs_vaes_avx512, ARRAY_SIZE(skcipher_algs_vaes_avx512)); if (err) return err; err = crypto_register_aeads(aes_gcm_algs_vaes_avx10_512, ARRAY_SIZE(aes_gcm_algs_vaes_avx10_512)); if (err) return err; #endif /* CONFIG_AS_VAES && CONFIG_AS_VPCLMULQDQ */ return 0; } #define unregister_skciphers(A) \ if (refcount_read(&(A)[0].base.cra_refcnt) != 0) \ crypto_unregister_skciphers((A), ARRAY_SIZE(A)) #define unregister_aeads(A) \ if (refcount_read(&(A)[0].base.cra_refcnt) != 0) \ crypto_unregister_aeads((A), ARRAY_SIZE(A)) static void unregister_avx_algs(void) { unregister_skciphers(skcipher_algs_aesni_avx); unregister_aeads(aes_gcm_algs_aesni_avx); #if defined(CONFIG_AS_VAES) && defined(CONFIG_AS_VPCLMULQDQ) unregister_skciphers(skcipher_algs_vaes_avx2); unregister_skciphers(skcipher_algs_vaes_avx512); unregister_aeads(aes_gcm_algs_vaes_avx10_256); unregister_aeads(aes_gcm_algs_vaes_avx10_512); #endif } #else /* CONFIG_X86_64 */ static struct aead_alg aes_gcm_algs_aesni[0]; static int __init register_avx_algs(void) { return 0; } static void unregister_avx_algs(void) { } #endif /* !CONFIG_X86_64 */ static const struct x86_cpu_id aesni_cpu_id[] = { X86_MATCH_FEATURE(X86_FEATURE_AES, NULL), {} }; MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); static int __init aesni_init(void) { int err; if (!x86_match_cpu(aesni_cpu_id)) return -ENODEV; err = crypto_register_alg(&aesni_cipher_alg); if (err) return err; err = crypto_register_skciphers(aesni_skciphers, ARRAY_SIZE(aesni_skciphers)); if (err) goto unregister_cipher; err = crypto_register_aeads(aes_gcm_algs_aesni, ARRAY_SIZE(aes_gcm_algs_aesni)); if (err) goto unregister_skciphers; err = register_avx_algs(); if (err) goto unregister_avx; return 0; unregister_avx: unregister_avx_algs(); crypto_unregister_aeads(aes_gcm_algs_aesni, ARRAY_SIZE(aes_gcm_algs_aesni)); unregister_skciphers: crypto_unregister_skciphers(aesni_skciphers, ARRAY_SIZE(aesni_skciphers)); unregister_cipher: crypto_unregister_alg(&aesni_cipher_alg); return err; } static void __exit aesni_exit(void) { crypto_unregister_aeads(aes_gcm_algs_aesni, ARRAY_SIZE(aes_gcm_algs_aesni)); crypto_unregister_skciphers(aesni_skciphers, ARRAY_SIZE(aesni_skciphers)); crypto_unregister_alg(&aesni_cipher_alg); unregister_avx_algs(); } module_init(aesni_init); module_exit(aesni_exit); MODULE_DESCRIPTION("AES cipher and modes, 
optimized with AES-NI or VAES instructions"); MODULE_LICENSE("GPL"); MODULE_ALIAS_CRYPTO("aes");
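/*
 * Illustrative sketch (not part of the original file): one plausible way a
 * kernel user could exercise the "gcm(aes)" AEAD registered above through the
 * generic crypto_aead API.  The function name, buffer layout and error
 * handling are hypothetical and abbreviated; only the crypto API calls are
 * real.  Assumes @buf holds @assoclen bytes of AAD, then @datalen bytes of
 * plaintext, then 16 spare bytes for the tag appended by encryption.
 */
#include <crypto/aead.h>
#include <linux/scatterlist.h>

static int example_gcm_encrypt(const u8 *key, unsigned int keylen,
			       const u8 iv[12], u8 *buf,
			       unsigned int assoclen, unsigned int datalen)
{
	struct crypto_aead *tfm;
	struct aead_request *req;
	struct scatterlist sg;
	DECLARE_CRYPTO_WAIT(wait);
	int err;

	tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_aead_setkey(tfm, key, keylen);
	if (err)
		goto out_free_tfm;
	err = crypto_aead_setauthsize(tfm, 16);		/* full 16-byte tag */
	if (err)
		goto out_free_tfm;

	req = aead_request_alloc(tfm, GFP_KERNEL);
	if (!req) {
		err = -ENOMEM;
		goto out_free_tfm;
	}

	/* Same scatterlist for src and dst: in-place encryption. */
	sg_init_one(&sg, buf, assoclen + datalen + 16);
	aead_request_set_callback(req, 0, crypto_req_done, &wait);
	aead_request_set_crypt(req, &sg, &sg, datalen, (u8 *)iv);
	aead_request_set_ad(req, assoclen);

	err = crypto_wait_req(crypto_aead_encrypt(req), &wait);

	aead_request_free(req);
out_free_tfm:
	crypto_free_aead(tfm);
	return err;
}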
| 16 16 16 16 16 16 16 16 14 14 13 9 9 13 11 13 13 13 13 1 13 1 1 13 12 13 8 13 13 13 13 12 4 4 3 1 15 16 16 16 16 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 | /* inffast.c -- fast decoding * Copyright (C) 1995-2004 Mark Adler * For conditions of distribution and use, see copyright notice in zlib.h */ #include <linux/zutil.h> #include "inftrees.h" #include "inflate.h" #include "inffast.h" #ifndef ASMINF union uu { unsigned short us; unsigned char b[2]; }; /* Endian independent version */ static inline unsigned short get_unaligned16(const unsigned short *p) { union uu mm; unsigned char *b = (unsigned char *)p; mm.b[0] = b[0]; mm.b[1] = b[1]; return mm.us; } /* Decode literal, length, and distance codes and write out the resulting literal and match bytes until either not enough input or output is available, an end-of-block is encountered, or a data error is encountered. When large enough input and output buffers are supplied to inflate(), for example, a 16K input buffer and a 64K output buffer, more than 95% of the inflate execution time is spent in this routine. Entry assumptions: state->mode == LEN strm->avail_in >= 6 strm->avail_out >= 258 start >= strm->avail_out state->bits < 8 On return, state->mode is one of: LEN -- ran out of enough output space or enough available input TYPE -- reached end of block code, inflate() to interpret next block BAD -- error in block data Notes: - The maximum input bits used by a length/distance pair is 15 bits for the length code, 5 bits for the length extra, 15 bits for the distance code, and 13 bits for the distance extra. This totals 48 bits, or six bytes. Therefore if strm->avail_in >= 6, then there is enough input to avoid checking for available input while decoding. - The maximum bytes that a single length/distance pair can output is 258 bytes, which is the maximum length that can be coded. inflate_fast() requires strm->avail_out >= 258 for each loop to avoid checking for output space. 
- @start: inflate()'s starting value for strm->avail_out */ void inflate_fast(z_streamp strm, unsigned start) { struct inflate_state *state; const unsigned char *in; /* local strm->next_in */ const unsigned char *last; /* while in < last, enough input available */ unsigned char *out; /* local strm->next_out */ unsigned char *beg; /* inflate()'s initial strm->next_out */ unsigned char *end; /* while out < end, enough space available */ #ifdef INFLATE_STRICT unsigned dmax; /* maximum distance from zlib header */ #endif unsigned wsize; /* window size or zero if not using window */ unsigned whave; /* valid bytes in the window */ unsigned write; /* window write index */ unsigned char *window; /* allocated sliding window, if wsize != 0 */ unsigned long hold; /* local strm->hold */ unsigned bits; /* local strm->bits */ code const *lcode; /* local strm->lencode */ code const *dcode; /* local strm->distcode */ unsigned lmask; /* mask for first level of length codes */ unsigned dmask; /* mask for first level of distance codes */ code this; /* retrieved table entry */ unsigned op; /* code bits, operation, extra bits, or */ /* window position, window bytes to copy */ unsigned len; /* match length, unused bytes */ unsigned dist; /* match distance */ unsigned char *from; /* where to copy match from */ /* copy state to local variables */ state = (struct inflate_state *)strm->state; in = strm->next_in; last = in + (strm->avail_in - 5); out = strm->next_out; beg = out - (start - strm->avail_out); end = out + (strm->avail_out - 257); #ifdef INFLATE_STRICT dmax = state->dmax; #endif wsize = state->wsize; whave = state->whave; write = state->write; window = state->window; hold = state->hold; bits = state->bits; lcode = state->lencode; dcode = state->distcode; lmask = (1U << state->lenbits) - 1; dmask = (1U << state->distbits) - 1; /* decode literals and length/distances until end-of-block or not enough input data or output space */ do { if (bits < 15) { hold += (unsigned long)(*in++) << bits; bits += 8; hold += (unsigned long)(*in++) << bits; bits += 8; } this = lcode[hold & lmask]; dolen: op = (unsigned)(this.bits); hold >>= op; bits -= op; op = (unsigned)(this.op); if (op == 0) { /* literal */ *out++ = (unsigned char)(this.val); } else if (op & 16) { /* length base */ len = (unsigned)(this.val); op &= 15; /* number of extra bits */ if (op) { if (bits < op) { hold += (unsigned long)(*in++) << bits; bits += 8; } len += (unsigned)hold & ((1U << op) - 1); hold >>= op; bits -= op; } if (bits < 15) { hold += (unsigned long)(*in++) << bits; bits += 8; hold += (unsigned long)(*in++) << bits; bits += 8; } this = dcode[hold & dmask]; dodist: op = (unsigned)(this.bits); hold >>= op; bits -= op; op = (unsigned)(this.op); if (op & 16) { /* distance base */ dist = (unsigned)(this.val); op &= 15; /* number of extra bits */ if (bits < op) { hold += (unsigned long)(*in++) << bits; bits += 8; if (bits < op) { hold += (unsigned long)(*in++) << bits; bits += 8; } } dist += (unsigned)hold & ((1U << op) - 1); #ifdef INFLATE_STRICT if (dist > dmax) { strm->msg = (char *)"invalid distance too far back"; state->mode = BAD; break; } #endif hold >>= op; bits -= op; op = (unsigned)(out - beg); /* max distance in output */ if (dist > op) { /* see if copy from window */ op = dist - op; /* distance back in window */ if (op > whave) { strm->msg = (char *)"invalid distance too far back"; state->mode = BAD; break; } from = window; if (write == 0) { /* very common case */ from += wsize - op; if (op < len) { /* some from window */ len -= op; 
do { *out++ = *from++; } while (--op); from = out - dist; /* rest from output */ } } else if (write < op) { /* wrap around window */ from += wsize + write - op; op -= write; if (op < len) { /* some from end of window */ len -= op; do { *out++ = *from++; } while (--op); from = window; if (write < len) { /* some from start of window */ op = write; len -= op; do { *out++ = *from++; } while (--op); from = out - dist; /* rest from output */ } } } else { /* contiguous in window */ from += write - op; if (op < len) { /* some from window */ len -= op; do { *out++ = *from++; } while (--op); from = out - dist; /* rest from output */ } } while (len > 2) { *out++ = *from++; *out++ = *from++; *out++ = *from++; len -= 3; } if (len) { *out++ = *from++; if (len > 1) *out++ = *from++; } } else { unsigned short *sout; unsigned long loops; from = out - dist; /* copy direct from output */ /* minimum length is three */ /* Align out addr */ if (!((long)(out - 1) & 1)) { *out++ = *from++; len--; } sout = (unsigned short *)(out); if (dist > 2) { unsigned short *sfrom; sfrom = (unsigned short *)(from); loops = len >> 1; do { if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) *sout++ = *sfrom++; else *sout++ = get_unaligned16(sfrom++); } while (--loops); out = (unsigned char *)sout; from = (unsigned char *)sfrom; } else { /* dist == 1 or dist == 2 */ unsigned short pat16; pat16 = *(sout-1); if (dist == 1) { union uu mm; /* copy one char pattern to both bytes */ mm.us = pat16; mm.b[0] = mm.b[1]; pat16 = mm.us; } loops = len >> 1; do *sout++ = pat16; while (--loops); out = (unsigned char *)sout; } if (len & 1) *out++ = *from++; } } else if ((op & 64) == 0) { /* 2nd level distance code */ this = dcode[this.val + (hold & ((1U << op) - 1))]; goto dodist; } else { strm->msg = (char *)"invalid distance code"; state->mode = BAD; break; } } else if ((op & 64) == 0) { /* 2nd level length code */ this = lcode[this.val + (hold & ((1U << op) - 1))]; goto dolen; } else if (op & 32) { /* end-of-block */ state->mode = TYPE; break; } else { strm->msg = (char *)"invalid literal/length code"; state->mode = BAD; break; } } while (in < last && out < end); /* return unused bytes (on entry, bits < 8, so in won't go too far back) */ len = bits >> 3; in -= len; bits -= len << 3; hold &= (1U << bits) - 1; /* update state and return */ strm->next_in = in; strm->next_out = out; strm->avail_in = (unsigned)(in < last ? 5 + (last - in) : 5 - (in - last)); strm->avail_out = (unsigned)(out < end ? 257 + (end - out) : 257 - (out - end)); state->hold = hold; state->bits = bits; return; } /* inflate_fast() speedups that turned out slower (on a PowerPC G3 750CXe): - Using bit fields for code structure - Different op definition to avoid & for extra bits (do & for table bits) - Three separate decoding do-loops for direct, window, and write == 0 - Special case for distance > 1 copies to do overlapped load and store copy - Explicit branch predictions (based on measured branch probabilities) - Deferring match copy and interspersed it with decoding subsequent codes - Swapping literal/length else - Swapping window/direct else - Larger unrolled copy loops (three is about right) - Moving len -= 3 statement into middle of loop */ #endif /* !ASMINF */ |
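/*
 * Illustrative sketch (not from the zlib sources): the overlapping LZ77 match
 * copy that the dist == 1 / dist == 2 special case above optimizes.  When the
 * distance is smaller than the remaining length, bytes written by the copy
 * become source bytes for later iterations, so the copy must proceed byte by
 * byte; a plain memcpy()/memmove() would not replicate the pattern.  Names
 * are hypothetical.
 */
static void lz77_match_copy(unsigned char *out, unsigned dist, unsigned len)
{
	const unsigned char *from = out - dist;

	while (len--)
		*out++ = *from++;	/* repeats the last `dist` output bytes */
}

/*
 * Example: with out[-1] == 'A', dist == 1 and len == 4 this emits "AAAA";
 * the 16-bit pattern path above produces the same result two bytes at a time.
 */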
| 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 | /* BNEP implementation for Linux Bluetooth stack (BlueZ). Copyright (C) 2001-2002 Inventel Systemes Written 2001-2002 by Clément Moreau <clement.moreau@inventel.fr> David Libault <david.libault@inventel.fr> Copyright (C) 2002 Maxim Krasnyansky <maxk@qualcomm.com> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) AND AUTHOR(S) BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. ALL LIABILITY, INCLUDING LIABILITY FOR INFRINGEMENT OF ANY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS, RELATING TO USE OF THIS SOFTWARE IS DISCLAIMED. */ #include <linux/etherdevice.h> #include <net/bluetooth/bluetooth.h> #include <net/bluetooth/hci_core.h> #include <net/bluetooth/l2cap.h> #include "bnep.h" #define BNEP_TX_QUEUE_LEN 20 static int bnep_net_open(struct net_device *dev) { netif_start_queue(dev); return 0; } static int bnep_net_close(struct net_device *dev) { netif_stop_queue(dev); return 0; } static void bnep_net_set_mc_list(struct net_device *dev) { #ifdef CONFIG_BT_BNEP_MC_FILTER struct bnep_session *s = netdev_priv(dev); struct sock *sk = s->sock->sk; struct bnep_set_filter_req *r; struct sk_buff *skb; int size; BT_DBG("%s mc_count %d", dev->name, netdev_mc_count(dev)); size = sizeof(*r) + (BNEP_MAX_MULTICAST_FILTERS + 1) * ETH_ALEN * 2; skb = alloc_skb(size, GFP_ATOMIC); if (!skb) { BT_ERR("%s Multicast list allocation failed", dev->name); return; } r = (void *) skb->data; __skb_put(skb, sizeof(*r)); r->type = BNEP_CONTROL; r->ctrl = BNEP_FILTER_MULTI_ADDR_SET; if (dev->flags & (IFF_PROMISC | IFF_ALLMULTI)) { u8 start[ETH_ALEN] = { 0x01 }; /* Request all addresses */ __skb_put_data(skb, start, ETH_ALEN); __skb_put_data(skb, dev->broadcast, ETH_ALEN); r->len = htons(ETH_ALEN * 2); } else { struct netdev_hw_addr *ha; int i, len = skb->len; if (dev->flags & IFF_BROADCAST) { __skb_put_data(skb, dev->broadcast, ETH_ALEN); __skb_put_data(skb, dev->broadcast, ETH_ALEN); } /* FIXME: We should group addresses here. 
*/ i = 0; netdev_for_each_mc_addr(ha, dev) { if (i == BNEP_MAX_MULTICAST_FILTERS) break; __skb_put_data(skb, ha->addr, ETH_ALEN); __skb_put_data(skb, ha->addr, ETH_ALEN); i++; } r->len = htons(skb->len - len); } skb_queue_tail(&sk->sk_write_queue, skb); wake_up_interruptible(sk_sleep(sk)); #endif } static int bnep_net_set_mac_addr(struct net_device *dev, void *arg) { BT_DBG("%s", dev->name); return 0; } static void bnep_net_timeout(struct net_device *dev, unsigned int txqueue) { BT_DBG("net_timeout"); netif_wake_queue(dev); } #ifdef CONFIG_BT_BNEP_MC_FILTER static int bnep_net_mc_filter(struct sk_buff *skb, struct bnep_session *s) { struct ethhdr *eh = (void *) skb->data; if ((eh->h_dest[0] & 1) && !test_bit(bnep_mc_hash(eh->h_dest), (ulong *) &s->mc_filter)) return 1; return 0; } #endif #ifdef CONFIG_BT_BNEP_PROTO_FILTER /* Determine ether protocol. Based on eth_type_trans. */ static u16 bnep_net_eth_proto(struct sk_buff *skb) { struct ethhdr *eh = (void *) skb->data; u16 proto = ntohs(eh->h_proto); if (proto >= ETH_P_802_3_MIN) return proto; if (get_unaligned((__be16 *) skb->data) == htons(0xFFFF)) return ETH_P_802_3; return ETH_P_802_2; } static int bnep_net_proto_filter(struct sk_buff *skb, struct bnep_session *s) { u16 proto = bnep_net_eth_proto(skb); struct bnep_proto_filter *f = s->proto_filter; int i; for (i = 0; i < BNEP_MAX_PROTO_FILTERS && f[i].end; i++) { if (proto >= f[i].start && proto <= f[i].end) return 0; } BT_DBG("BNEP: filtered skb %p, proto 0x%.4x", skb, proto); return 1; } #endif static netdev_tx_t bnep_net_xmit(struct sk_buff *skb, struct net_device *dev) { struct bnep_session *s = netdev_priv(dev); struct sock *sk = s->sock->sk; BT_DBG("skb %p, dev %p", skb, dev); #ifdef CONFIG_BT_BNEP_MC_FILTER if (bnep_net_mc_filter(skb, s)) { kfree_skb(skb); return NETDEV_TX_OK; } #endif #ifdef CONFIG_BT_BNEP_PROTO_FILTER if (bnep_net_proto_filter(skb, s)) { kfree_skb(skb); return NETDEV_TX_OK; } #endif /* * We cannot send L2CAP packets from here as we are potentially in a bh. * So we have to queue them and wake up session thread which is sleeping * on the sk_sleep(sk). */ netif_trans_update(dev); skb_queue_tail(&sk->sk_write_queue, skb); wake_up_interruptible(sk_sleep(sk)); if (skb_queue_len(&sk->sk_write_queue) >= BNEP_TX_QUEUE_LEN) { BT_DBG("tx queue is full"); /* Stop queuing. * Session thread will do netif_wake_queue() */ netif_stop_queue(dev); } return NETDEV_TX_OK; } static const struct net_device_ops bnep_netdev_ops = { .ndo_open = bnep_net_open, .ndo_stop = bnep_net_close, .ndo_start_xmit = bnep_net_xmit, .ndo_validate_addr = eth_validate_addr, .ndo_set_rx_mode = bnep_net_set_mc_list, .ndo_set_mac_address = bnep_net_set_mac_addr, .ndo_tx_timeout = bnep_net_timeout, }; void bnep_net_setup(struct net_device *dev) { eth_broadcast_addr(dev->broadcast); dev->addr_len = ETH_ALEN; ether_setup(dev); dev->min_mtu = 0; dev->max_mtu = ETH_MAX_MTU; dev->priv_flags &= ~IFF_TX_SKB_SHARING; dev->netdev_ops = &bnep_netdev_ops; dev->watchdog_timeo = HZ * 2; } |
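/*
 * Illustrative sketch (not part of the BNEP sources): the transmit
 * backpressure pattern used by bnep_net_xmit() above, reduced to its
 * essentials.  The xmit path only queues the skb and stops the netdev queue
 * once the backlog reaches a threshold; a consumer context (for BNEP, the
 * session thread) drains the queue and restarts transmission.  All names and
 * the threshold are hypothetical.
 */
#include <linux/netdevice.h>
#include <linux/skbuff.h>

#define EXAMPLE_TX_QUEUE_LEN 20

static netdev_tx_t example_xmit(struct sk_buff *skb, struct net_device *dev,
				struct sk_buff_head *txq)
{
	skb_queue_tail(txq, skb);			/* hand off to the consumer */
	if (skb_queue_len(txq) >= EXAMPLE_TX_QUEUE_LEN)
		netif_stop_queue(dev);			/* apply backpressure */
	return NETDEV_TX_OK;
}

static void example_tx_drain(struct net_device *dev, struct sk_buff_head *txq)
{
	struct sk_buff *skb;

	while ((skb = skb_dequeue(txq)) != NULL)
		consume_skb(skb);			/* real code would transmit it */

	if (netif_queue_stopped(dev))
		netif_wake_queue(dev);			/* let the stack send again */
}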
| 832 838 832 834 834 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 | // SPDX-License-Identifier: GPL-2.0 #include <linux/slab.h> #include <linux/kernel.h> #include <linux/bitops.h> #include <linux/cpumask.h> #include <linux/export.h> #include <linux/memblock.h> #include <linux/numa.h> /* These are not inline because of header tangles. */ #ifdef CONFIG_CPUMASK_OFFSTACK /** * alloc_cpumask_var_node - allocate a struct cpumask on a given node * @mask: pointer to cpumask_var_t where the cpumask is returned * @flags: GFP_ flags * @node: memory node from which to allocate or %NUMA_NO_NODE * * Only defined when CONFIG_CPUMASK_OFFSTACK=y, otherwise is * a nop returning a constant 1 (in <linux/cpumask.h>). * * Return: TRUE if memory allocation succeeded, FALSE otherwise. * * In addition, mask will be NULL if this fails. Note that gcc is * usually smart enough to know that mask can never be NULL if * CONFIG_CPUMASK_OFFSTACK=n, so does code elimination in that case * too. */ bool alloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags, int node) { *mask = kmalloc_node(cpumask_size(), flags, node); #ifdef CONFIG_DEBUG_PER_CPU_MAPS if (!*mask) { printk(KERN_ERR "=> alloc_cpumask_var: failed!\n"); dump_stack(); } #endif return *mask != NULL; } EXPORT_SYMBOL(alloc_cpumask_var_node); /** * alloc_bootmem_cpumask_var - allocate a struct cpumask from the bootmem arena. * @mask: pointer to cpumask_var_t where the cpumask is returned * * Only defined when CONFIG_CPUMASK_OFFSTACK=y, otherwise is * a nop (in <linux/cpumask.h>). * Either returns an allocated (zero-filled) cpumask, or causes the * system to panic. */ void __init alloc_bootmem_cpumask_var(cpumask_var_t *mask) { *mask = memblock_alloc_or_panic(cpumask_size(), SMP_CACHE_BYTES); } /** * free_cpumask_var - frees memory allocated for a struct cpumask. * @mask: cpumask to free * * This is safe on a NULL mask. */ void free_cpumask_var(cpumask_var_t mask) { kfree(mask); } EXPORT_SYMBOL(free_cpumask_var); /** * free_bootmem_cpumask_var - frees result of alloc_bootmem_cpumask_var * @mask: cpumask to free */ void __init free_bootmem_cpumask_var(cpumask_var_t mask) { memblock_free(mask, cpumask_size()); } #endif /** * cpumask_local_spread - select the i'th cpu based on NUMA distances * @i: index number * @node: local numa_node * * Return: online CPU according to a numa aware policy; local cpus are returned * first, followed by non-local ones, then it wraps around. * * For those who wants to enumerate all CPUs based on their NUMA distances, * i.e. call this function in a loop, like: * * for (i = 0; i < num_online_cpus(); i++) { * cpu = cpumask_local_spread(i, node); * do_something(cpu); * } * * There's a better alternative based on for_each()-like iterators: * * for_each_numa_hop_mask(mask, node) { * for_each_cpu_andnot(cpu, mask, prev) * do_something(cpu); * prev = mask; * } * * It's simpler and more verbose than above. 
Complexity of iterator-based * enumeration is O(sched_domains_numa_levels * nr_cpu_ids), while * cpumask_local_spread() when called for each cpu is * O(sched_domains_numa_levels * nr_cpu_ids * log(nr_cpu_ids)). */ unsigned int cpumask_local_spread(unsigned int i, int node) { unsigned int cpu; /* Wrap: we always want a cpu. */ i %= num_online_cpus(); cpu = sched_numa_find_nth_cpu(cpu_online_mask, i, node); WARN_ON(cpu >= nr_cpu_ids); return cpu; } EXPORT_SYMBOL(cpumask_local_spread); static DEFINE_PER_CPU(int, distribute_cpu_mask_prev); /** * cpumask_any_and_distribute - Return an arbitrary cpu within src1p & src2p. * @src1p: first &cpumask for intersection * @src2p: second &cpumask for intersection * * Iterated calls using the same srcp1 and srcp2 will be distributed within * their intersection. * * Return: >= nr_cpu_ids if the intersection is empty. */ unsigned int cpumask_any_and_distribute(const struct cpumask *src1p, const struct cpumask *src2p) { unsigned int next, prev; /* NOTE: our first selection will skip 0. */ prev = __this_cpu_read(distribute_cpu_mask_prev); next = cpumask_next_and_wrap(prev, src1p, src2p); if (next < nr_cpu_ids) __this_cpu_write(distribute_cpu_mask_prev, next); return next; } EXPORT_SYMBOL(cpumask_any_and_distribute); /** * cpumask_any_distribute - Return an arbitrary cpu from srcp * @srcp: &cpumask for selection * * Return: >= nr_cpu_ids if the intersection is empty. */ unsigned int cpumask_any_distribute(const struct cpumask *srcp) { unsigned int next, prev; /* NOTE: our first selection will skip 0. */ prev = __this_cpu_read(distribute_cpu_mask_prev); next = cpumask_next_wrap(prev, srcp); if (next < nr_cpu_ids) __this_cpu_write(distribute_cpu_mask_prev, next); return next; } EXPORT_SYMBOL(cpumask_any_distribute); |
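/*
 * Illustrative sketch (not part of lib/cpumask.c): the enumeration pattern
 * described in the cpumask_local_spread() kerneldoc above, e.g. a driver
 * spreading its queue interrupts over CPUs while preferring those close to
 * its NUMA node.  Function and parameter names are hypothetical.
 */
#include <linux/cpumask.h>

static void example_spread_queues(int node, unsigned int nr_queues,
				  unsigned int *queue_cpu)
{
	unsigned int q;

	for (q = 0; q < nr_queues; q++)
		/* q'th closest online CPU to @node, wrapping when exhausted */
		queue_cpu[q] = cpumask_local_spread(q, node);
}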
| 2 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 | // SPDX-License-Identifier: GPL-2.0-or-later /* * IPVS: Shortest Expected Delay scheduling module * * Authors: Wensong Zhang <wensong@linuxvirtualserver.org> * * Changes: */ /* * The SED algorithm attempts to minimize each job's expected delay until * completion. The expected delay that the job will experience is * (Ci + 1) / Ui if sent to the ith server, in which Ci is the number of * jobs on the ith server and Ui is the fixed service rate (weight) of * the ith server. The SED algorithm adopts a greedy policy that each does * what is in its own best interest, i.e. to join the queue which would * minimize its expected delay of completion. * * See the following paper for more information: * A. Weinrib and S. Shenker, Greed is not enough: Adaptive load sharing * in large heterogeneous systems. In Proceedings IEEE INFOCOM'88, * pages 986-994, 1988. * * Thanks must go to Marko Buuri <marko@buuri.name> for talking SED to me. * * The difference between SED and WLC is that SED includes the incoming * job in the cost function (the increment of 1). SED may outperform * WLC, while scheduling big jobs under larger heterogeneous systems * (the server weight varies a lot). * */ #define KMSG_COMPONENT "IPVS" #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt #include <linux/module.h> #include <linux/kernel.h> #include <net/ip_vs.h> static inline int ip_vs_sed_dest_overhead(struct ip_vs_dest *dest) { /* * We only use the active connection number in the cost * calculation here. */ return atomic_read(&dest->activeconns) + 1; } /* * Weighted Least Connection scheduling */ static struct ip_vs_dest * ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb, struct ip_vs_iphdr *iph) { struct ip_vs_dest *dest, *least; int loh, doh; IP_VS_DBG(6, "%s(): Scheduling...\n", __func__); /* * We calculate the load of each dest server as follows: * (server expected overhead) / dest->weight * * Remember -- no floats in kernel mode!!! * The comparison of h1*w2 > h2*w1 is equivalent to that of * h1/w1 > h2/w2 * if every weight is larger than zero. * * The server with weight=0 is quiesced and will not receive any * new connections. */ list_for_each_entry_rcu(dest, &svc->destinations, n_list) { if (!(dest->flags & IP_VS_DEST_F_OVERLOAD) && atomic_read(&dest->weight) > 0) { least = dest; loh = ip_vs_sed_dest_overhead(least); goto nextstage; } } ip_vs_scheduler_err(svc, "no destination available"); return NULL; /* * Find the destination with the least load. 
*/ nextstage: list_for_each_entry_continue_rcu(dest, &svc->destinations, n_list) { if (dest->flags & IP_VS_DEST_F_OVERLOAD) continue; doh = ip_vs_sed_dest_overhead(dest); if ((__s64)loh * atomic_read(&dest->weight) > (__s64)doh * atomic_read(&least->weight)) { least = dest; loh = doh; } } IP_VS_DBG_BUF(6, "SED: server %s:%u " "activeconns %d refcnt %d weight %d overhead %d\n", IP_VS_DBG_ADDR(least->af, &least->addr), ntohs(least->port), atomic_read(&least->activeconns), refcount_read(&least->refcnt), atomic_read(&least->weight), loh); return least; } static struct ip_vs_scheduler ip_vs_sed_scheduler = { .name = "sed", .refcnt = ATOMIC_INIT(0), .module = THIS_MODULE, .n_list = LIST_HEAD_INIT(ip_vs_sed_scheduler.n_list), .schedule = ip_vs_sed_schedule, }; static int __init ip_vs_sed_init(void) { return register_ip_vs_scheduler(&ip_vs_sed_scheduler); } static void __exit ip_vs_sed_cleanup(void) { unregister_ip_vs_scheduler(&ip_vs_sed_scheduler); synchronize_rcu(); } module_init(ip_vs_sed_init); module_exit(ip_vs_sed_cleanup); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("ipvs shortest expected delay scheduler"); |
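/*
 * Illustrative sketch (not part of the IPVS sources): the fixed-point
 * comparison trick used by ip_vs_sed_schedule() above.  Comparing the
 * expected delays (c1 + 1) / w1 and (c2 + 1) / w2 directly would need
 * floating point, which kernel code avoids, so both sides are
 * cross-multiplied instead (valid because weights are positive).  The 64-bit
 * casts guard against overflow for large connection counts and weights.
 * Names are hypothetical.
 */
#include <linux/types.h>

static inline bool example_sed_prefer_second(int c1, int w1, int c2, int w2)
{
	/* true if server 2 has the smaller expected delay (c + 1) / w */
	return (__s64)(c1 + 1) * w2 > (__s64)(c2 + 1) * w1;
}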
828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 | // SPDX-License-Identifier: GPL-2.0-only /* * Generic helpers for smp ipi calls * * (C) Jens Axboe <jens.axboe@oracle.com> 2008 */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/irq_work.h> #include <linux/rcupdate.h> #include <linux/rculist.h> #include <linux/kernel.h> #include <linux/export.h> #include <linux/percpu.h> #include <linux/init.h> #include <linux/interrupt.h> #include <linux/gfp.h> #include <linux/smp.h> #include <linux/cpu.h> #include <linux/sched.h> #include <linux/sched/idle.h> #include <linux/hypervisor.h> #include <linux/sched/clock.h> #include <linux/nmi.h> #include <linux/sched/debug.h> #include <linux/jump_label.h> #include <linux/string_choices.h> #include <trace/events/ipi.h> #define CREATE_TRACE_POINTS #include <trace/events/csd.h> #undef CREATE_TRACE_POINTS #include "smpboot.h" #include "sched/smp.h" #define CSD_TYPE(_csd) ((_csd)->node.u_flags & CSD_FLAG_TYPE_MASK) struct call_function_data { call_single_data_t __percpu *csd; cpumask_var_t cpumask; cpumask_var_t cpumask_ipi; }; static DEFINE_PER_CPU_ALIGNED(struct call_function_data, cfd_data); static DEFINE_PER_CPU_SHARED_ALIGNED(struct llist_head, call_single_queue); static DEFINE_PER_CPU(atomic_t, trigger_backtrace) = ATOMIC_INIT(1); static void __flush_smp_call_function_queue(bool warn_cpu_offline); int smpcfd_prepare_cpu(unsigned int cpu) { struct call_function_data *cfd = &per_cpu(cfd_data, cpu); if (!zalloc_cpumask_var_node(&cfd->cpumask, GFP_KERNEL, cpu_to_node(cpu))) return -ENOMEM; if (!zalloc_cpumask_var_node(&cfd->cpumask_ipi, GFP_KERNEL, cpu_to_node(cpu))) { free_cpumask_var(cfd->cpumask); return -ENOMEM; } cfd->csd = alloc_percpu(call_single_data_t); if (!cfd->csd) { free_cpumask_var(cfd->cpumask); free_cpumask_var(cfd->cpumask_ipi); return -ENOMEM; } return 0; } int smpcfd_dead_cpu(unsigned int cpu) { struct call_function_data *cfd = &per_cpu(cfd_data, cpu); free_cpumask_var(cfd->cpumask); free_cpumask_var(cfd->cpumask_ipi); free_percpu(cfd->csd); return 0; } int smpcfd_dying_cpu(unsigned int cpu) { /* * The IPIs for the smp-call-function 
callbacks queued by other * CPUs might arrive late, either due to hardware latencies or * because this CPU disabled interrupts (inside stop-machine) * before the IPIs were sent. So flush out any pending callbacks * explicitly (without waiting for the IPIs to arrive), to * ensure that the outgoing CPU doesn't go offline with work * still pending. */ __flush_smp_call_function_queue(false); irq_work_run(); return 0; } void __init call_function_init(void) { int i; for_each_possible_cpu(i) init_llist_head(&per_cpu(call_single_queue, i)); smpcfd_prepare_cpu(smp_processor_id()); } static __always_inline void send_call_function_single_ipi(int cpu) { if (call_function_single_prep_ipi(cpu)) { trace_ipi_send_cpu(cpu, _RET_IP_, generic_smp_call_function_single_interrupt); arch_send_call_function_single_ipi(cpu); } } static __always_inline void send_call_function_ipi_mask(struct cpumask *mask) { trace_ipi_send_cpumask(mask, _RET_IP_, generic_smp_call_function_single_interrupt); arch_send_call_function_ipi_mask(mask); } static __always_inline void csd_do_func(smp_call_func_t func, void *info, call_single_data_t *csd) { trace_csd_function_entry(func, csd); func(info); trace_csd_function_exit(func, csd); } #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG static DEFINE_STATIC_KEY_MAYBE(CONFIG_CSD_LOCK_WAIT_DEBUG_DEFAULT, csdlock_debug_enabled); /* * Parse the csdlock_debug= kernel boot parameter. * * If you need to restore the old "ext" value that once provided * additional debugging information, reapply the following commits: * * de7b09ef658d ("locking/csd_lock: Prepare more CSD lock debugging") * a5aabace5fb8 ("locking/csd_lock: Add more data to CSD lock debugging") */ static int __init csdlock_debug(char *str) { int ret; unsigned int val = 0; ret = get_option(&str, &val); if (ret) { if (val) static_branch_enable(&csdlock_debug_enabled); else static_branch_disable(&csdlock_debug_enabled); } return 1; } __setup("csdlock_debug=", csdlock_debug); static DEFINE_PER_CPU(call_single_data_t *, cur_csd); static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func); static DEFINE_PER_CPU(void *, cur_csd_info); static ulong csd_lock_timeout = 5000; /* CSD lock timeout in milliseconds. */ module_param(csd_lock_timeout, ulong, 0644); static int panic_on_ipistall; /* CSD panic timeout in milliseconds, 300000 for five minutes. */ module_param(panic_on_ipistall, int, 0644); static atomic_t csd_bug_count = ATOMIC_INIT(0); /* Record current CSD work for current CPU, NULL to erase. */ static void __csd_lock_record(call_single_data_t *csd) { if (!csd) { smp_mb(); /* NULL cur_csd after unlock. */ __this_cpu_write(cur_csd, NULL); return; } __this_cpu_write(cur_csd_func, csd->func); __this_cpu_write(cur_csd_info, csd->info); smp_wmb(); /* func and info before csd. */ __this_cpu_write(cur_csd, csd); smp_mb(); /* Update cur_csd before function call. */ /* Or before unlock, as the case may be. */ } static __always_inline void csd_lock_record(call_single_data_t *csd) { if (static_branch_unlikely(&csdlock_debug_enabled)) __csd_lock_record(csd); } static int csd_lock_wait_getcpu(call_single_data_t *csd) { unsigned int csd_type; csd_type = CSD_TYPE(csd); if (csd_type == CSD_TYPE_ASYNC || csd_type == CSD_TYPE_SYNC) return csd->node.dst; /* Other CSD_TYPE_ values might not have ->dst. */ return -1; } static atomic_t n_csd_lock_stuck; /** * csd_lock_is_stuck - Has a CSD-lock acquisition been stuck too long? * * Returns @true if a CSD-lock acquisition is stuck and has been stuck * long enough for a "non-responsive CSD lock" message to be printed. 
*/ bool csd_lock_is_stuck(void) { return !!atomic_read(&n_csd_lock_stuck); } /* * Complain if too much time spent waiting. Note that only * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU, * so waiting on other types gets much less information. */ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id, unsigned long *nmessages) { int cpu = -1; int cpux; bool firsttime; u64 ts2, ts_delta; call_single_data_t *cpu_cur_csd; unsigned int flags = READ_ONCE(csd->node.u_flags); unsigned long long csd_lock_timeout_ns = csd_lock_timeout * NSEC_PER_MSEC; if (!(flags & CSD_FLAG_LOCK)) { if (!unlikely(*bug_id)) return true; cpu = csd_lock_wait_getcpu(csd); pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n", *bug_id, raw_smp_processor_id(), cpu); atomic_dec(&n_csd_lock_stuck); return true; } ts2 = ktime_get_mono_fast_ns(); /* How long since we last checked for a stuck CSD lock.*/ ts_delta = ts2 - *ts1; if (likely(ts_delta <= csd_lock_timeout_ns * (*nmessages + 1) * (!*nmessages ? 1 : (ilog2(num_online_cpus()) / 2 + 1)) || csd_lock_timeout_ns == 0)) return false; if (ts0 > ts2) { /* Our own sched_clock went backward; don't blame another CPU. */ ts_delta = ts0 - ts2; pr_alert("sched_clock on CPU %d went backward by %llu ns\n", raw_smp_processor_id(), ts_delta); *ts1 = ts2; return false; } firsttime = !*bug_id; if (firsttime) *bug_id = atomic_inc_return(&csd_bug_count); cpu = csd_lock_wait_getcpu(csd); if (WARN_ONCE(cpu < 0 || cpu >= nr_cpu_ids, "%s: cpu = %d\n", __func__, cpu)) cpux = 0; else cpux = cpu; cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */ /* How long since this CSD lock was stuck. */ ts_delta = ts2 - ts0; pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n", firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta, cpu, csd->func, csd->info); (*nmessages)++; if (firsttime) atomic_inc(&n_csd_lock_stuck); /* * If the CSD lock is still stuck after 5 minutes, it is unlikely * to become unstuck. Use a signed comparison to avoid triggering * on underflows when the TSC is out of sync between sockets. */ BUG_ON(panic_on_ipistall > 0 && (s64)ts_delta > ((s64)panic_on_ipistall * NSEC_PER_MSEC)); if (cpu_cur_csd && csd != cpu_cur_csd) { pr_alert("\tcsd: CSD lock (#%d) handling prior %pS(%ps) request.\n", *bug_id, READ_ONCE(per_cpu(cur_csd_func, cpux)), READ_ONCE(per_cpu(cur_csd_info, cpux))); } else { pr_alert("\tcsd: CSD lock (#%d) %s.\n", *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request"); } if (cpu >= 0) { if (atomic_cmpxchg_acquire(&per_cpu(trigger_backtrace, cpu), 1, 0)) dump_cpu_task(cpu); if (!cpu_cur_csd) { pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu); arch_send_call_function_single_ipi(cpu); } } if (firsttime) dump_stack(); *ts1 = ts2; return false; } /* * csd_lock/csd_unlock used to serialize access to per-cpu csd resources * * For non-synchronous ipi calls the csd can still be in use by the * previous function call. For multi-cpu calls its even more interesting * as we'll have to ensure no other cpu is observing our csd. 
*/ static void __csd_lock_wait(call_single_data_t *csd) { unsigned long nmessages = 0; int bug_id = 0; u64 ts0, ts1; ts1 = ts0 = ktime_get_mono_fast_ns(); for (;;) { if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages)) break; cpu_relax(); } smp_acquire__after_ctrl_dep(); } static __always_inline void csd_lock_wait(call_single_data_t *csd) { if (static_branch_unlikely(&csdlock_debug_enabled)) { __csd_lock_wait(csd); return; } smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK)); } #else static void csd_lock_record(call_single_data_t *csd) { } static __always_inline void csd_lock_wait(call_single_data_t *csd) { smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK)); } #endif static __always_inline void csd_lock(call_single_data_t *csd) { csd_lock_wait(csd); csd->node.u_flags |= CSD_FLAG_LOCK; /* * prevent CPU from reordering the above assignment * to ->flags with any subsequent assignments to other * fields of the specified call_single_data_t structure: */ smp_wmb(); } static __always_inline void csd_unlock(call_single_data_t *csd) { WARN_ON(!(csd->node.u_flags & CSD_FLAG_LOCK)); /* * ensure we're all done before releasing data: */ smp_store_release(&csd->node.u_flags, 0); } static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data); void __smp_call_single_queue(int cpu, struct llist_node *node) { /* * We have to check the type of the CSD before queueing it, because * once queued it can have its flags cleared by * flush_smp_call_function_queue() * even if we haven't sent the smp_call IPI yet (e.g. the stopper * executes migration_cpu_stop() on the remote CPU). */ if (trace_csd_queue_cpu_enabled()) { call_single_data_t *csd; smp_call_func_t func; csd = container_of(node, call_single_data_t, node.llist); func = CSD_TYPE(csd) == CSD_TYPE_TTWU ? sched_ttwu_pending : csd->func; trace_csd_queue_cpu(cpu, _RET_IP_, func, csd); } /* * The list addition should be visible to the target CPU when it pops * the head of the list to pull the entry off it in the IPI handler * because of normal cache coherency rules implied by the underlying * llist ops. * * If IPIs can go out of order to the cache coherency protocol * in an architecture, sufficient synchronisation should be added * to arch code to make it appear to obey cache coherency WRT * locking and barrier primitives. Generic code isn't really * equipped to do the right thing... */ if (llist_add(node, &per_cpu(call_single_queue, cpu))) send_call_function_single_ipi(cpu); } /* * Insert a previously allocated call_single_data_t element * for execution on the given CPU. data must already have * ->func, ->info, and ->flags set. */ static int generic_exec_single(int cpu, call_single_data_t *csd) { if (cpu == smp_processor_id()) { smp_call_func_t func = csd->func; void *info = csd->info; unsigned long flags; /* * We can unlock early even for the synchronous on-stack case, * since we're doing this from the same CPU.. */ csd_lock_record(csd); csd_unlock(csd); local_irq_save(flags); csd_do_func(func, info, NULL); csd_lock_record(NULL); local_irq_restore(flags); return 0; } if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu)) { csd_unlock(csd); return -ENXIO; } __smp_call_single_queue(cpu, &csd->node.llist); return 0; } /** * generic_smp_call_function_single_interrupt - Execute SMP IPI callbacks * * Invoked by arch to handle an IPI for call function single. * Must be called with interrupts disabled. 
*/ void generic_smp_call_function_single_interrupt(void) { __flush_smp_call_function_queue(true); } /** * __flush_smp_call_function_queue - Flush pending smp-call-function callbacks * * @warn_cpu_offline: If set to 'true', warn if callbacks were queued on an * offline CPU. Skip this check if set to 'false'. * * Flush any pending smp-call-function callbacks queued on this CPU. This is * invoked by the generic IPI handler, as well as by a CPU about to go offline, * to ensure that all pending IPI callbacks are run before it goes completely * offline. * * Loop through the call_single_queue and run all the queued callbacks. * Must be called with interrupts disabled. */ static void __flush_smp_call_function_queue(bool warn_cpu_offline) { call_single_data_t *csd, *csd_next; struct llist_node *entry, *prev; struct llist_head *head; static bool warned; atomic_t *tbt; lockdep_assert_irqs_disabled(); /* Allow waiters to send backtrace NMI from here onwards */ tbt = this_cpu_ptr(&trigger_backtrace); atomic_set_release(tbt, 1); head = this_cpu_ptr(&call_single_queue); entry = llist_del_all(head); entry = llist_reverse_order(entry); /* There shouldn't be any pending callbacks on an offline CPU. */ if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id()) && !warned && entry != NULL)) { warned = true; WARN(1, "IPI on offline CPU %d\n", smp_processor_id()); /* * We don't have to use the _safe() variant here * because we are not invoking the IPI handlers yet. */ llist_for_each_entry(csd, entry, node.llist) { switch (CSD_TYPE(csd)) { case CSD_TYPE_ASYNC: case CSD_TYPE_SYNC: case CSD_TYPE_IRQ_WORK: pr_warn("IPI callback %pS sent to offline CPU\n", csd->func); break; case CSD_TYPE_TTWU: pr_warn("IPI task-wakeup sent to offline CPU\n"); break; default: pr_warn("IPI callback, unknown type %d, sent to offline CPU\n", CSD_TYPE(csd)); break; } } } /* * First; run all SYNC callbacks, people are waiting for us. */ prev = NULL; llist_for_each_entry_safe(csd, csd_next, entry, node.llist) { /* Do we wait until *after* callback? */ if (CSD_TYPE(csd) == CSD_TYPE_SYNC) { smp_call_func_t func = csd->func; void *info = csd->info; if (prev) { prev->next = &csd_next->node.llist; } else { entry = &csd_next->node.llist; } csd_lock_record(csd); csd_do_func(func, info, csd); csd_unlock(csd); csd_lock_record(NULL); } else { prev = &csd->node.llist; } } if (!entry) return; /* * Second; run all !SYNC callbacks. */ prev = NULL; llist_for_each_entry_safe(csd, csd_next, entry, node.llist) { int type = CSD_TYPE(csd); if (type != CSD_TYPE_TTWU) { if (prev) { prev->next = &csd_next->node.llist; } else { entry = &csd_next->node.llist; } if (type == CSD_TYPE_ASYNC) { smp_call_func_t func = csd->func; void *info = csd->info; csd_lock_record(csd); csd_unlock(csd); csd_do_func(func, info, csd); csd_lock_record(NULL); } else if (type == CSD_TYPE_IRQ_WORK) { irq_work_single(csd); } } else { prev = &csd->node.llist; } } /* * Third; only CSD_TYPE_TTWU is left, issue those. */ if (entry) { csd = llist_entry(entry, typeof(*csd), node.llist); csd_do_func(sched_ttwu_pending, entry, csd); } } /** * flush_smp_call_function_queue - Flush pending smp-call-function callbacks * from task context (idle, migration thread) * * When TIF_POLLING_NRFLAG is supported and a CPU is in idle and has it * set, then remote CPUs can avoid sending IPIs and wake the idle CPU by * setting TIF_NEED_RESCHED. The idle task on the woken up CPU has to * handle queued SMP function calls before scheduling. 
* * The migration thread has to ensure that an eventually pending wakeup has * been handled before it migrates a task. */ void flush_smp_call_function_queue(void) { unsigned int was_pending; unsigned long flags; if (llist_empty(this_cpu_ptr(&call_single_queue))) return; local_irq_save(flags); /* Get the already pending soft interrupts for RT enabled kernels */ was_pending = local_softirq_pending(); __flush_smp_call_function_queue(true); if (local_softirq_pending()) do_softirq_post_smp_call_flush(was_pending); local_irq_restore(flags); } /* * smp_call_function_single - Run a function on a specific CPU * @func: The function to run. This must be fast and non-blocking. * @info: An arbitrary pointer to pass to the function. * @wait: If true, wait until function has completed on other CPUs. * * Returns 0 on success, else a negative status code. */ int smp_call_function_single(int cpu, smp_call_func_t func, void *info, int wait) { call_single_data_t *csd; call_single_data_t csd_stack = { .node = { .u_flags = CSD_FLAG_LOCK | CSD_TYPE_SYNC, }, }; int this_cpu; int err; /* * prevent preemption and reschedule on another processor, * as well as CPU removal */ this_cpu = get_cpu(); /* * Can deadlock when called with interrupts disabled. * We allow cpu's that are not yet online though, as no one else can * send smp call function interrupt to this cpu and as such deadlocks * can't happen. */ WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled() && !oops_in_progress); /* * When @wait we can deadlock when we interrupt between llist_add() and * arch_send_call_function_ipi*(); when !@wait we can deadlock due to * csd_lock() on because the interrupt context uses the same csd * storage. */ WARN_ON_ONCE(!in_task()); csd = &csd_stack; if (!wait) { csd = this_cpu_ptr(&csd_data); csd_lock(csd); } csd->func = func; csd->info = info; #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG csd->node.src = smp_processor_id(); csd->node.dst = cpu; #endif err = generic_exec_single(cpu, csd); if (wait) csd_lock_wait(csd); put_cpu(); return err; } EXPORT_SYMBOL(smp_call_function_single); /** * smp_call_function_single_async() - Run an asynchronous function on a * specific CPU. * @cpu: The CPU to run on. * @csd: Pre-allocated and setup data structure * * Like smp_call_function_single(), but the call is asynchonous and * can thus be done from contexts with disabled interrupts. * * The caller passes his own pre-allocated data structure * (ie: embedded in an object) and is responsible for synchronizing it * such that the IPIs performed on the @csd are strictly serialized. * * If the function is called with one csd which has not yet been * processed by previous call to smp_call_function_single_async(), the * function will return immediately with -EBUSY showing that the csd * object is still in progress. * * NOTE: Be careful, there is unfortunately no current debugging facility to * validate the correctness of this serialization. * * Return: %0 on success or negative errno value on error */ int smp_call_function_single_async(int cpu, call_single_data_t *csd) { int err = 0; preempt_disable(); if (csd->node.u_flags & CSD_FLAG_LOCK) { err = -EBUSY; goto out; } csd->node.u_flags = CSD_FLAG_LOCK; smp_wmb(); err = generic_exec_single(cpu, csd); out: preempt_enable(); return err; } EXPORT_SYMBOL_GPL(smp_call_function_single_async); /* * smp_call_function_any - Run a function on any of the given cpus * @mask: The mask of cpus it can run on. * @func: The function to run. This must be fast and non-blocking. 
* @info: An arbitrary pointer to pass to the function. * @wait: If true, wait until function has completed. * * Returns 0 on success, else a negative status code (if no cpus were online). * * Selection preference: * 1) current cpu if in @mask * 2) any cpu of current node if in @mask * 3) any other online cpu in @mask */ int smp_call_function_any(const struct cpumask *mask, smp_call_func_t func, void *info, int wait) { unsigned int cpu; const struct cpumask *nodemask; int ret; /* Try for same CPU (cheapest) */ cpu = get_cpu(); if (cpumask_test_cpu(cpu, mask)) goto call; /* Try for same node. */ nodemask = cpumask_of_node(cpu_to_node(cpu)); for (cpu = cpumask_first_and(nodemask, mask); cpu < nr_cpu_ids; cpu = cpumask_next_and(cpu, nodemask, mask)) { if (cpu_online(cpu)) goto call; } /* Any online will do: smp_call_function_single handles nr_cpu_ids. */ cpu = cpumask_any_and(mask, cpu_online_mask); call: ret = smp_call_function_single(cpu, func, info, wait); put_cpu(); return ret; } EXPORT_SYMBOL_GPL(smp_call_function_any); /* * Flags to be used as scf_flags argument of smp_call_function_many_cond(). * * %SCF_WAIT: Wait until function execution is completed * %SCF_RUN_LOCAL: Run also locally if local cpu is set in cpumask */ #define SCF_WAIT (1U << 0) #define SCF_RUN_LOCAL (1U << 1) static void smp_call_function_many_cond(const struct cpumask *mask, smp_call_func_t func, void *info, unsigned int scf_flags, smp_cond_func_t cond_func) { int cpu, last_cpu, this_cpu = smp_processor_id(); struct call_function_data *cfd; bool wait = scf_flags & SCF_WAIT; int nr_cpus = 0; bool run_remote = false; bool run_local = false; lockdep_assert_preemption_disabled(); /* * Can deadlock when called with interrupts disabled. * We allow cpu's that are not yet online though, as no one else can * send smp call function interrupt to this cpu and as such deadlocks * can't happen. */ if (cpu_online(this_cpu) && !oops_in_progress && !early_boot_irqs_disabled) lockdep_assert_irqs_enabled(); /* * When @wait we can deadlock when we interrupt between llist_add() and * arch_send_call_function_ipi*(); when !@wait we can deadlock due to * csd_lock() on because the interrupt context uses the same csd * storage. */ WARN_ON_ONCE(!in_task()); /* Check if we need local execution. */ if ((scf_flags & SCF_RUN_LOCAL) && cpumask_test_cpu(this_cpu, mask) && (!cond_func || cond_func(this_cpu, info))) run_local = true; /* Check if we need remote execution, i.e., any CPU excluding this one. */ cpu = cpumask_first_and(mask, cpu_online_mask); if (cpu == this_cpu) cpu = cpumask_next_and(cpu, mask, cpu_online_mask); if (cpu < nr_cpu_ids) run_remote = true; if (run_remote) { cfd = this_cpu_ptr(&cfd_data); cpumask_and(cfd->cpumask, mask, cpu_online_mask); __cpumask_clear_cpu(this_cpu, cfd->cpumask); cpumask_clear(cfd->cpumask_ipi); for_each_cpu(cpu, cfd->cpumask) { call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu); if (cond_func && !cond_func(cpu, info)) { __cpumask_clear_cpu(cpu, cfd->cpumask); continue; } csd_lock(csd); if (wait) csd->node.u_flags |= CSD_TYPE_SYNC; csd->func = func; csd->info = info; #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG csd->node.src = smp_processor_id(); csd->node.dst = cpu; #endif trace_csd_queue_cpu(cpu, _RET_IP_, func, csd); if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu))) { __cpumask_set_cpu(cpu, cfd->cpumask_ipi); nr_cpus++; last_cpu = cpu; } } /* * Choose the most efficient way to send an IPI. Note that the * number of CPUs might be zero due to concurrent changes to the * provided mask. 
*/ if (nr_cpus == 1) send_call_function_single_ipi(last_cpu); else if (likely(nr_cpus > 1)) send_call_function_ipi_mask(cfd->cpumask_ipi); } if (run_local) { unsigned long flags; local_irq_save(flags); csd_do_func(func, info, NULL); local_irq_restore(flags); } if (run_remote && wait) { for_each_cpu(cpu, cfd->cpumask) { call_single_data_t *csd; csd = per_cpu_ptr(cfd->csd, cpu); csd_lock_wait(csd); } } } /** * smp_call_function_many(): Run a function on a set of CPUs. * @mask: The set of cpus to run on (only runs on online subset). * @func: The function to run. This must be fast and non-blocking. * @info: An arbitrary pointer to pass to the function. * @wait: Bitmask that controls the operation. If %SCF_WAIT is set, wait * (atomically) until function has completed on other CPUs. If * %SCF_RUN_LOCAL is set, the function will also be run locally * if the local CPU is set in the @cpumask. * * If @wait is true, then returns once @func has returned. * * You must not call this function with disabled interrupts or from a * hardware interrupt handler or from a bottom half handler. Preemption * must be disabled when calling this function. */ void smp_call_function_many(const struct cpumask *mask, smp_call_func_t func, void *info, bool wait) { smp_call_function_many_cond(mask, func, info, wait * SCF_WAIT, NULL); } EXPORT_SYMBOL(smp_call_function_many); /** * smp_call_function(): Run a function on all other CPUs. * @func: The function to run. This must be fast and non-blocking. * @info: An arbitrary pointer to pass to the function. * @wait: If true, wait (atomically) until function has completed * on other CPUs. * * Returns 0. * * If @wait is true, then returns once @func has returned; otherwise * it returns just before the target cpu calls @func. * * You must not call this function with disabled interrupts or from a * hardware interrupt handler or from a bottom half handler. */ void smp_call_function(smp_call_func_t func, void *info, int wait) { preempt_disable(); smp_call_function_many(cpu_online_mask, func, info, wait); preempt_enable(); } EXPORT_SYMBOL(smp_call_function); /* Setup configured maximum number of CPUs to activate */ unsigned int setup_max_cpus = NR_CPUS; EXPORT_SYMBOL(setup_max_cpus); /* * Setup routine for controlling SMP activation * * Command-line option of "nosmp" or "maxcpus=0" will disable SMP * activation entirely (the MPS table probe still happens, though). * * Command-line option of "maxcpus=<NUM>", where <NUM> is an integer * greater than 0, limits the maximum number of CPUs activated in * SMP mode to <NUM>. 
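 *
 * For illustration only (hypothetical values, not part of this file), a boot
 * command line exercising these options could contain e.g.:
 *
 *   nosmp        - do not activate any secondary CPUs
 *   maxcpus=4    - bring up at most four CPUs ("maxcpus=0" behaves like "nosmp")
 *   nr_cpus=8    - hard limit on the number of possible CPU ids (see nrcpus() below)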
*/ void __weak __init arch_disable_smp_support(void) { } static int __init nosmp(char *str) { setup_max_cpus = 0; arch_disable_smp_support(); return 0; } early_param("nosmp", nosmp); /* this is hard limit */ static int __init nrcpus(char *str) { int nr_cpus; if (get_option(&str, &nr_cpus) && nr_cpus > 0 && nr_cpus < nr_cpu_ids) set_nr_cpu_ids(nr_cpus); return 0; } early_param("nr_cpus", nrcpus); static int __init maxcpus(char *str) { get_option(&str, &setup_max_cpus); if (setup_max_cpus == 0) arch_disable_smp_support(); return 0; } early_param("maxcpus", maxcpus); #if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS) /* Setup number of possible processor ids */ unsigned int nr_cpu_ids __read_mostly = NR_CPUS; EXPORT_SYMBOL(nr_cpu_ids); #endif /* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */ void __init setup_nr_cpu_ids(void) { set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1); } /* Called by boot processor to activate the rest. */ void __init smp_init(void) { int num_nodes, num_cpus; idle_threads_init(); cpuhp_threads_init(); pr_info("Bringing up secondary CPUs ...\n"); bringup_nonboot_cpus(setup_max_cpus); num_nodes = num_online_nodes(); num_cpus = num_online_cpus(); pr_info("Brought up %d node%s, %d CPU%s\n", num_nodes, str_plural(num_nodes), num_cpus, str_plural(num_cpus)); /* Any cleanup work */ smp_cpus_done(setup_max_cpus); } /* * on_each_cpu_cond(): Call a function on each processor for which * the supplied function cond_func returns true, optionally waiting * for all the required CPUs to finish. This may include the local * processor. * @cond_func: A callback function that is passed a cpu id and * the info parameter. The function is called * with preemption disabled. The function should * return a blooean value indicating whether to IPI * the specified CPU. * @func: The function to run on all applicable CPUs. * This must be fast and non-blocking. * @info: An arbitrary pointer to pass to both functions. * @wait: If true, wait (atomically) until function has * completed on other CPUs. * * Preemption is disabled to protect against CPUs going offline but not online. * CPUs going online during the call will not be seen or sent an IPI. * * You must not call this function with disabled interrupts or * from a hardware interrupt handler or from a bottom half handler. */ void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func, void *info, bool wait, const struct cpumask *mask) { unsigned int scf_flags = SCF_RUN_LOCAL; if (wait) scf_flags |= SCF_WAIT; preempt_disable(); smp_call_function_many_cond(mask, func, info, scf_flags, cond_func); preempt_enable(); } EXPORT_SYMBOL(on_each_cpu_cond_mask); static void do_nothing(void *unused) { } /** * kick_all_cpus_sync - Force all cpus out of idle * * Used to synchronize the update of pm_idle function pointer. It's * called after the pointer is updated and returns after the dummy * callback function has been executed on all cpus. The execution of * the function can only happen on the remote cpus after they have * left the idle function which had been called via pm_idle function * pointer. So it's guaranteed that nothing uses the previous pointer * anymore. 
*/ void kick_all_cpus_sync(void) { /* Make sure the change is visible before we kick the cpus */ smp_mb(); smp_call_function(do_nothing, NULL, 1); } EXPORT_SYMBOL_GPL(kick_all_cpus_sync); /** * wake_up_all_idle_cpus - break all cpus out of idle * wake_up_all_idle_cpus try to break all cpus which is in idle state even * including idle polling cpus, for non-idle cpus, we will do nothing * for them. */ void wake_up_all_idle_cpus(void) { int cpu; for_each_possible_cpu(cpu) { preempt_disable(); if (cpu != smp_processor_id() && cpu_online(cpu)) wake_up_if_idle(cpu); preempt_enable(); } } EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus); /** * struct smp_call_on_cpu_struct - Call a function on a specific CPU * @work: &work_struct * @done: &completion to signal * @func: function to call * @data: function's data argument * @ret: return value from @func * @cpu: target CPU (%-1 for any CPU) * * Used to call a function on a specific cpu and wait for it to return. * Optionally make sure the call is done on a specified physical cpu via vcpu * pinning in order to support virtualized environments. */ struct smp_call_on_cpu_struct { struct work_struct work; struct completion done; int (*func)(void *); void *data; int ret; int cpu; }; static void smp_call_on_cpu_callback(struct work_struct *work) { struct smp_call_on_cpu_struct *sscs; sscs = container_of(work, struct smp_call_on_cpu_struct, work); if (sscs->cpu >= 0) hypervisor_pin_vcpu(sscs->cpu); sscs->ret = sscs->func(sscs->data); if (sscs->cpu >= 0) hypervisor_pin_vcpu(-1); complete(&sscs->done); } int smp_call_on_cpu(unsigned int cpu, int (*func)(void *), void *par, bool phys) { struct smp_call_on_cpu_struct sscs = { .done = COMPLETION_INITIALIZER_ONSTACK(sscs.done), .func = func, .data = par, .cpu = phys ? cpu : -1, }; INIT_WORK_ONSTACK(&sscs.work, smp_call_on_cpu_callback); if (cpu >= nr_cpu_ids || !cpu_online(cpu)) return -ENXIO; queue_work_on(cpu, system_wq, &sscs.work); wait_for_completion(&sscs.done); destroy_work_on_stack(&sscs.work); return sscs.ret; } EXPORT_SYMBOL_GPL(smp_call_on_cpu); |
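For illustration, here is a minimal module-style sketch (hypothetical; the example_* names are not part of kernel/smp.c) of how the cross-CPU call API above is typically used, assuming CSD_INIT() from <linux/smp.h> is available:

/* Illustrative example only -- not part of kernel/smp.c. */
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/smp.h>

static unsigned int example_counter;	/* shared demo state; updates are not synchronized */

/* Runs on the target CPU in IPI (hardirq) context, so it must not sleep. */
static void example_ipi_func(void *info)
{
	unsigned int *counter = info;

	(*counter)++;
	pr_info("example: ran on CPU %d\n", smp_processor_id());
}

/* Caller-owned csd for the async variant; it must stay valid until the call completes. */
static call_single_data_t example_csd = CSD_INIT(example_ipi_func, &example_counter);

static int __init example_init(void)
{
	int cpu = cpumask_first(cpu_online_mask);

	/* Synchronous: returns only after example_ipi_func() has run on @cpu. */
	smp_call_function_single(cpu, example_ipi_func, &example_counter, 1);

	/*
	 * Asynchronous: queues example_csd and returns immediately, or
	 * -EBUSY if the previous invocation has not been processed yet.
	 */
	smp_call_function_single_async(cpu, &example_csd);

	/* Run the callback on every other online CPU and wait for completion. */
	smp_call_function(example_ipi_func, &example_counter, 1);

	return 0;
}
module_init(example_init);

MODULE_DESCRIPTION("Usage sketch for the SMP cross-call API");
MODULE_LICENSE("GPL");

The synchronous call returns only after the callback has completed on the target CPU, whereas the asynchronous variant merely queues the caller-owned csd; as the kerneldoc above notes, reusing a csd that has not yet been processed makes smp_call_function_single_async() return -EBUSY.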
// SPDX-License-Identifier: GPL-2.0 /* * Implement CPU time clocks for the POSIX clock interface. */ #include <linux/sched/signal.h> #include <linux/sched/cputime.h> #include <linux/posix-timers.h> #include <linux/errno.h> #include <linux/math64.h> #include <linux/uaccess.h> #include <linux/kernel_stat.h> #include <trace/events/timer.h> #include <linux/tick.h> #include <linux/workqueue.h> #include <linux/compat.h> #include <linux/sched/deadline.h> #include <linux/task_work.h> #include "posix-timers.h" static void posix_cpu_timer_rearm(struct k_itimer *timer); void posix_cputimers_group_init(struct posix_cputimers *pct, u64 cpu_limit) { posix_cputimers_init(pct); if (cpu_limit != RLIM_INFINITY) { pct->bases[CPUCLOCK_PROF].nextevt = cpu_limit * NSEC_PER_SEC; pct->timers_active = true; } } /* * Called after updating RLIMIT_CPU to run cpu timer and update * tsk->signal->posix_cputimers.bases[clock].nextevt expiration cache if * necessary. Needs siglock protection since other code may update the * expiration cache as well. * * Returns 0 on success, -ESRCH on failure. Can fail if the task is exiting and * we cannot lock_task_sighand. Cannot fail if task is current. */ int update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new) { u64 nsecs = rlim_new * NSEC_PER_SEC; unsigned long irq_fl; if (!lock_task_sighand(task, &irq_fl)) return -ESRCH; set_process_cpu_timer(task, CPUCLOCK_PROF, &nsecs, NULL); unlock_task_sighand(task, &irq_fl); return 0; } /* * Functions for validating access to tasks. */ static struct pid *pid_for_clock(const clockid_t clock, bool gettime) { const bool thread = !!CPUCLOCK_PERTHREAD(clock); const pid_t upid = CPUCLOCK_PID(clock); struct pid *pid; if (CPUCLOCK_WHICH(clock) >= CPUCLOCK_MAX) return NULL; /* * If the encoded PID is 0, then the timer is targeted at current * or the process to which current belongs. */ if (upid == 0) return thread ? task_pid(current) : task_tgid(current); pid = find_vpid(upid); if (!pid) return NULL; if (thread) { struct task_struct *tsk = pid_task(pid, PIDTYPE_PID); return (tsk && same_thread_group(tsk, current)) ? pid : NULL; } /* * For clock_gettime(PROCESS) allow finding the process * with the pid of the current task. The code needs the tgid * of the process so that pid_task(pid, PIDTYPE_TGID) can be * used to find the process. */ if (gettime && (pid == task_pid(current))) return task_tgid(current); /* * For processes, require that pid identifies a process. */ return pid_has_task(pid, PIDTYPE_TGID) ? pid : NULL; } static inline int validate_clock_permissions(const clockid_t clock) { int ret; rcu_read_lock(); ret = pid_for_clock(clock, false) ?
0 : -EINVAL; rcu_read_unlock(); return ret; } static inline enum pid_type clock_pid_type(const clockid_t clock) { return CPUCLOCK_PERTHREAD(clock) ? PIDTYPE_PID : PIDTYPE_TGID; } static inline struct task_struct *cpu_timer_task_rcu(struct k_itimer *timer) { return pid_task(timer->it.cpu.pid, clock_pid_type(timer->it_clock)); } /* * Update expiry time from increment, and increase overrun count, * given the current clock sample. */ static u64 bump_cpu_timer(struct k_itimer *timer, u64 now) { u64 delta, incr, expires = timer->it.cpu.node.expires; int i; if (!timer->it_interval) return expires; if (now < expires) return expires; incr = timer->it_interval; delta = now + incr - expires; /* Don't use (incr*2 < delta), incr*2 might overflow. */ for (i = 0; incr < delta - incr; i++) incr = incr << 1; for (; i >= 0; incr >>= 1, i--) { if (delta < incr) continue; timer->it.cpu.node.expires += incr; timer->it_overrun += 1LL << i; delta -= incr; } return timer->it.cpu.node.expires; } /* Check whether all cache entries contain U64_MAX, i.e. eternal expiry time */ static inline bool expiry_cache_is_inactive(const struct posix_cputimers *pct) { return !(~pct->bases[CPUCLOCK_PROF].nextevt | ~pct->bases[CPUCLOCK_VIRT].nextevt | ~pct->bases[CPUCLOCK_SCHED].nextevt); } static int posix_cpu_clock_getres(const clockid_t which_clock, struct timespec64 *tp) { int error = validate_clock_permissions(which_clock); if (!error) { tp->tv_sec = 0; tp->tv_nsec = ((NSEC_PER_SEC + HZ - 1) / HZ); if (CPUCLOCK_WHICH(which_clock) == CPUCLOCK_SCHED) { /* * If sched_clock is using a cycle counter, we * don't have any idea of its true resolution * exported, but it is much more than 1s/HZ. */ tp->tv_nsec = 1; } } return error; } static int posix_cpu_clock_set(const clockid_t clock, const struct timespec64 *tp) { int error = validate_clock_permissions(clock); /* * You can never reset a CPU clock, but we check for other errors * in the call before failing with EPERM. */ return error ? : -EPERM; } /* * Sample a per-thread clock for the given task. clkid is validated. */ static u64 cpu_clock_sample(const clockid_t clkid, struct task_struct *p) { u64 utime, stime; if (clkid == CPUCLOCK_SCHED) return task_sched_runtime(p); task_cputime(p, &utime, &stime); switch (clkid) { case CPUCLOCK_PROF: return utime + stime; case CPUCLOCK_VIRT: return utime; default: WARN_ON_ONCE(1); } return 0; } static inline void store_samples(u64 *samples, u64 stime, u64 utime, u64 rtime) { samples[CPUCLOCK_PROF] = stime + utime; samples[CPUCLOCK_VIRT] = utime; samples[CPUCLOCK_SCHED] = rtime; } static void task_sample_cputime(struct task_struct *p, u64 *samples) { u64 stime, utime; task_cputime(p, &utime, &stime); store_samples(samples, stime, utime, p->se.sum_exec_runtime); } static void proc_sample_cputime_atomic(struct task_cputime_atomic *at, u64 *samples) { u64 stime, utime, rtime; utime = atomic64_read(&at->utime); stime = atomic64_read(&at->stime); rtime = atomic64_read(&at->sum_exec_runtime); store_samples(samples, stime, utime, rtime); } /* * Set cputime to sum_cputime if sum_cputime > cputime. Use cmpxchg * to avoid race conditions with concurrent updates to cputime. 
*/ static inline void __update_gt_cputime(atomic64_t *cputime, u64 sum_cputime) { u64 curr_cputime = atomic64_read(cputime); do { if (sum_cputime <= curr_cputime) return; } while (!atomic64_try_cmpxchg(cputime, &curr_cputime, sum_cputime)); } static void update_gt_cputime(struct task_cputime_atomic *cputime_atomic, struct task_cputime *sum) { __update_gt_cputime(&cputime_atomic->utime, sum->utime); __update_gt_cputime(&cputime_atomic->stime, sum->stime); __update_gt_cputime(&cputime_atomic->sum_exec_runtime, sum->sum_exec_runtime); } /** * thread_group_sample_cputime - Sample cputime for a given task * @tsk: Task for which cputime needs to be started * @samples: Storage for time samples * * Called from sys_getitimer() to calculate the expiry time of an active * timer. That means group cputime accounting is already active. Called * with task sighand lock held. * * Updates @times with an uptodate sample of the thread group cputimes. */ void thread_group_sample_cputime(struct task_struct *tsk, u64 *samples) { struct thread_group_cputimer *cputimer = &tsk->signal->cputimer; struct posix_cputimers *pct = &tsk->signal->posix_cputimers; WARN_ON_ONCE(!pct->timers_active); proc_sample_cputime_atomic(&cputimer->cputime_atomic, samples); } /** * thread_group_start_cputime - Start cputime and return a sample * @tsk: Task for which cputime needs to be started * @samples: Storage for time samples * * The thread group cputime accounting is avoided when there are no posix * CPU timers armed. Before starting a timer it's required to check whether * the time accounting is active. If not, a full update of the atomic * accounting store needs to be done and the accounting enabled. * * Updates @times with an uptodate sample of the thread group cputimes. */ static void thread_group_start_cputime(struct task_struct *tsk, u64 *samples) { struct thread_group_cputimer *cputimer = &tsk->signal->cputimer; struct posix_cputimers *pct = &tsk->signal->posix_cputimers; lockdep_assert_task_sighand_held(tsk); /* Check if cputimer isn't running. This is accessed without locking. */ if (!READ_ONCE(pct->timers_active)) { struct task_cputime sum; /* * The POSIX timer interface allows for absolute time expiry * values through the TIMER_ABSTIME flag, therefore we have * to synchronize the timer to the clock every time we start it. */ thread_group_cputime(tsk, &sum); update_gt_cputime(&cputimer->cputime_atomic, &sum); /* * We're setting timers_active without a lock. Ensure this * only gets written to in one operation. We set it after * update_gt_cputime() as a small optimization, but * barriers are not required because update_gt_cputime() * can handle concurrent updates. */ WRITE_ONCE(pct->timers_active, true); } proc_sample_cputime_atomic(&cputimer->cputime_atomic, samples); } static void __thread_group_cputime(struct task_struct *tsk, u64 *samples) { struct task_cputime ct; thread_group_cputime(tsk, &ct); store_samples(samples, ct.stime, ct.utime, ct.sum_exec_runtime); } /* * Sample a process (thread group) clock for the given task clkid. If the * group's cputime accounting is already enabled, read the atomic * store. Otherwise a full update is required. clkid is already validated. 
*/ static u64 cpu_clock_sample_group(const clockid_t clkid, struct task_struct *p, bool start) { struct thread_group_cputimer *cputimer = &p->signal->cputimer; struct posix_cputimers *pct = &p->signal->posix_cputimers; u64 samples[CPUCLOCK_MAX]; if (!READ_ONCE(pct->timers_active)) { if (start) thread_group_start_cputime(p, samples); else __thread_group_cputime(p, samples); } else { proc_sample_cputime_atomic(&cputimer->cputime_atomic, samples); } return samples[clkid]; } static int posix_cpu_clock_get(const clockid_t clock, struct timespec64 *tp) { const clockid_t clkid = CPUCLOCK_WHICH(clock); struct task_struct *tsk; u64 t; rcu_read_lock(); tsk = pid_task(pid_for_clock(clock, true), clock_pid_type(clock)); if (!tsk) { rcu_read_unlock(); return -EINVAL; } if (CPUCLOCK_PERTHREAD(clock)) t = cpu_clock_sample(clkid, tsk); else t = cpu_clock_sample_group(clkid, tsk, false); rcu_read_unlock(); *tp = ns_to_timespec64(t); return 0; } /* * Validate the clockid_t for a new CPU-clock timer, and initialize the timer. * This is called from sys_timer_create() and do_cpu_nanosleep() with the * new timer already all-zeros initialized. */ static int posix_cpu_timer_create(struct k_itimer *new_timer) { static struct lock_class_key posix_cpu_timers_key; struct pid *pid; rcu_read_lock(); pid = pid_for_clock(new_timer->it_clock, false); if (!pid) { rcu_read_unlock(); return -EINVAL; } /* * If posix timer expiry is handled in task work context then * timer::it_lock can be taken without disabling interrupts as all * other locking happens in task context. This requires a separate * lock class key otherwise regular posix timer expiry would record * the lock class being taken in interrupt context and generate a * false positive warning. */ if (IS_ENABLED(CONFIG_POSIX_CPU_TIMERS_TASK_WORK)) lockdep_set_class(&new_timer->it_lock, &posix_cpu_timers_key); new_timer->kclock = &clock_posix_cpu; timerqueue_init(&new_timer->it.cpu.node); new_timer->it.cpu.pid = get_pid(pid); rcu_read_unlock(); return 0; } static struct posix_cputimer_base *timer_base(struct k_itimer *timer, struct task_struct *tsk) { int clkidx = CPUCLOCK_WHICH(timer->it_clock); if (CPUCLOCK_PERTHREAD(timer->it_clock)) return tsk->posix_cputimers.bases + clkidx; else return tsk->signal->posix_cputimers.bases + clkidx; } /* * Force recalculating the base earliest expiration on the next tick. * This will also re-evaluate the need to keep around the process wide * cputime counter and tick dependency and eventually shut these down * if necessary. */ static void trigger_base_recalc_expires(struct k_itimer *timer, struct task_struct *tsk) { struct posix_cputimer_base *base = timer_base(timer, tsk); base->nextevt = 0; } /* * Dequeue the timer and reset the base if it was its earliest expiration. * It makes sure the next tick recalculates the base next expiration so we * don't keep the costly process wide cputime counter around for a random * amount of time, along with the tick dependency. * * If another timer gets queued between this and the next tick, its * expiration will update the base next event if necessary on the next * tick. */ static void disarm_timer(struct k_itimer *timer, struct task_struct *p) { struct cpu_timer *ctmr = &timer->it.cpu; struct posix_cputimer_base *base; if (!cpu_timer_dequeue(ctmr)) return; base = timer_base(timer, p); if (cpu_timer_getexpires(ctmr) == base->nextevt) trigger_base_recalc_expires(timer, p); } /* * Clean up a CPU-clock timer that is about to be destroyed. 
* This is called from timer deletion with the timer already locked. * If we return TIMER_RETRY, it's necessary to release the timer's lock * and try again. (This happens when the timer is in the middle of firing.) */ static int posix_cpu_timer_del(struct k_itimer *timer) { struct cpu_timer *ctmr = &timer->it.cpu; struct sighand_struct *sighand; struct task_struct *p; unsigned long flags; int ret = 0; rcu_read_lock(); p = cpu_timer_task_rcu(timer); if (!p) goto out; /* * Protect against sighand release/switch in exit/exec and process/ * thread timer list entry concurrent read/writes. */ sighand = lock_task_sighand(p, &flags); if (unlikely(sighand == NULL)) { /* * This raced with the reaping of the task. The exit cleanup * should have removed this timer from the timer queue. */ WARN_ON_ONCE(ctmr->head || timerqueue_node_queued(&ctmr->node)); } else { if (timer->it.cpu.firing) { /* * Prevent signal delivery. The timer cannot be dequeued * because it is on the firing list which is not protected * by sighand->lock. The delivery path is waiting for * the timer lock. So go back, unlock and retry. */ timer->it.cpu.firing = false; ret = TIMER_RETRY; } else { disarm_timer(timer, p); } unlock_task_sighand(p, &flags); } out: rcu_read_unlock(); if (!ret) { put_pid(ctmr->pid); timer->it_status = POSIX_TIMER_DISARMED; } return ret; } static void cleanup_timerqueue(struct timerqueue_head *head) { struct timerqueue_node *node; struct cpu_timer *ctmr; while ((node = timerqueue_getnext(head))) { timerqueue_del(head, node); ctmr = container_of(node, struct cpu_timer, node); ctmr->head = NULL; } } /* * Clean out CPU timers which are still armed when a thread exits. The * timers are only removed from the list. No other updates are done. The * corresponding posix timers are still accessible, but cannot be rearmed. * * This must be called with the siglock held. */ static void cleanup_timers(struct posix_cputimers *pct) { cleanup_timerqueue(&pct->bases[CPUCLOCK_PROF].tqhead); cleanup_timerqueue(&pct->bases[CPUCLOCK_VIRT].tqhead); cleanup_timerqueue(&pct->bases[CPUCLOCK_SCHED].tqhead); } /* * These are both called with the siglock held, when the current thread * is being reaped. When the final (leader) thread in the group is reaped, * posix_cpu_timers_exit_group will be called after posix_cpu_timers_exit. */ void posix_cpu_timers_exit(struct task_struct *tsk) { cleanup_timers(&tsk->posix_cputimers); } void posix_cpu_timers_exit_group(struct task_struct *tsk) { cleanup_timers(&tsk->signal->posix_cputimers); } /* * Insert the timer on the appropriate list before any timers that * expire later. This must be called with the sighand lock held. */ static void arm_timer(struct k_itimer *timer, struct task_struct *p) { struct posix_cputimer_base *base = timer_base(timer, p); struct cpu_timer *ctmr = &timer->it.cpu; u64 newexp = cpu_timer_getexpires(ctmr); timer->it_status = POSIX_TIMER_ARMED; if (!cpu_timer_enqueue(&base->tqhead, ctmr)) return; /* * We are the new earliest-expiring POSIX 1.b timer, hence * need to update expiration cache. Take into account that * for process timers we share expiration cache with itimers * and RLIMIT_CPU and for thread timers with RLIMIT_RTTIME. */ if (newexp < base->nextevt) base->nextevt = newexp; if (CPUCLOCK_PERTHREAD(timer->it_clock)) tick_dep_set_task(p, TICK_DEP_BIT_POSIX_TIMER); else tick_dep_set_signal(p, TICK_DEP_BIT_POSIX_TIMER); } /* * The timer is locked, fire it and arrange for its reload. 
*/ static void cpu_timer_fire(struct k_itimer *timer) { struct cpu_timer *ctmr = &timer->it.cpu; timer->it_status = POSIX_TIMER_DISARMED; if (unlikely(ctmr->nanosleep)) { /* * This a special case for clock_nanosleep, * not a normal timer from sys_timer_create. */ wake_up_process(timer->it_process); cpu_timer_setexpires(ctmr, 0); } else { posix_timer_queue_signal(timer); /* Disable oneshot timers */ if (!timer->it_interval) cpu_timer_setexpires(ctmr, 0); } } static void __posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp, u64 now); /* * Guts of sys_timer_settime for CPU timers. * This is called with the timer locked and interrupts disabled. * If we return TIMER_RETRY, it's necessary to release the timer's lock * and try again. (This happens when the timer is in the middle of firing.) */ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags, struct itimerspec64 *new, struct itimerspec64 *old) { bool sigev_none = timer->it_sigev_notify == SIGEV_NONE; clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock); struct cpu_timer *ctmr = &timer->it.cpu; u64 old_expires, new_expires, now; struct sighand_struct *sighand; struct task_struct *p; unsigned long flags; int ret = 0; rcu_read_lock(); p = cpu_timer_task_rcu(timer); if (!p) { /* * If p has just been reaped, we can no * longer get any information about it at all. */ rcu_read_unlock(); return -ESRCH; } /* * Use the to_ktime conversion because that clamps the maximum * value to KTIME_MAX and avoid multiplication overflows. */ new_expires = ktime_to_ns(timespec64_to_ktime(new->it_value)); /* * Protect against sighand release/switch in exit/exec and p->cpu_timers * and p->signal->cpu_timers read/write in arm_timer() */ sighand = lock_task_sighand(p, &flags); /* * If p has just been reaped, we can no * longer get any information about it at all. */ if (unlikely(sighand == NULL)) { rcu_read_unlock(); return -ESRCH; } /* Retrieve the current expiry time before disarming the timer */ old_expires = cpu_timer_getexpires(ctmr); if (unlikely(timer->it.cpu.firing)) { /* * Prevent signal delivery. The timer cannot be dequeued * because it is on the firing list which is not protected * by sighand->lock. The delivery path is waiting for * the timer lock. So go back, unlock and retry. */ timer->it.cpu.firing = false; ret = TIMER_RETRY; } else { cpu_timer_dequeue(ctmr); timer->it_status = POSIX_TIMER_DISARMED; } /* * Sample the current clock for saving the previous setting * and for rearming the timer. */ if (CPUCLOCK_PERTHREAD(timer->it_clock)) now = cpu_clock_sample(clkid, p); else now = cpu_clock_sample_group(clkid, p, !sigev_none); /* Retrieve the previous expiry value if requested. */ if (old) { old->it_value = (struct timespec64){ }; if (old_expires) __posix_cpu_timer_get(timer, old, now); } /* Retry if the timer expiry is running concurrently */ if (unlikely(ret)) { unlock_task_sighand(p, &flags); goto out; } /* Convert relative expiry time to absolute */ if (new_expires && !(timer_flags & TIMER_ABSTIME)) new_expires += now; /* Set the new expiry time (might be 0) */ cpu_timer_setexpires(ctmr, new_expires); /* * Arm the timer if it is not disabled, the new expiry value has * not yet expired and the timer requires signal delivery. * SIGEV_NONE timers are never armed. In case the timer is not * armed, enforce the reevaluation of the timer base so that the * process wide cputime counter can be disabled eventually. 
*/ if (likely(!sigev_none)) { if (new_expires && now < new_expires) arm_timer(timer, p); else trigger_base_recalc_expires(timer, p); } unlock_task_sighand(p, &flags); posix_timer_set_common(timer, new); /* * If the new expiry time was already in the past the timer was not * queued. Fire it immediately even if the thread never runs to * accumulate more time on this clock. */ if (!sigev_none && new_expires && now >= new_expires) cpu_timer_fire(timer); out: rcu_read_unlock(); return ret; } static void __posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp, u64 now) { bool sigev_none = timer->it_sigev_notify == SIGEV_NONE; u64 expires, iv = timer->it_interval; /* * Make sure that interval timers are moved forward for the * following cases: * - SIGEV_NONE timers which are never armed * - Timers which expired, but the signal has not yet been * delivered */ if (iv && timer->it_status != POSIX_TIMER_ARMED) expires = bump_cpu_timer(timer, now); else expires = cpu_timer_getexpires(&timer->it.cpu); /* * Expired interval timers cannot have a remaining time <= 0. * The kernel has to move them forward so that the next * timer expiry is > @now. */ if (now < expires) { itp->it_value = ns_to_timespec64(expires - now); } else { /* * A single shot SIGEV_NONE timer must return 0, when it is * expired! Timers which have a real signal delivery mode * must return a remaining time greater than 0 because the * signal has not yet been delivered. */ if (!sigev_none) itp->it_value.tv_nsec = 1; } } static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp) { clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock); struct task_struct *p; u64 now; rcu_read_lock(); p = cpu_timer_task_rcu(timer); if (p && cpu_timer_getexpires(&timer->it.cpu)) { itp->it_interval = ktime_to_timespec64(timer->it_interval); if (CPUCLOCK_PERTHREAD(timer->it_clock)) now = cpu_clock_sample(clkid, p); else now = cpu_clock_sample_group(clkid, p, false); __posix_cpu_timer_get(timer, itp, now); } rcu_read_unlock(); } #define MAX_COLLECTED 20 static u64 collect_timerqueue(struct timerqueue_head *head, struct list_head *firing, u64 now) { struct timerqueue_node *next; int i = 0; while ((next = timerqueue_getnext(head))) { struct cpu_timer *ctmr; u64 expires; ctmr = container_of(next, struct cpu_timer, node); expires = cpu_timer_getexpires(ctmr); /* Limit the number of timers to expire at once */ if (++i == MAX_COLLECTED || now < expires) return expires; ctmr->firing = true; /* See posix_cpu_timer_wait_running() */ rcu_assign_pointer(ctmr->handling, current); cpu_timer_dequeue(ctmr); list_add_tail(&ctmr->elist, firing); } return U64_MAX; } static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples, struct list_head *firing) { struct posix_cputimer_base *base = pct->bases; int i; for (i = 0; i < CPUCLOCK_MAX; i++, base++) { base->nextevt = collect_timerqueue(&base->tqhead, firing, samples[i]); } } static inline void check_dl_overrun(struct task_struct *tsk) { if (tsk->dl.dl_overrun) { tsk->dl.dl_overrun = 0; send_signal_locked(SIGXCPU, SEND_SIG_PRIV, tsk, PIDTYPE_TGID); } } static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard) { if (time < limit) return false; if (print_fatal_signals) { pr_info("%s Watchdog Timeout (%s): %s[%d]\n", rt ? "RT" : "CPU", hard ? 
"hard" : "soft", current->comm, task_pid_nr(current)); } send_signal_locked(signo, SEND_SIG_PRIV, current, PIDTYPE_TGID); return true; } /* * Check for any per-thread CPU timers that have fired and move them off * the tsk->cpu_timers[N] list onto the firing list. Here we update the * tsk->it_*_expires values to reflect the remaining thread CPU timers. */ static void check_thread_timers(struct task_struct *tsk, struct list_head *firing) { struct posix_cputimers *pct = &tsk->posix_cputimers; u64 samples[CPUCLOCK_MAX]; unsigned long soft; if (dl_task(tsk)) check_dl_overrun(tsk); if (expiry_cache_is_inactive(pct)) return; task_sample_cputime(tsk, samples); collect_posix_cputimers(pct, samples, firing); /* * Check for the special case thread timers. */ soft = task_rlimit(tsk, RLIMIT_RTTIME); if (soft != RLIM_INFINITY) { /* Task RT timeout is accounted in jiffies. RTTIME is usec */ unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ); unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME); /* At the hard limit, send SIGKILL. No further action. */ if (hard != RLIM_INFINITY && check_rlimit(rttime, hard, SIGKILL, true, true)) return; /* At the soft limit, send a SIGXCPU every second */ if (check_rlimit(rttime, soft, SIGXCPU, true, false)) { soft += USEC_PER_SEC; tsk->signal->rlim[RLIMIT_RTTIME].rlim_cur = soft; } } if (expiry_cache_is_inactive(pct)) tick_dep_clear_task(tsk, TICK_DEP_BIT_POSIX_TIMER); } static inline void stop_process_timers(struct signal_struct *sig) { struct posix_cputimers *pct = &sig->posix_cputimers; /* Turn off the active flag. This is done without locking. */ WRITE_ONCE(pct->timers_active, false); tick_dep_clear_signal(sig, TICK_DEP_BIT_POSIX_TIMER); } static void check_cpu_itimer(struct task_struct *tsk, struct cpu_itimer *it, u64 *expires, u64 cur_time, int signo) { if (!it->expires) return; if (cur_time >= it->expires) { if (it->incr) it->expires += it->incr; else it->expires = 0; trace_itimer_expire(signo == SIGPROF ? ITIMER_PROF : ITIMER_VIRTUAL, task_tgid(tsk), cur_time); send_signal_locked(signo, SEND_SIG_PRIV, tsk, PIDTYPE_TGID); } if (it->expires && it->expires < *expires) *expires = it->expires; } /* * Check for any per-thread CPU timers that have fired and move them * off the tsk->*_timers list onto the firing list. Per-thread timers * have already been taken off. */ static void check_process_timers(struct task_struct *tsk, struct list_head *firing) { struct signal_struct *const sig = tsk->signal; struct posix_cputimers *pct = &sig->posix_cputimers; u64 samples[CPUCLOCK_MAX]; unsigned long soft; /* * If there are no active process wide timers (POSIX 1.b, itimers, * RLIMIT_CPU) nothing to check. Also skip the process wide timer * processing when there is already another task handling them. */ if (!READ_ONCE(pct->timers_active) || pct->expiry_active) return; /* * Signify that a thread is checking for process timers. * Write access to this field is protected by the sighand lock. */ pct->expiry_active = true; /* * Collect the current process totals. Group accounting is active * so the sample can be taken directly. */ proc_sample_cputime_atomic(&sig->cputimer.cputime_atomic, samples); collect_posix_cputimers(pct, samples, firing); /* * Check for the special case process timers. 
*/ check_cpu_itimer(tsk, &sig->it[CPUCLOCK_PROF], &pct->bases[CPUCLOCK_PROF].nextevt, samples[CPUCLOCK_PROF], SIGPROF); check_cpu_itimer(tsk, &sig->it[CPUCLOCK_VIRT], &pct->bases[CPUCLOCK_VIRT].nextevt, samples[CPUCLOCK_VIRT], SIGVTALRM); soft = task_rlimit(tsk, RLIMIT_CPU); if (soft != RLIM_INFINITY) { /* RLIMIT_CPU is in seconds. Samples are nanoseconds */ unsigned long hard = task_rlimit_max(tsk, RLIMIT_CPU); u64 ptime = samples[CPUCLOCK_PROF]; u64 softns = (u64)soft * NSEC_PER_SEC; u64 hardns = (u64)hard * NSEC_PER_SEC; /* At the hard limit, send SIGKILL. No further action. */ if (hard != RLIM_INFINITY && check_rlimit(ptime, hardns, SIGKILL, false, true)) return; /* At the soft limit, send a SIGXCPU every second */ if (check_rlimit(ptime, softns, SIGXCPU, false, false)) { sig->rlim[RLIMIT_CPU].rlim_cur = soft + 1; softns += NSEC_PER_SEC; } /* Update the expiry cache */ if (softns < pct->bases[CPUCLOCK_PROF].nextevt) pct->bases[CPUCLOCK_PROF].nextevt = softns; } if (expiry_cache_is_inactive(pct)) stop_process_timers(sig); pct->expiry_active = false; } /* * This is called from the signal code (via posixtimer_rearm) * when the last timer signal was delivered and we have to reload the timer. */ static void posix_cpu_timer_rearm(struct k_itimer *timer) { clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock); struct task_struct *p; struct sighand_struct *sighand; unsigned long flags; u64 now; rcu_read_lock(); p = cpu_timer_task_rcu(timer); if (!p) goto out; /* Protect timer list r/w in arm_timer() */ sighand = lock_task_sighand(p, &flags); if (unlikely(sighand == NULL)) goto out; /* * Fetch the current sample and update the timer's expiry time. */ if (CPUCLOCK_PERTHREAD(timer->it_clock)) now = cpu_clock_sample(clkid, p); else now = cpu_clock_sample_group(clkid, p, true); bump_cpu_timer(timer, now); /* * Now re-arm for the new expiry time. */ arm_timer(timer, p); unlock_task_sighand(p, &flags); out: rcu_read_unlock(); } /** * task_cputimers_expired - Check whether posix CPU timers are expired * * @samples: Array of current samples for the CPUCLOCK clocks * @pct: Pointer to a posix_cputimers container * * Returns true if any member of @samples is greater than the corresponding * member of @pct->bases[CLK].nextevt. False otherwise */ static inline bool task_cputimers_expired(const u64 *samples, struct posix_cputimers *pct) { int i; for (i = 0; i < CPUCLOCK_MAX; i++) { if (samples[i] >= pct->bases[i].nextevt) return true; } return false; } /** * fastpath_timer_check - POSIX CPU timers fast path. * * @tsk: The task (thread) being checked. * * Check the task and thread group timers. If both are zero (there are no * timers set) return false. Otherwise snapshot the task and thread group * timers and compare them with the corresponding expiration times. Return * true if a timer has expired, else return false. */ static inline bool fastpath_timer_check(struct task_struct *tsk) { struct posix_cputimers *pct = &tsk->posix_cputimers; struct signal_struct *sig; if (!expiry_cache_is_inactive(pct)) { u64 samples[CPUCLOCK_MAX]; task_sample_cputime(tsk, samples); if (task_cputimers_expired(samples, pct)) return true; } sig = tsk->signal; pct = &sig->posix_cputimers; /* * Check if thread group timers expired when timers are active and * no other thread in the group is already handling expiry for * thread group cputimers. These fields are read without the * sighand lock. 
However, this is fine because this is meant to be * a fastpath heuristic to determine whether we should try to * acquire the sighand lock to handle timer expiry. * * In the worst case scenario, if concurrently timers_active is set * or expiry_active is cleared, but the current thread doesn't see * the change yet, the timer checks are delayed until the next * thread in the group gets a scheduler interrupt to handle the * timer. This isn't an issue in practice because these types of * delays with signals actually getting sent are expected. */ if (READ_ONCE(pct->timers_active) && !READ_ONCE(pct->expiry_active)) { u64 samples[CPUCLOCK_MAX]; proc_sample_cputime_atomic(&sig->cputimer.cputime_atomic, samples); if (task_cputimers_expired(samples, pct)) return true; } if (dl_task(tsk) && tsk->dl.dl_overrun) return true; return false; } static void handle_posix_cpu_timers(struct task_struct *tsk); #ifdef CONFIG_POSIX_CPU_TIMERS_TASK_WORK static void posix_cpu_timers_work(struct callback_head *work) { struct posix_cputimers_work *cw = container_of(work, typeof(*cw), work); mutex_lock(&cw->mutex); handle_posix_cpu_timers(current); mutex_unlock(&cw->mutex); } /* * Invoked from the posix-timer core when a cancel operation failed because * the timer is marked firing. The caller holds rcu_read_lock(), which * protects the timer and the task which is expiring it from being freed. */ static void posix_cpu_timer_wait_running(struct k_itimer *timr) { struct task_struct *tsk = rcu_dereference(timr->it.cpu.handling); /* Has the handling task completed expiry already? */ if (!tsk) return; /* Ensure that the task cannot go away */ get_task_struct(tsk); /* Now drop the RCU protection so the mutex can be locked */ rcu_read_unlock(); /* Wait on the expiry mutex */ mutex_lock(&tsk->posix_cputimers_work.mutex); /* Release it immediately again. */ mutex_unlock(&tsk->posix_cputimers_work.mutex); /* Drop the task reference. */ put_task_struct(tsk); /* Relock RCU so the callsite is balanced */ rcu_read_lock(); } static void posix_cpu_timer_wait_running_nsleep(struct k_itimer *timr) { /* Ensure that timr->it.cpu.handling task cannot go away */ rcu_read_lock(); spin_unlock_irq(&timr->it_lock); posix_cpu_timer_wait_running(timr); rcu_read_unlock(); /* @timr is on stack and is valid */ spin_lock_irq(&timr->it_lock); } /* * Clear existing posix CPU timers task work. */ void clear_posix_cputimers_work(struct task_struct *p) { /* * A copied work entry from the old task is not meaningful, clear it. * N.B. init_task_work will not do this. */ memset(&p->posix_cputimers_work.work, 0, sizeof(p->posix_cputimers_work.work)); init_task_work(&p->posix_cputimers_work.work, posix_cpu_timers_work); mutex_init(&p->posix_cputimers_work.mutex); p->posix_cputimers_work.scheduled = false; } /* * Initialize posix CPU timers task work in init task. Out of line to * keep the callback static and to avoid header recursion hell. */ void __init posix_cputimers_init_work(void) { clear_posix_cputimers_work(current); } /* * Note: All operations on tsk->posix_cputimer_work.scheduled happen either * in hard interrupt context or in task context with interrupts * disabled. Aside of that the writer/reader interaction is always in the * context of the current task, which means they are strict per CPU. 
*/ static inline bool posix_cpu_timers_work_scheduled(struct task_struct *tsk) { return tsk->posix_cputimers_work.scheduled; } static inline void __run_posix_cpu_timers(struct task_struct *tsk) { if (WARN_ON_ONCE(tsk->posix_cputimers_work.scheduled)) return; /* Schedule task work to actually expire the timers */ tsk->posix_cputimers_work.scheduled = true; task_work_add(tsk, &tsk->posix_cputimers_work.work, TWA_RESUME); } static inline bool posix_cpu_timers_enable_work(struct task_struct *tsk, unsigned long start) { bool ret = true; /* * On !RT kernels interrupts are disabled while collecting expired * timers, so no tick can happen and the fast path check can be * reenabled without further checks. */ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) { tsk->posix_cputimers_work.scheduled = false; return true; } /* * On RT enabled kernels ticks can happen while the expired timers * are collected under sighand lock. But any tick which observes * the CPUTIMERS_WORK_SCHEDULED bit set, does not run the fastpath * checks. So reenabling the tick work has to be done carefully: * * Disable interrupts and run the fast path check if jiffies have * advanced since the collecting of expired timers started. If * jiffies have not advanced or the fast path check did not find * newly expired timers, reenable the fast path check in the timer * interrupt. If there are newly expired timers, return false and * let the collection loop repeat. */ local_irq_disable(); if (start != jiffies && fastpath_timer_check(tsk)) ret = false; else tsk->posix_cputimers_work.scheduled = false; local_irq_enable(); return ret; } #else /* CONFIG_POSIX_CPU_TIMERS_TASK_WORK */ static inline void __run_posix_cpu_timers(struct task_struct *tsk) { lockdep_posixtimer_enter(); handle_posix_cpu_timers(tsk); lockdep_posixtimer_exit(); } static void posix_cpu_timer_wait_running(struct k_itimer *timr) { cpu_relax(); } static void posix_cpu_timer_wait_running_nsleep(struct k_itimer *timr) { spin_unlock_irq(&timr->it_lock); cpu_relax(); spin_lock_irq(&timr->it_lock); } static inline bool posix_cpu_timers_work_scheduled(struct task_struct *tsk) { return false; } static inline bool posix_cpu_timers_enable_work(struct task_struct *tsk, unsigned long start) { return true; } #endif /* CONFIG_POSIX_CPU_TIMERS_TASK_WORK */ static void handle_posix_cpu_timers(struct task_struct *tsk) { struct k_itimer *timer, *next; unsigned long flags, start; LIST_HEAD(firing); if (!lock_task_sighand(tsk, &flags)) return; do { /* * On RT locking sighand lock does not disable interrupts, * so this needs to be careful vs. ticks. Store the current * jiffies value. */ start = READ_ONCE(jiffies); barrier(); /* * Here we take off tsk->signal->cpu_timers[N] and * tsk->cpu_timers[N] all the timers that are firing, and * put them on the firing list. */ check_thread_timers(tsk, &firing); check_process_timers(tsk, &firing); /* * The above timer checks have updated the expiry cache and * because nothing can have queued or modified timers after * sighand lock was taken above it is guaranteed to be * consistent. So the next timer interrupt fastpath check * will find valid data. * * If timer expiry runs in the timer interrupt context then * the loop is not relevant as timers will be directly * expired in interrupt context. The stub function below * always returns true which allows the compiler to * optimize the loop out.
* * If timer expiry is deferred to task work context then * the following rules apply: * * - On !RT kernels no tick can have happened on this CPU * after sighand lock was acquired because interrupts are * disabled. So reenabling task work before dropping * sighand lock and reenabling interrupts is race free. * * - On RT kernels ticks might have happened but the tick * work ignored posix CPU timer handling because the * CPUTIMERS_WORK_SCHEDULED bit is set. Reenabling work * must be done very carefully including a check whether * ticks have happened since the start of the timer * expiry checks. posix_cpu_timers_enable_work() takes * care of that and eventually lets the expiry checks * run again. */ } while (!posix_cpu_timers_enable_work(tsk, start)); /* * We must release sighand lock before taking any timer's lock. * There is a potential race with timer deletion here, as the * siglock now protects our private firing list. We have set * the firing flag in each timer, so that a deletion attempt * that gets the timer lock before we do will give it up and * spin until we've taken care of that timer below. */ unlock_task_sighand(tsk, &flags); /* * Now that all the timers on our list have the firing flag, * no one will touch their list entries but us. We'll take * each timer's lock before clearing its firing flag, so no * timer call will interfere. */ list_for_each_entry_safe(timer, next, &firing, it.cpu.elist) { bool cpu_firing; /* * spin_lock() is sufficient here even independent of the * expiry context. If expiry happens in hard interrupt * context it's obvious. For task work context it's safe * because all other operations on timer::it_lock happen in * task context (syscall or exit). */ spin_lock(&timer->it_lock); list_del_init(&timer->it.cpu.elist); cpu_firing = timer->it.cpu.firing; timer->it.cpu.firing = false; /* * If the firing flag is cleared then this raced with a * timer rearm/delete operation. So don't generate an * event. */ if (likely(cpu_firing)) cpu_timer_fire(timer); /* See posix_cpu_timer_wait_running() */ rcu_assign_pointer(timer->it.cpu.handling, NULL); spin_unlock(&timer->it_lock); } } /* * This is called from the timer interrupt handler. The irq handler has * already updated our counts. We need to check if any timers fire now. * Interrupts are disabled. */ void run_posix_cpu_timers(void) { struct task_struct *tsk = current; lockdep_assert_irqs_disabled(); /* * Ensure that release_task(tsk) can't happen while * handle_posix_cpu_timers() is running. Otherwise, a concurrent * posix_cpu_timer_del() may fail to lock_task_sighand(tsk) and * miss timer->it.cpu.firing != 0. */ if (tsk->exit_state) return; /* * If the actual expiry is deferred to task work context and the * work is already scheduled there is no point to do anything here. */ if (posix_cpu_timers_work_scheduled(tsk)) return; /* * The fast path checks that there are no expired thread or thread * group timers. If that's so, just return. */ if (!fastpath_timer_check(tsk)) return; __run_posix_cpu_timers(tsk); } /* * Set one of the process-wide special case CPU timers or RLIMIT_CPU. * The tsk->sighand->siglock must be held by the caller. */ void set_process_cpu_timer(struct task_struct *tsk, unsigned int clkid, u64 *newval, u64 *oldval) { u64 now, *nextevt; if (WARN_ON_ONCE(clkid >= CPUCLOCK_SCHED)) return; nextevt = &tsk->signal->posix_cputimers.bases[clkid].nextevt; now = cpu_clock_sample_group(clkid, tsk, true); if (oldval) { /* * We are setting itimer. 
The *oldval is absolute and we update * it to be relative, *newval argument is relative and we update * it to be absolute. */ if (*oldval) { if (*oldval <= now) { /* Just about to fire. */ *oldval = TICK_NSEC; } else { *oldval -= now; } } if (*newval) *newval += now; } /* * Update expiration cache if this is the earliest timer. CPUCLOCK_PROF * expiry cache is also used by RLIMIT_CPU! */ if (*newval < *nextevt) *nextevt = *newval; tick_dep_set_signal(tsk, TICK_DEP_BIT_POSIX_TIMER); } static int do_cpu_nanosleep(const clockid_t which_clock, int flags, const struct timespec64 *rqtp) { struct itimerspec64 it; struct k_itimer timer; u64 expires; int error; /* * Set up a temporary timer and then wait for it to go off. */ memset(&timer, 0, sizeof timer); spin_lock_init(&timer.it_lock); timer.it_clock = which_clock; timer.it_overrun = -1; error = posix_cpu_timer_create(&timer); timer.it_process = current; timer.it.cpu.nanosleep = true; if (!error) { static struct itimerspec64 zero_it; struct restart_block *restart; memset(&it, 0, sizeof(it)); it.it_value = *rqtp; spin_lock_irq(&timer.it_lock); error = posix_cpu_timer_set(&timer, flags, &it, NULL); if (error) { spin_unlock_irq(&timer.it_lock); return error; } while (!signal_pending(current)) { if (!cpu_timer_getexpires(&timer.it.cpu)) { /* * Our timer fired and was reset, below * deletion can not fail. */ posix_cpu_timer_del(&timer); spin_unlock_irq(&timer.it_lock); return 0; } /* * Block until cpu_timer_fire (or a signal) wakes us. */ __set_current_state(TASK_INTERRUPTIBLE); spin_unlock_irq(&timer.it_lock); schedule(); spin_lock_irq(&timer.it_lock); } /* * We were interrupted by a signal. */ expires = cpu_timer_getexpires(&timer.it.cpu); error = posix_cpu_timer_set(&timer, 0, &zero_it, &it); if (!error) { /* Timer is now unarmed, deletion can not fail. */ posix_cpu_timer_del(&timer); } else { while (error == TIMER_RETRY) { posix_cpu_timer_wait_running_nsleep(&timer); error = posix_cpu_timer_del(&timer); } } spin_unlock_irq(&timer.it_lock); if ((it.it_value.tv_sec | it.it_value.tv_nsec) == 0) { /* * It actually did fire already. */ return 0; } error = -ERESTART_RESTARTBLOCK; /* * Report back to the user the time still remaining. */ restart = &current->restart_block; restart->nanosleep.expires = expires; if (restart->nanosleep.type != TT_NONE) error = nanosleep_copyout(restart, &it.it_value); } return error; } static long posix_cpu_nsleep_restart(struct restart_block *restart_block); static int posix_cpu_nsleep(const clockid_t which_clock, int flags, const struct timespec64 *rqtp) { struct restart_block *restart_block = &current->restart_block; int error; /* * Diagnose required errors first.
*/ if (CPUCLOCK_PERTHREAD(which_clock) && (CPUCLOCK_PID(which_clock) == 0 || CPUCLOCK_PID(which_clock) == task_pid_vnr(current))) return -EINVAL; error = do_cpu_nanosleep(which_clock, flags, rqtp); if (error == -ERESTART_RESTARTBLOCK) { if (flags & TIMER_ABSTIME) return -ERESTARTNOHAND; restart_block->nanosleep.clockid = which_clock; set_restart_fn(restart_block, posix_cpu_nsleep_restart); } return error; } static long posix_cpu_nsleep_restart(struct restart_block *restart_block) { clockid_t which_clock = restart_block->nanosleep.clockid; struct timespec64 t; t = ns_to_timespec64(restart_block->nanosleep.expires); return do_cpu_nanosleep(which_clock, TIMER_ABSTIME, &t); } #define PROCESS_CLOCK make_process_cpuclock(0, CPUCLOCK_SCHED) #define THREAD_CLOCK make_thread_cpuclock(0, CPUCLOCK_SCHED) static int process_cpu_clock_getres(const clockid_t which_clock, struct timespec64 *tp) { return posix_cpu_clock_getres(PROCESS_CLOCK, tp); } static int process_cpu_clock_get(const clockid_t which_clock, struct timespec64 *tp) { return posix_cpu_clock_get(PROCESS_CLOCK, tp); } static int process_cpu_timer_create(struct k_itimer *timer) { timer->it_clock = PROCESS_CLOCK; return posix_cpu_timer_create(timer); } static int process_cpu_nsleep(const clockid_t which_clock, int flags, const struct timespec64 *rqtp) { return posix_cpu_nsleep(PROCESS_CLOCK, flags, rqtp); } static int thread_cpu_clock_getres(const clockid_t which_clock, struct timespec64 *tp) { return posix_cpu_clock_getres(THREAD_CLOCK, tp); } static int thread_cpu_clock_get(const clockid_t which_clock, struct timespec64 *tp) { return posix_cpu_clock_get(THREAD_CLOCK, tp); } static int thread_cpu_timer_create(struct k_itimer *timer) { timer->it_clock = THREAD_CLOCK; return posix_cpu_timer_create(timer); } const struct k_clock clock_posix_cpu = { .clock_getres = posix_cpu_clock_getres, .clock_set = posix_cpu_clock_set, .clock_get_timespec = posix_cpu_clock_get, .timer_create = posix_cpu_timer_create, .nsleep = posix_cpu_nsleep, .timer_set = posix_cpu_timer_set, .timer_del = posix_cpu_timer_del, .timer_get = posix_cpu_timer_get, .timer_rearm = posix_cpu_timer_rearm, .timer_wait_running = posix_cpu_timer_wait_running, }; const struct k_clock clock_process = { .clock_getres = process_cpu_clock_getres, .clock_get_timespec = process_cpu_clock_get, .timer_create = process_cpu_timer_create, .nsleep = process_cpu_nsleep, }; const struct k_clock clock_thread = { .clock_getres = thread_cpu_clock_getres, .clock_get_timespec = thread_cpu_clock_get, .timer_create = thread_cpu_timer_create, };
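The CPU-clock code above backs CLOCK_PROCESS_CPUTIME_ID / CLOCK_THREAD_CPUTIME_ID and the RLIMIT_CPU and ITIMER_PROF/ITIMER_VIRTUAL special cases handled by check_process_timers() and check_rlimit(). As a rough illustration of the RLIMIT_CPU path from userspace — a standalone sketch, not part of the kernel sources, with arbitrarily chosen limit values — the program below sets a 1 s soft / 3 s hard CPU-time limit and spins; past the soft limit the kernel is expected to deliver SIGXCPU about once per consumed CPU second, then SIGKILL once the hard limit is reached.

#include <signal.h>
#include <string.h>
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

static void on_sigxcpu(int sig)
{
	(void)sig;
	/* write() is async-signal-safe, printf() is not */
	write(STDOUT_FILENO, "SIGXCPU\n", 8);
}

int main(void)
{
	struct rlimit rl = { .rlim_cur = 1, .rlim_max = 3 };	/* seconds of CPU time (illustrative) */
	struct sigaction sa;
	struct timespec ts;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigxcpu;
	sigaction(SIGXCPU, &sa, NULL);
	setrlimit(RLIMIT_CPU, &rl);

	for (;;)	/* burn CPU until SIGKILL arrives at the hard limit */
		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
}

The exact cadence depends on accounting granularity, but each SIGXCPU corresponds to one pass through the soft-limit branch of check_rlimit(), which bumps rlim_cur by one second each time.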
// SPDX-License-Identifier: GPL-2.0-or-later /* * Digital Audio (PCM) abstract layer * Copyright (c) by Jaroslav Kysela <perex@perex.cz> */ #include <linux/time.h> #include <linux/gcd.h> #include <sound/core.h> #include <sound/pcm.h> #include <sound/timer.h> #include "pcm_local.h" /* * Timer functions */ void snd_pcm_timer_resolution_change(struct snd_pcm_substream *substream) { unsigned long rate, mult, fsize, l, post; struct snd_pcm_runtime *runtime = substream->runtime; mult = 1000000000; rate = runtime->rate; if (snd_BUG_ON(!rate)) return; l = gcd(mult, rate); mult /= l; rate /= l; fsize = runtime->period_size; if (snd_BUG_ON(!fsize)) return; l = gcd(rate, fsize); rate /= l; fsize /= l; post = 1; while ((mult * fsize) / fsize != mult) { mult /= 2; post *= 2; } if (rate == 0) { pcm_err(substream->pcm, "pcm timer resolution out of range (rate = %u, period_size = %lu)\n", runtime->rate, runtime->period_size); runtime->timer_resolution = -1; return; } runtime->timer_resolution = (mult * fsize / rate) * post; } static unsigned long snd_pcm_timer_resolution(struct snd_timer * timer) { struct snd_pcm_substream *substream; substream = timer->private_data; return substream->runtime ? substream->runtime->timer_resolution : 0; } static int snd_pcm_timer_start(struct snd_timer * timer) { struct snd_pcm_substream *substream; substream = snd_timer_chip(timer); substream->timer_running = 1; return 0; } static int snd_pcm_timer_stop(struct snd_timer * timer) { struct snd_pcm_substream *substream; substream = snd_timer_chip(timer); substream->timer_running = 0; return 0; } static const struct snd_timer_hardware snd_pcm_timer = { .flags = SNDRV_TIMER_HW_AUTO | SNDRV_TIMER_HW_SLAVE, .resolution = 0, .ticks = 1, .c_resolution = snd_pcm_timer_resolution, .start = snd_pcm_timer_start, .stop = snd_pcm_timer_stop, }; /* * Init functions */ static void snd_pcm_timer_free(struct snd_timer *timer) { struct snd_pcm_substream *substream = timer->private_data; substream->timer = NULL; } void snd_pcm_timer_init(struct snd_pcm_substream *substream) { struct snd_timer_id tid; struct snd_timer *timer; tid.dev_sclass = SNDRV_TIMER_SCLASS_NONE; tid.dev_class = SNDRV_TIMER_CLASS_PCM; tid.card = substream->pcm->card->number; tid.device = substream->pcm->device; tid.subdevice = (substream->number << 1) | (substream->stream & 1); if (snd_timer_new(substream->pcm->card, "PCM", &tid, &timer) < 0) return; sprintf(timer->name, "PCM %s %i-%i-%i", snd_pcm_direction_name(substream->stream), tid.card, tid.device, tid.subdevice); timer->hw = snd_pcm_timer; if (snd_device_register(timer->card, timer) < 0) { snd_device_free(timer->card, timer); return; } timer->private_data = substream; timer->private_free = snd_pcm_timer_free; substream->timer = timer; } void snd_pcm_timer_done(struct snd_pcm_substream *substream) { if (substream->timer) { snd_device_free(substream->pcm->card, substream->timer); substream->timer = NULL; } }
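snd_pcm_timer_resolution_change() above derives the PCM timer resolution as the length of one period in nanoseconds (1e9 * period_size / rate), first reducing the operands by their GCDs and then halving the multiplier while the intermediate product would overflow, restoring the lost factor through 'post'. The standalone userspace sketch below mirrors that arithmetic; the sample rate and period size are made-up example values, not taken from a real stream.

#include <stdio.h>

static unsigned long gcd_ul(unsigned long a, unsigned long b)
{
	while (b) {
		unsigned long t = a % b;

		a = b;
		b = t;
	}
	return a;
}

static unsigned long period_ns(unsigned long rate, unsigned long fsize)
{
	unsigned long mult = 1000000000UL, post = 1, l;

	l = gcd_ul(mult, rate);
	mult /= l;
	rate /= l;
	l = gcd_ul(rate, fsize);
	rate /= l;
	fsize /= l;
	/* same overflow guard as the kernel code above */
	while ((mult * fsize) / fsize != mult) {
		mult /= 2;
		post *= 2;
	}
	return (mult * fsize / rate) * post;
}

int main(void)
{
	/* 44.1 kHz, 1024 frames per period -> 23219954 ns, i.e. ~23.2 ms */
	printf("%lu ns per period\n", period_ns(44100, 1024));
	return 0;
}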
// SPDX-License-Identifier: GPL-2.0 #include <crypto/internal/hash.h> #include <linux/init.h> #include <linux/module.h> #include <linux/xxhash.h> #include <linux/unaligned.h> #define XXHASH64_BLOCK_SIZE 32 #define XXHASH64_DIGEST_SIZE 8 struct xxhash64_tfm_ctx { u64 seed; }; struct xxhash64_desc_ctx { struct xxh64_state xxhstate; }; static int xxhash64_setkey(struct crypto_shash *tfm, const u8 *key, unsigned int keylen) { struct xxhash64_tfm_ctx *tctx = crypto_shash_ctx(tfm); if (keylen != sizeof(tctx->seed)) return -EINVAL; tctx->seed = get_unaligned_le64(key); return 0; } static int xxhash64_init(struct shash_desc *desc) { struct xxhash64_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); struct xxhash64_desc_ctx *dctx = shash_desc_ctx(desc); xxh64_reset(&dctx->xxhstate, tctx->seed); return 0; } static int xxhash64_update(struct shash_desc *desc, const u8 *data, unsigned int length) { struct xxhash64_desc_ctx *dctx = shash_desc_ctx(desc); xxh64_update(&dctx->xxhstate, data, length); return 0; } static int xxhash64_final(struct shash_desc *desc, u8 *out) { struct xxhash64_desc_ctx *dctx = shash_desc_ctx(desc); put_unaligned_le64(xxh64_digest(&dctx->xxhstate), out); return 0; } static int xxhash64_digest(struct shash_desc *desc, const u8 *data, unsigned int length, u8 *out) { struct xxhash64_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); put_unaligned_le64(xxh64(data, length, tctx->seed), out); return 0; } static struct shash_alg alg = { .digestsize = XXHASH64_DIGEST_SIZE, .setkey = xxhash64_setkey, .init = xxhash64_init, .update = xxhash64_update, .final = xxhash64_final, .digest = xxhash64_digest, .descsize = sizeof(struct xxhash64_desc_ctx), .base = { .cra_name = "xxhash64", .cra_driver_name = "xxhash64-generic", .cra_priority = 100, .cra_flags = CRYPTO_ALG_OPTIONAL_KEY, .cra_blocksize = XXHASH64_BLOCK_SIZE, .cra_ctxsize = sizeof(struct xxhash64_tfm_ctx), .cra_module = THIS_MODULE, } }; static int __init xxhash_mod_init(void) { return crypto_register_shash(&alg); } static void __exit xxhash_mod_fini(void) { crypto_unregister_shash(&alg); } module_init(xxhash_mod_init); module_exit(xxhash_mod_fini); MODULE_AUTHOR("Nikolay Borisov <nborisov@suse.com>"); MODULE_DESCRIPTION("xxhash calculations wrapper for lib/xxhash.c"); MODULE_LICENSE("GPL"); MODULE_ALIAS_CRYPTO("xxhash64"); MODULE_ALIAS_CRYPTO("xxhash64-generic");
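The file above wraps lib/xxhash's 64-bit hash as a crypto shash, with an optional 8-byte little-endian seed supplied as the key. A hedged kernel-side sketch of how a caller might drive it is shown below; the helper name and the zero seed are assumptions for illustration only, and error handling is abbreviated.

#include <crypto/hash.h>
#include <linux/err.h>

static int xxhash64_digest_buf(const void *data, unsigned int len, u8 out[8])
{
	struct crypto_shash *tfm;
	u8 seed[8] = { 0 };	/* assumed seed value, purely illustrative */
	int err;

	tfm = crypto_alloc_shash("xxhash64", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_shash_setkey(tfm, seed, sizeof(seed));
	if (!err) {
		SHASH_DESC_ON_STACK(desc, tfm);

		desc->tfm = tfm;
		err = crypto_shash_digest(desc, data, len, out);
	}

	crypto_free_shash(tfm);
	return err;
}

Because the algorithm is registered with CRYPTO_ALG_OPTIONAL_KEY, the setkey step can be skipped entirely when no specific seed is required.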
// SPDX-License-Identifier: GPL-2.0 /* * * Copyright (C) 2019-2021 Paragon Software GmbH, All rights reserved. * * TODO: Merge attr_set_size/attr_data_get_block/attr_allocate_frame? */ #include <linux/fs.h> #include <linux/slab.h> #include <linux/kernel.h> #include "debug.h" #include "ntfs.h" #include "ntfs_fs.h" /* * You can set external NTFS_MIN_LOG2_OF_CLUMP/NTFS_MAX_LOG2_OF_CLUMP to manage * preallocate algorithm.
*/ #ifndef NTFS_MIN_LOG2_OF_CLUMP #define NTFS_MIN_LOG2_OF_CLUMP 16 #endif #ifndef NTFS_MAX_LOG2_OF_CLUMP #define NTFS_MAX_LOG2_OF_CLUMP 26 #endif // 16M #define NTFS_CLUMP_MIN (1 << (NTFS_MIN_LOG2_OF_CLUMP + 8)) // 16G #define NTFS_CLUMP_MAX (1ull << (NTFS_MAX_LOG2_OF_CLUMP + 8)) static inline u64 get_pre_allocated(u64 size) { u32 clump; u8 align_shift; u64 ret; if (size <= NTFS_CLUMP_MIN) { clump = 1 << NTFS_MIN_LOG2_OF_CLUMP; align_shift = NTFS_MIN_LOG2_OF_CLUMP; } else if (size >= NTFS_CLUMP_MAX) { clump = 1 << NTFS_MAX_LOG2_OF_CLUMP; align_shift = NTFS_MAX_LOG2_OF_CLUMP; } else { align_shift = NTFS_MIN_LOG2_OF_CLUMP - 1 + __ffs(size >> (8 + NTFS_MIN_LOG2_OF_CLUMP)); clump = 1u << align_shift; } ret = (((size + clump - 1) >> align_shift)) << align_shift; return ret; } /* * attr_load_runs - Load all runs stored in @attr. */ static int attr_load_runs(struct ATTRIB *attr, struct ntfs_inode *ni, struct runs_tree *run, const CLST *vcn) { int err; CLST svcn = le64_to_cpu(attr->nres.svcn); CLST evcn = le64_to_cpu(attr->nres.evcn); u32 asize; u16 run_off; if (svcn >= evcn + 1 || run_is_mapped_full(run, svcn, evcn)) return 0; if (vcn && (evcn < *vcn || *vcn < svcn)) return -EINVAL; asize = le32_to_cpu(attr->size); run_off = le16_to_cpu(attr->nres.run_off); if (run_off > asize) return -EINVAL; err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn, vcn ? *vcn : svcn, Add2Ptr(attr, run_off), asize - run_off); if (err < 0) return err; return 0; } /* * run_deallocate_ex - Deallocate clusters. */ static int run_deallocate_ex(struct ntfs_sb_info *sbi, struct runs_tree *run, CLST vcn, CLST len, CLST *done, bool trim) { int err = 0; CLST vcn_next, vcn0 = vcn, lcn, clen, dn = 0; size_t idx; if (!len) goto out; if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) { failed: run_truncate(run, vcn0); err = -EINVAL; goto out; } for (;;) { if (clen > len) clen = len; if (!clen) { err = -EINVAL; goto out; } if (lcn != SPARSE_LCN) { if (sbi) { /* mark bitmap range [lcn + clen) as free and trim clusters. */ mark_as_free_ex(sbi, lcn, clen, trim); } dn += clen; } len -= clen; if (!len) break; vcn_next = vcn + clen; if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) || vcn != vcn_next) { /* Save memory - don't load entire run. */ goto failed; } } out: if (done) *done += dn; return err; } /* * attr_allocate_clusters - Find free space, mark it as used and store in @run. */ int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run, CLST vcn, CLST lcn, CLST len, CLST *pre_alloc, enum ALLOCATE_OPT opt, CLST *alen, const size_t fr, CLST *new_lcn, CLST *new_len) { int err; CLST flen, vcn0 = vcn, pre = pre_alloc ? *pre_alloc : 0; size_t cnt = run->count; for (;;) { err = ntfs_look_for_free_space(sbi, lcn, len + pre, &lcn, &flen, opt); if (err == -ENOSPC && pre) { pre = 0; if (*pre_alloc) *pre_alloc = 0; continue; } if (err) goto out; if (vcn == vcn0) { /* Return the first fragment. */ if (new_lcn) *new_lcn = lcn; if (new_len) *new_len = flen; } /* Add new fragment into run storage. 
*/ if (!run_add_entry(run, vcn, lcn, flen, opt & ALLOCATE_MFT)) { /* Undo last 'ntfs_look_for_free_space' */ mark_as_free_ex(sbi, lcn, len, false); err = -ENOMEM; goto out; } if (opt & ALLOCATE_ZERO) { u8 shift = sbi->cluster_bits - SECTOR_SHIFT; err = blkdev_issue_zeroout(sbi->sb->s_bdev, (sector_t)lcn << shift, (sector_t)flen << shift, GFP_NOFS, 0); if (err) goto out; } vcn += flen; if (flen >= len || (opt & ALLOCATE_MFT) || (fr && run->count - cnt >= fr)) { *alen = vcn - vcn0; return 0; } len -= flen; } out: /* Undo 'ntfs_look_for_free_space' */ if (vcn - vcn0) { run_deallocate_ex(sbi, run, vcn0, vcn - vcn0, NULL, false); run_truncate(run, vcn0); } return err; } /* * attr_make_nonresident * * If page is not NULL - it is already contains resident data * and locked (called from ni_write_frame()). */ int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr, struct ATTR_LIST_ENTRY *le, struct mft_inode *mi, u64 new_size, struct runs_tree *run, struct ATTRIB **ins_attr, struct page *page) { struct ntfs_sb_info *sbi; struct ATTRIB *attr_s; struct MFT_REC *rec; u32 used, asize, rsize, aoff; bool is_data; CLST len, alen; char *next; int err; if (attr->non_res) { *ins_attr = attr; return 0; } sbi = mi->sbi; rec = mi->mrec; attr_s = NULL; used = le32_to_cpu(rec->used); asize = le32_to_cpu(attr->size); next = Add2Ptr(attr, asize); aoff = PtrOffset(rec, attr); rsize = le32_to_cpu(attr->res.data_size); is_data = attr->type == ATTR_DATA && !attr->name_len; /* len - how many clusters required to store 'rsize' bytes */ if (is_attr_compressed(attr)) { u8 shift = sbi->cluster_bits + NTFS_LZNT_CUNIT; len = ((rsize + (1u << shift) - 1) >> shift) << NTFS_LZNT_CUNIT; } else { len = bytes_to_cluster(sbi, rsize); } run_init(run); /* Make a copy of original attribute. */ attr_s = kmemdup(attr, asize, GFP_NOFS); if (!attr_s) { err = -ENOMEM; goto out; } if (!len) { /* Empty resident -> Empty nonresident. */ alen = 0; } else { const char *data = resident_data(attr); err = attr_allocate_clusters(sbi, run, 0, 0, len, NULL, ALLOCATE_DEF, &alen, 0, NULL, NULL); if (err) goto out1; if (!rsize) { /* Empty resident -> Non empty nonresident. */ } else if (!is_data) { err = ntfs_sb_write_run(sbi, run, 0, data, rsize, 0); if (err) goto out2; } else if (!page) { struct address_space *mapping = ni->vfs_inode.i_mapping; struct folio *folio; folio = __filemap_get_folio( mapping, 0, FGP_LOCK | FGP_ACCESSED | FGP_CREAT, mapping_gfp_mask(mapping)); if (IS_ERR(folio)) { err = PTR_ERR(folio); goto out2; } folio_fill_tail(folio, 0, data, rsize); folio_mark_uptodate(folio); folio_mark_dirty(folio); folio_unlock(folio); folio_put(folio); } } /* Remove original attribute. */ used -= asize; memmove(attr, Add2Ptr(attr, asize), used - aoff); rec->used = cpu_to_le32(used); mi->dirty = true; if (le) al_remove_le(ni, le); err = ni_insert_nonresident(ni, attr_s->type, attr_name(attr_s), attr_s->name_len, run, 0, alen, attr_s->flags, &attr, NULL, NULL); if (err) goto out3; kfree(attr_s); attr->nres.data_size = cpu_to_le64(rsize); attr->nres.valid_size = attr->nres.data_size; *ins_attr = attr; if (is_data) ni->ni_flags &= ~NI_FLAG_RESIDENT; /* Resident attribute becomes non resident. */ return 0; out3: attr = Add2Ptr(rec, aoff); memmove(next, attr, used - aoff); memcpy(attr, attr_s, asize); rec->used = cpu_to_le32(used + asize); mi->dirty = true; out2: /* Undo: do not trim new allocated clusters. 
*/ run_deallocate(sbi, run, false); run_close(run); out1: kfree(attr_s); out: return err; } /* * attr_set_size_res - Helper for attr_set_size(). */ static int attr_set_size_res(struct ntfs_inode *ni, struct ATTRIB *attr, struct ATTR_LIST_ENTRY *le, struct mft_inode *mi, u64 new_size, struct runs_tree *run, struct ATTRIB **ins_attr) { struct ntfs_sb_info *sbi = mi->sbi; struct MFT_REC *rec = mi->mrec; u32 used = le32_to_cpu(rec->used); u32 asize = le32_to_cpu(attr->size); u32 aoff = PtrOffset(rec, attr); u32 rsize = le32_to_cpu(attr->res.data_size); u32 tail = used - aoff - asize; char *next = Add2Ptr(attr, asize); s64 dsize = ALIGN(new_size, 8) - ALIGN(rsize, 8); if (dsize < 0) { memmove(next + dsize, next, tail); } else if (dsize > 0) { if (used + dsize > sbi->max_bytes_per_attr) return attr_make_nonresident(ni, attr, le, mi, new_size, run, ins_attr, NULL); memmove(next + dsize, next, tail); memset(next, 0, dsize); } if (new_size > rsize) memset(Add2Ptr(resident_data(attr), rsize), 0, new_size - rsize); rec->used = cpu_to_le32(used + dsize); attr->size = cpu_to_le32(asize + dsize); attr->res.data_size = cpu_to_le32(new_size); mi->dirty = true; *ins_attr = attr; return 0; } /* * attr_set_size - Change the size of attribute. * * Extend: * - Sparse/compressed: No allocated clusters. * - Normal: Append allocated and preallocated new clusters. * Shrink: * - No deallocate if @keep_prealloc is set. */ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name, u8 name_len, struct runs_tree *run, u64 new_size, const u64 *new_valid, bool keep_prealloc, struct ATTRIB **ret) { int err = 0; struct ntfs_sb_info *sbi = ni->mi.sbi; u8 cluster_bits = sbi->cluster_bits; bool is_mft = ni->mi.rno == MFT_REC_MFT && type == ATTR_DATA && !name_len; u64 old_valid, old_size, old_alloc, new_alloc, new_alloc_tmp; struct ATTRIB *attr = NULL, *attr_b; struct ATTR_LIST_ENTRY *le, *le_b; struct mft_inode *mi, *mi_b; CLST alen, vcn, lcn, new_alen, old_alen, svcn, evcn; CLST next_svcn, pre_alloc = -1, done = 0; bool is_ext, is_bad = false; bool dirty = false; u32 align; struct MFT_REC *rec; again: alen = 0; le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len, NULL, &mi_b); if (!attr_b) { err = -ENOENT; goto bad_inode; } if (!attr_b->non_res) { err = attr_set_size_res(ni, attr_b, le_b, mi_b, new_size, run, &attr_b); if (err) return err; /* Return if file is still resident. */ if (!attr_b->non_res) { dirty = true; goto ok1; } /* Layout of records may be changed, so do a full search. 
*/ goto again; } is_ext = is_attr_ext(attr_b); align = sbi->cluster_size; if (is_ext) align <<= attr_b->nres.c_unit; old_valid = le64_to_cpu(attr_b->nres.valid_size); old_size = le64_to_cpu(attr_b->nres.data_size); old_alloc = le64_to_cpu(attr_b->nres.alloc_size); again_1: old_alen = old_alloc >> cluster_bits; new_alloc = (new_size + align - 1) & ~(u64)(align - 1); new_alen = new_alloc >> cluster_bits; if (keep_prealloc && new_size < old_size) { attr_b->nres.data_size = cpu_to_le64(new_size); mi_b->dirty = dirty = true; goto ok; } vcn = old_alen - 1; svcn = le64_to_cpu(attr_b->nres.svcn); evcn = le64_to_cpu(attr_b->nres.evcn); if (svcn <= vcn && vcn <= evcn) { attr = attr_b; le = le_b; mi = mi_b; } else if (!le_b) { err = -EINVAL; goto bad_inode; } else { le = le_b; attr = ni_find_attr(ni, attr_b, &le, type, name, name_len, &vcn, &mi); if (!attr) { err = -EINVAL; goto bad_inode; } next_le_1: svcn = le64_to_cpu(attr->nres.svcn); evcn = le64_to_cpu(attr->nres.evcn); } /* * Here we have: * attr,mi,le - last attribute segment (containing 'vcn'). * attr_b,mi_b,le_b - base (primary) attribute segment. */ next_le: rec = mi->mrec; err = attr_load_runs(attr, ni, run, NULL); if (err) goto out; if (new_size > old_size) { CLST to_allocate; size_t free; if (new_alloc <= old_alloc) { attr_b->nres.data_size = cpu_to_le64(new_size); mi_b->dirty = dirty = true; goto ok; } /* * Add clusters. In simple case we have to: * - allocate space (vcn, lcn, len) * - update packed run in 'mi' * - update attr->nres.evcn * - update attr_b->nres.data_size/attr_b->nres.alloc_size */ to_allocate = new_alen - old_alen; add_alloc_in_same_attr_seg: lcn = 0; if (is_mft) { /* MFT allocates clusters from MFT zone. */ pre_alloc = 0; } else if (is_ext) { /* No preallocate for sparse/compress. */ pre_alloc = 0; } else if (pre_alloc == -1) { pre_alloc = 0; if (type == ATTR_DATA && !name_len && sbi->options->prealloc) { pre_alloc = bytes_to_cluster( sbi, get_pre_allocated( new_size)) - new_alen; } /* Get the last LCN to allocate from. */ if (old_alen && !run_lookup_entry(run, vcn, &lcn, NULL, NULL)) { lcn = SPARSE_LCN; } if (lcn == SPARSE_LCN) lcn = 0; else if (lcn) lcn += 1; free = wnd_zeroes(&sbi->used.bitmap); if (to_allocate > free) { err = -ENOSPC; goto out; } if (pre_alloc && to_allocate + pre_alloc > free) pre_alloc = 0; } vcn = old_alen; if (is_ext) { if (!run_add_entry(run, vcn, SPARSE_LCN, to_allocate, false)) { err = -ENOMEM; goto out; } alen = to_allocate; } else { /* ~3 bytes per fragment. */ err = attr_allocate_clusters( sbi, run, vcn, lcn, to_allocate, &pre_alloc, is_mft ? ALLOCATE_MFT : ALLOCATE_DEF, &alen, is_mft ? 0 : (sbi->record_size - le32_to_cpu(rec->used) + 8) / 3 + 1, NULL, NULL); if (err) goto out; } done += alen; vcn += alen; if (to_allocate > alen) to_allocate -= alen; else to_allocate = 0; pack_runs: err = mi_pack_runs(mi, attr, run, vcn - svcn); if (err) goto undo_1; next_svcn = le64_to_cpu(attr->nres.evcn) + 1; new_alloc_tmp = (u64)next_svcn << cluster_bits; attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp); mi_b->dirty = dirty = true; if (next_svcn >= vcn && !to_allocate) { /* Normal way. Update attribute and exit. */ attr_b->nres.data_size = cpu_to_le64(new_size); goto ok; } /* At least two MFT to avoid recursive loop. 
*/ if (is_mft && next_svcn == vcn && ((u64)done << sbi->cluster_bits) >= 2 * sbi->record_size) { new_size = new_alloc_tmp; attr_b->nres.data_size = attr_b->nres.alloc_size; goto ok; } if (le32_to_cpu(rec->used) < sbi->record_size) { old_alen = next_svcn; evcn = old_alen - 1; goto add_alloc_in_same_attr_seg; } attr_b->nres.data_size = attr_b->nres.alloc_size; if (new_alloc_tmp < old_valid) attr_b->nres.valid_size = attr_b->nres.data_size; if (type == ATTR_LIST) { err = ni_expand_list(ni); if (err) goto undo_2; if (next_svcn < vcn) goto pack_runs; /* Layout of records is changed. */ goto again; } if (!ni->attr_list.size) { err = ni_create_attr_list(ni); /* In case of error layout of records is not changed. */ if (err) goto undo_2; /* Layout of records is changed. */ } if (next_svcn >= vcn) { /* This is MFT data, repeat. */ goto again; } /* Insert new attribute segment. */ err = ni_insert_nonresident(ni, type, name, name_len, run, next_svcn, vcn - next_svcn, attr_b->flags, &attr, &mi, NULL); /* * Layout of records maybe changed. * Find base attribute to update. */ le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, type, name, name_len, NULL, &mi_b); if (!attr_b) { err = -EINVAL; goto bad_inode; } if (err) { /* ni_insert_nonresident failed. */ attr = NULL; goto undo_2; } /* keep runs for $MFT::$ATTR_DATA and $MFT::$ATTR_BITMAP. */ if (ni->mi.rno != MFT_REC_MFT) run_truncate_head(run, evcn + 1); svcn = le64_to_cpu(attr->nres.svcn); evcn = le64_to_cpu(attr->nres.evcn); /* * Attribute is in consistency state. * Save this point to restore to if next steps fail. */ old_valid = old_size = old_alloc = (u64)vcn << cluster_bits; attr_b->nres.valid_size = attr_b->nres.data_size = attr_b->nres.alloc_size = cpu_to_le64(old_size); mi_b->dirty = dirty = true; goto again_1; } if (new_size != old_size || (new_alloc != old_alloc && !keep_prealloc)) { /* * Truncate clusters. In simple case we have to: * - update packed run in 'mi' * - update attr->nres.evcn * - update attr_b->nres.data_size/attr_b->nres.alloc_size * - mark and trim clusters as free (vcn, lcn, len) */ CLST dlen = 0; vcn = max(svcn, new_alen); new_alloc_tmp = (u64)vcn << cluster_bits; if (vcn > svcn) { err = mi_pack_runs(mi, attr, run, vcn - svcn); if (err) goto out; } else if (le && le->vcn) { u16 le_sz = le16_to_cpu(le->size); /* * NOTE: List entries for one attribute are always * the same size. We deal with last entry (vcn==0) * and it is not first in entries array * (list entry for std attribute always first). * So it is safe to step back. */ mi_remove_attr(NULL, mi, attr); if (!al_remove_le(ni, le)) { err = -EINVAL; goto bad_inode; } le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz); } else { attr->nres.evcn = cpu_to_le64((u64)vcn - 1); mi->dirty = true; } attr_b->nres.alloc_size = cpu_to_le64(new_alloc_tmp); if (vcn == new_alen) { attr_b->nres.data_size = cpu_to_le64(new_size); if (new_size < old_valid) attr_b->nres.valid_size = attr_b->nres.data_size; } else { if (new_alloc_tmp <= le64_to_cpu(attr_b->nres.data_size)) attr_b->nres.data_size = attr_b->nres.alloc_size; if (new_alloc_tmp < le64_to_cpu(attr_b->nres.valid_size)) attr_b->nres.valid_size = attr_b->nres.alloc_size; } mi_b->dirty = dirty = true; err = run_deallocate_ex(sbi, run, vcn, evcn - vcn + 1, &dlen, true); if (err) goto out; if (is_ext) { /* dlen - really deallocated clusters. 
*/ le64_sub_cpu(&attr_b->nres.total_size, ((u64)dlen << cluster_bits)); } run_truncate(run, vcn); if (new_alloc_tmp <= new_alloc) goto ok; old_size = new_alloc_tmp; vcn = svcn - 1; if (le == le_b) { attr = attr_b; mi = mi_b; evcn = svcn - 1; svcn = 0; goto next_le; } if (le->type != type || le->name_len != name_len || memcmp(le_name(le), name, name_len * sizeof(short))) { err = -EINVAL; goto bad_inode; } err = ni_load_mi(ni, le, &mi); if (err) goto out; attr = mi_find_attr(ni, mi, NULL, type, name, name_len, &le->id); if (!attr) { err = -EINVAL; goto bad_inode; } goto next_le_1; } ok: if (new_valid) { __le64 valid = cpu_to_le64(min(*new_valid, new_size)); if (attr_b->nres.valid_size != valid) { attr_b->nres.valid_size = valid; mi_b->dirty = true; } } ok1: if (ret) *ret = attr_b; if (((type == ATTR_DATA && !name_len) || (type == ATTR_ALLOC && name == I30_NAME))) { /* Update inode_set_bytes. */ if (attr_b->non_res) { new_alloc = le64_to_cpu(attr_b->nres.alloc_size); if (inode_get_bytes(&ni->vfs_inode) != new_alloc) { inode_set_bytes(&ni->vfs_inode, new_alloc); dirty = true; } } /* Don't forget to update duplicate information in parent. */ if (dirty) { ni->ni_flags |= NI_FLAG_UPDATE_PARENT; mark_inode_dirty(&ni->vfs_inode); } } return 0; undo_2: vcn -= alen; attr_b->nres.data_size = cpu_to_le64(old_size); attr_b->nres.valid_size = cpu_to_le64(old_valid); attr_b->nres.alloc_size = cpu_to_le64(old_alloc); /* Restore 'attr' and 'mi'. */ if (attr) goto restore_run; if (le64_to_cpu(attr_b->nres.svcn) <= svcn && svcn <= le64_to_cpu(attr_b->nres.evcn)) { attr = attr_b; le = le_b; mi = mi_b; } else if (!le_b) { err = -EINVAL; goto bad_inode; } else { le = le_b; attr = ni_find_attr(ni, attr_b, &le, type, name, name_len, &svcn, &mi); if (!attr) goto bad_inode; } restore_run: if (mi_pack_runs(mi, attr, run, evcn - svcn + 1)) is_bad = true; undo_1: run_deallocate_ex(sbi, run, vcn, alen, NULL, false); run_truncate(run, vcn); out: if (is_bad) { bad_inode: _ntfs_bad_inode(&ni->vfs_inode); } return err; } /* * attr_data_get_block - Returns 'lcn' and 'len' for given 'vcn'. * * @new == NULL means just to get current mapping for 'vcn' * @new != NULL means allocate real cluster if 'vcn' maps to hole * @zero - zeroout new allocated clusters * * NOTE: * - @new != NULL is called only for sparsed or compressed attributes. * - new allocated clusters are zeroed via blkdev_issue_zeroout. */ int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *lcn, CLST *len, bool *new, bool zero) { int err = 0; struct runs_tree *run = &ni->file.run; struct ntfs_sb_info *sbi; u8 cluster_bits; struct ATTRIB *attr, *attr_b; struct ATTR_LIST_ENTRY *le, *le_b; struct mft_inode *mi, *mi_b; CLST hint, svcn, to_alloc, evcn1, next_svcn, asize, end, vcn0, alen; CLST alloc, evcn; unsigned fr; u64 total_size, total_size0; int step = 0; if (new) *new = false; /* Try to find in cache. */ down_read(&ni->file.run_lock); if (!run_lookup_entry(run, vcn, lcn, len, NULL)) *len = 0; up_read(&ni->file.run_lock); if (*len && (*lcn != SPARSE_LCN || !new)) return 0; /* Fast normal way without allocation. */ /* No cluster in cache or we need to allocate cluster in hole. */ sbi = ni->mi.sbi; cluster_bits = sbi->cluster_bits; ni_lock(ni); down_write(&ni->file.run_lock); /* Repeat the code above (under write lock). */ if (!run_lookup_entry(run, vcn, lcn, len, NULL)) *len = 0; if (*len) { if (*lcn != SPARSE_LCN || !new) goto out; /* normal way without allocation. 
*/ if (clen > *len) clen = *len; } le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { err = -ENOENT; goto out; } if (!attr_b->non_res) { *lcn = RESIDENT_LCN; *len = 1; goto out; } asize = le64_to_cpu(attr_b->nres.alloc_size) >> cluster_bits; if (vcn >= asize) { if (new) { err = -EINVAL; } else { *len = 1; *lcn = SPARSE_LCN; } goto out; } svcn = le64_to_cpu(attr_b->nres.svcn); evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1; attr = attr_b; le = le_b; mi = mi_b; if (le_b && (vcn < svcn || evcn1 <= vcn)) { attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn, &mi); if (!attr) { err = -EINVAL; goto out; } svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } /* Load in cache actual information. */ err = attr_load_runs(attr, ni, run, NULL); if (err) goto out; /* Check for compressed frame. */ err = attr_is_frame_compressed(ni, attr_b, vcn >> NTFS_LZNT_CUNIT, &hint, run); if (err) goto out; if (hint) { /* if frame is compressed - don't touch it. */ *lcn = COMPRESSED_LCN; /* length to the end of frame. */ *len = NTFS_LZNT_CLUSTERS - (vcn & (NTFS_LZNT_CLUSTERS - 1)); err = 0; goto out; } if (!*len) { if (run_lookup_entry(run, vcn, lcn, len, NULL)) { if (*lcn != SPARSE_LCN || !new) goto ok; /* Slow normal way without allocation. */ if (clen > *len) clen = *len; } else if (!new) { /* Here we may return -ENOENT. * In any case caller gets zero length. */ goto ok; } } if (!is_attr_ext(attr_b)) { /* The code below only for sparsed or compressed attributes. */ err = -EINVAL; goto out; } vcn0 = vcn; to_alloc = clen; fr = (sbi->record_size - le32_to_cpu(mi->mrec->used) + 8) / 3 + 1; /* Allocate frame aligned clusters. * ntfs.sys usually uses 16 clusters per frame for sparsed or compressed. * ntfs3 uses 1 cluster per frame for new created sparsed files. */ if (attr_b->nres.c_unit) { CLST clst_per_frame = 1u << attr_b->nres.c_unit; CLST cmask = ~(clst_per_frame - 1); /* Get frame aligned vcn and to_alloc. */ vcn = vcn0 & cmask; to_alloc = ((vcn0 + clen + clst_per_frame - 1) & cmask) - vcn; if (fr < clst_per_frame) fr = clst_per_frame; zero = true; /* Check if 'vcn' and 'vcn0' in different attribute segments. */ if (vcn < svcn || evcn1 <= vcn) { struct ATTRIB *attr2; /* Load runs for truncated vcn. */ attr2 = ni_find_attr(ni, attr_b, &le_b, ATTR_DATA, NULL, 0, &vcn, &mi); if (!attr2) { err = -EINVAL; goto out; } evcn1 = le64_to_cpu(attr2->nres.evcn) + 1; err = attr_load_runs(attr2, ni, run, NULL); if (err) goto out; } } if (vcn + to_alloc > asize) to_alloc = asize - vcn; /* Get the last LCN to allocate from. */ hint = 0; if (vcn > evcn1) { if (!run_add_entry(run, evcn1, SPARSE_LCN, vcn - evcn1, false)) { err = -ENOMEM; goto out; } } else if (vcn && !run_lookup_entry(run, vcn - 1, &hint, NULL, NULL)) { hint = -1; } /* Allocate and zeroout new clusters. */ err = attr_allocate_clusters(sbi, run, vcn, hint + 1, to_alloc, NULL, zero ? ALLOCATE_ZERO : ALLOCATE_DEF, &alen, fr, lcn, len); if (err) goto out; *new = true; step = 1; end = vcn + alen; /* Save 'total_size0' to restore if error. */ total_size0 = le64_to_cpu(attr_b->nres.total_size); total_size = total_size0 + ((u64)alen << cluster_bits); if (vcn != vcn0) { if (!run_lookup_entry(run, vcn0, lcn, len, NULL)) { err = -EINVAL; goto out; } if (*lcn == SPARSE_LCN) { /* Internal error. Should not happened. */ WARN_ON(1); err = -EINVAL; goto out; } /* Check case when vcn0 + len overlaps new allocated clusters. 
*/ if (vcn0 + *len > end) *len = end - vcn0; } repack: err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn); if (err) goto out; attr_b->nres.total_size = cpu_to_le64(total_size); inode_set_bytes(&ni->vfs_inode, total_size); ni->ni_flags |= NI_FLAG_UPDATE_PARENT; mi_b->dirty = true; mark_inode_dirty(&ni->vfs_inode); /* Stored [vcn : next_svcn) from [vcn : end). */ next_svcn = le64_to_cpu(attr->nres.evcn) + 1; if (end <= evcn1) { if (next_svcn == evcn1) { /* Normal way. Update attribute and exit. */ goto ok; } /* Add new segment [next_svcn : evcn1 - next_svcn). */ if (!ni->attr_list.size) { err = ni_create_attr_list(ni); if (err) goto undo1; /* Layout of records is changed. */ le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { err = -ENOENT; goto out; } attr = attr_b; le = le_b; mi = mi_b; goto repack; } } /* * The code below may require additional cluster (to extend attribute list) * and / or one MFT record * It is too complex to undo operations if -ENOSPC occurs deep inside * in 'ni_insert_nonresident'. * Return in advance -ENOSPC here if there are no free cluster and no free MFT. */ if (!ntfs_check_for_free_space(sbi, 1, 1)) { /* Undo step 1. */ err = -ENOSPC; goto undo1; } step = 2; svcn = evcn1; /* Estimate next attribute. */ attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi); if (!attr) { /* Insert new attribute segment. */ goto ins_ext; } /* Try to update existed attribute segment. */ alloc = bytes_to_cluster(sbi, le64_to_cpu(attr_b->nres.alloc_size)); evcn = le64_to_cpu(attr->nres.evcn); if (end < next_svcn) end = next_svcn; while (end > evcn) { /* Remove segment [svcn : evcn). */ mi_remove_attr(NULL, mi, attr); if (!al_remove_le(ni, le)) { err = -EINVAL; goto out; } if (evcn + 1 >= alloc) { /* Last attribute segment. */ evcn1 = evcn + 1; goto ins_ext; } if (ni_load_mi(ni, le, &mi)) { attr = NULL; goto out; } attr = mi_find_attr(ni, mi, NULL, ATTR_DATA, NULL, 0, &le->id); if (!attr) { err = -EINVAL; goto out; } svcn = le64_to_cpu(attr->nres.svcn); evcn = le64_to_cpu(attr->nres.evcn); } if (end < svcn) end = svcn; err = attr_load_runs(attr, ni, run, &end); if (err) goto out; evcn1 = evcn + 1; attr->nres.svcn = cpu_to_le64(next_svcn); err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn); if (err) goto out; le->vcn = cpu_to_le64(next_svcn); ni->attr_list.dirty = true; mi->dirty = true; next_svcn = le64_to_cpu(attr->nres.evcn) + 1; ins_ext: if (evcn1 > next_svcn) { err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run, next_svcn, evcn1 - next_svcn, attr_b->flags, &attr, &mi, NULL); if (err) goto out; } ok: run_truncate_around(run, vcn); out: if (err && step > 1) { /* Too complex to restore. */ _ntfs_bad_inode(&ni->vfs_inode); } up_write(&ni->file.run_lock); ni_unlock(ni); return err; undo1: /* Undo step1. 
*/ attr_b->nres.total_size = cpu_to_le64(total_size0); inode_set_bytes(&ni->vfs_inode, total_size0); if (run_deallocate_ex(sbi, run, vcn, alen, NULL, false) || !run_add_entry(run, vcn, SPARSE_LCN, alen, false) || mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn)) { _ntfs_bad_inode(&ni->vfs_inode); } goto out; } int attr_data_read_resident(struct ntfs_inode *ni, struct folio *folio) { u64 vbo; struct ATTRIB *attr; u32 data_size; size_t len; attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, NULL); if (!attr) return -EINVAL; if (attr->non_res) return E_NTFS_NONRESIDENT; vbo = folio->index << PAGE_SHIFT; data_size = le32_to_cpu(attr->res.data_size); if (vbo > data_size) len = 0; else len = min(data_size - vbo, folio_size(folio)); folio_fill_tail(folio, 0, resident_data(attr) + vbo, len); folio_mark_uptodate(folio); return 0; } int attr_data_write_resident(struct ntfs_inode *ni, struct folio *folio) { u64 vbo; struct mft_inode *mi; struct ATTRIB *attr; u32 data_size; attr = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, &mi); if (!attr) return -EINVAL; if (attr->non_res) { /* Return special error code to check this case. */ return E_NTFS_NONRESIDENT; } vbo = folio->index << PAGE_SHIFT; data_size = le32_to_cpu(attr->res.data_size); if (vbo < data_size) { char *data = resident_data(attr); size_t len = min(data_size - vbo, folio_size(folio)); memcpy_from_folio(data + vbo, folio, 0, len); mi->dirty = true; } ni->i_valid = data_size; return 0; } /* * attr_load_runs_vcn - Load runs with VCN. */ int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name, u8 name_len, struct runs_tree *run, CLST vcn) { struct ATTRIB *attr; int err; CLST svcn, evcn; u16 ro; if (!ni) { /* Is record corrupted? */ return -ENOENT; } attr = ni_find_attr(ni, NULL, NULL, type, name, name_len, &vcn, NULL); if (!attr) { /* Is record corrupted? */ return -ENOENT; } svcn = le64_to_cpu(attr->nres.svcn); evcn = le64_to_cpu(attr->nres.evcn); if (evcn < vcn || vcn < svcn) { /* Is record corrupted? */ return -EINVAL; } ro = le16_to_cpu(attr->nres.run_off); if (ro > le32_to_cpu(attr->size)) return -EINVAL; err = run_unpack_ex(run, ni->mi.sbi, ni->mi.rno, svcn, evcn, svcn, Add2Ptr(attr, ro), le32_to_cpu(attr->size) - ro); if (err < 0) return err; return 0; } /* * attr_load_runs_range - Load runs for given range [from to). */ int attr_load_runs_range(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name, u8 name_len, struct runs_tree *run, u64 from, u64 to) { struct ntfs_sb_info *sbi = ni->mi.sbi; u8 cluster_bits = sbi->cluster_bits; CLST vcn; CLST vcn_last = (to - 1) >> cluster_bits; CLST lcn, clen; int err; for (vcn = from >> cluster_bits; vcn <= vcn_last; vcn += clen) { if (!run_lookup_entry(run, vcn, &lcn, &clen, NULL)) { err = attr_load_runs_vcn(ni, type, name, name_len, run, vcn); if (err) return err; clen = 0; /* Next run_lookup_entry(vcn) must be success. */ } } return 0; } #ifdef CONFIG_NTFS3_LZX_XPRESS /* * attr_wof_frame_info * * Read header of Xpress/LZX file to get info about frame. */ int attr_wof_frame_info(struct ntfs_inode *ni, struct ATTRIB *attr, struct runs_tree *run, u64 frame, u64 frames, u8 frame_bits, u32 *ondisk_size, u64 *vbo_data) { struct ntfs_sb_info *sbi = ni->mi.sbi; u64 vbo[2], off[2], wof_size; u32 voff; u8 bytes_per_off; char *addr; struct folio *folio; int i, err; __le32 *off32; __le64 *off64; if (ni->vfs_inode.i_size < 0x100000000ull) { /* File starts with array of 32 bit offsets. 
*/ bytes_per_off = sizeof(__le32); vbo[1] = frame << 2; *vbo_data = frames << 2; } else { /* File starts with array of 64 bit offsets. */ bytes_per_off = sizeof(__le64); vbo[1] = frame << 3; *vbo_data = frames << 3; } /* * Read 4/8 bytes at [vbo - 4(8)] == offset where compressed frame starts. * Read 4/8 bytes at [vbo] == offset where compressed frame ends. */ if (!attr->non_res) { if (vbo[1] + bytes_per_off > le32_to_cpu(attr->res.data_size)) { _ntfs_bad_inode(&ni->vfs_inode); return -EINVAL; } addr = resident_data(attr); if (bytes_per_off == sizeof(__le32)) { off32 = Add2Ptr(addr, vbo[1]); off[0] = vbo[1] ? le32_to_cpu(off32[-1]) : 0; off[1] = le32_to_cpu(off32[0]); } else { off64 = Add2Ptr(addr, vbo[1]); off[0] = vbo[1] ? le64_to_cpu(off64[-1]) : 0; off[1] = le64_to_cpu(off64[0]); } *vbo_data += off[0]; *ondisk_size = off[1] - off[0]; return 0; } wof_size = le64_to_cpu(attr->nres.data_size); down_write(&ni->file.run_lock); folio = ni->file.offs_folio; if (!folio) { folio = folio_alloc(GFP_KERNEL, 0); if (!folio) { err = -ENOMEM; goto out; } folio->index = -1; ni->file.offs_folio = folio; } folio_lock(folio); addr = folio_address(folio); if (vbo[1]) { voff = vbo[1] & (PAGE_SIZE - 1); vbo[0] = vbo[1] - bytes_per_off; i = 0; } else { voff = 0; vbo[0] = 0; off[0] = 0; i = 1; } do { pgoff_t index = vbo[i] >> PAGE_SHIFT; if (index != folio->index) { struct page *page = &folio->page; u64 from = vbo[i] & ~(u64)(PAGE_SIZE - 1); u64 to = min(from + PAGE_SIZE, wof_size); err = attr_load_runs_range(ni, ATTR_DATA, WOF_NAME, ARRAY_SIZE(WOF_NAME), run, from, to); if (err) goto out1; err = ntfs_bio_pages(sbi, run, &page, 1, from, to - from, REQ_OP_READ); if (err) { folio->index = -1; goto out1; } folio->index = index; } if (i) { if (bytes_per_off == sizeof(__le32)) { off32 = Add2Ptr(addr, voff); off[1] = le32_to_cpu(*off32); } else { off64 = Add2Ptr(addr, voff); off[1] = le64_to_cpu(*off64); } } else if (!voff) { if (bytes_per_off == sizeof(__le32)) { off32 = Add2Ptr(addr, PAGE_SIZE - sizeof(u32)); off[0] = le32_to_cpu(*off32); } else { off64 = Add2Ptr(addr, PAGE_SIZE - sizeof(u64)); off[0] = le64_to_cpu(*off64); } } else { /* Two values in one page. */ if (bytes_per_off == sizeof(__le32)) { off32 = Add2Ptr(addr, voff); off[0] = le32_to_cpu(off32[-1]); off[1] = le32_to_cpu(off32[0]); } else { off64 = Add2Ptr(addr, voff); off[0] = le64_to_cpu(off64[-1]); off[1] = le64_to_cpu(off64[0]); } break; } } while (++i < 2); *vbo_data += off[0]; *ondisk_size = off[1] - off[0]; out1: folio_unlock(folio); out: up_write(&ni->file.run_lock); return err; } #endif /* * attr_is_frame_compressed - Used to detect compressed frame. * * attr - base (primary) attribute segment. * run - run to use, usually == &ni->file.run. * Only base segments contains valid 'attr->nres.c_unit' */ int attr_is_frame_compressed(struct ntfs_inode *ni, struct ATTRIB *attr, CLST frame, CLST *clst_data, struct runs_tree *run) { int err; u32 clst_frame; CLST clen, lcn, vcn, alen, slen, vcn_next; size_t idx; *clst_data = 0; if (!is_attr_compressed(attr)) return 0; if (!attr->non_res) return 0; clst_frame = 1u << attr->nres.c_unit; vcn = frame * clst_frame; if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) { err = attr_load_runs_vcn(ni, attr->type, attr_name(attr), attr->name_len, run, vcn); if (err) return err; if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) return -EINVAL; } if (lcn == SPARSE_LCN) { /* Sparsed frame. */ return 0; } if (clen >= clst_frame) { /* * The frame is not compressed 'cause * it does not contain any sparse clusters. 
*/ *clst_data = clst_frame; return 0; } alen = bytes_to_cluster(ni->mi.sbi, le64_to_cpu(attr->nres.alloc_size)); slen = 0; *clst_data = clen; /* * The frame is compressed if *clst_data + slen >= clst_frame. * Check next fragments. */ while ((vcn += clen) < alen) { vcn_next = vcn; if (!run_get_entry(run, ++idx, &vcn, &lcn, &clen) || vcn_next != vcn) { err = attr_load_runs_vcn(ni, attr->type, attr_name(attr), attr->name_len, run, vcn_next); if (err) return err; vcn = vcn_next; if (!run_lookup_entry(run, vcn, &lcn, &clen, &idx)) return -EINVAL; } if (lcn == SPARSE_LCN) { slen += clen; } else { if (slen) { /* * Data_clusters + sparse_clusters = * not enough for frame. */ return -EINVAL; } *clst_data += clen; } if (*clst_data + slen >= clst_frame) { if (!slen) { /* * There is no sparsed clusters in this frame * so it is not compressed. */ *clst_data = clst_frame; } else { /* Frame is compressed. */ } break; } } return 0; } /* * attr_allocate_frame - Allocate/free clusters for @frame. * * Assumed: down_write(&ni->file.run_lock); */ int attr_allocate_frame(struct ntfs_inode *ni, CLST frame, size_t compr_size, u64 new_valid) { int err = 0; struct runs_tree *run = &ni->file.run; struct ntfs_sb_info *sbi = ni->mi.sbi; struct ATTRIB *attr = NULL, *attr_b; struct ATTR_LIST_ENTRY *le, *le_b; struct mft_inode *mi, *mi_b; CLST svcn, evcn1, next_svcn, len; CLST vcn, end, clst_data; u64 total_size, valid_size, data_size; le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) return -ENOENT; if (!is_attr_ext(attr_b)) return -EINVAL; vcn = frame << NTFS_LZNT_CUNIT; total_size = le64_to_cpu(attr_b->nres.total_size); svcn = le64_to_cpu(attr_b->nres.svcn); evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1; data_size = le64_to_cpu(attr_b->nres.data_size); if (svcn <= vcn && vcn < evcn1) { attr = attr_b; le = le_b; mi = mi_b; } else if (!le_b) { err = -EINVAL; goto out; } else { le = le_b; attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn, &mi); if (!attr) { err = -EINVAL; goto out; } svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } err = attr_load_runs(attr, ni, run, NULL); if (err) goto out; err = attr_is_frame_compressed(ni, attr_b, frame, &clst_data, run); if (err) goto out; total_size -= (u64)clst_data << sbi->cluster_bits; len = bytes_to_cluster(sbi, compr_size); if (len == clst_data) goto out; if (len < clst_data) { err = run_deallocate_ex(sbi, run, vcn + len, clst_data - len, NULL, true); if (err) goto out; if (!run_add_entry(run, vcn + len, SPARSE_LCN, clst_data - len, false)) { err = -ENOMEM; goto out; } end = vcn + clst_data; /* Run contains updated range [vcn + len : end). */ } else { CLST alen, hint = 0; /* Get the last LCN to allocate from. */ if (vcn + clst_data && !run_lookup_entry(run, vcn + clst_data - 1, &hint, NULL, NULL)) { hint = -1; } err = attr_allocate_clusters(sbi, run, vcn + clst_data, hint + 1, len - clst_data, NULL, ALLOCATE_DEF, &alen, 0, NULL, NULL); if (err) goto out; end = vcn + len; /* Run contains updated range [vcn + clst_data : end). */ } total_size += (u64)len << sbi->cluster_bits; repack: err = mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn); if (err) goto out; attr_b->nres.total_size = cpu_to_le64(total_size); inode_set_bytes(&ni->vfs_inode, total_size); ni->ni_flags |= NI_FLAG_UPDATE_PARENT; mi_b->dirty = true; mark_inode_dirty(&ni->vfs_inode); /* Stored [vcn : next_svcn) from [vcn : end). 
*/ next_svcn = le64_to_cpu(attr->nres.evcn) + 1; if (end <= evcn1) { if (next_svcn == evcn1) { /* Normal way. Update attribute and exit. */ goto ok; } /* Add new segment [next_svcn : evcn1 - next_svcn). */ if (!ni->attr_list.size) { err = ni_create_attr_list(ni); if (err) goto out; /* Layout of records is changed. */ le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { err = -ENOENT; goto out; } attr = attr_b; le = le_b; mi = mi_b; goto repack; } } svcn = evcn1; /* Estimate next attribute. */ attr = ni_find_attr(ni, attr, &le, ATTR_DATA, NULL, 0, &svcn, &mi); if (attr) { CLST alloc = bytes_to_cluster( sbi, le64_to_cpu(attr_b->nres.alloc_size)); CLST evcn = le64_to_cpu(attr->nres.evcn); if (end < next_svcn) end = next_svcn; while (end > evcn) { /* Remove segment [svcn : evcn). */ mi_remove_attr(NULL, mi, attr); if (!al_remove_le(ni, le)) { err = -EINVAL; goto out; } if (evcn + 1 >= alloc) { /* Last attribute segment. */ evcn1 = evcn + 1; goto ins_ext; } if (ni_load_mi(ni, le, &mi)) { attr = NULL; goto out; } attr = mi_find_attr(ni, mi, NULL, ATTR_DATA, NULL, 0, &le->id); if (!attr) { err = -EINVAL; goto out; } svcn = le64_to_cpu(attr->nres.svcn); evcn = le64_to_cpu(attr->nres.evcn); } if (end < svcn) end = svcn; err = attr_load_runs(attr, ni, run, &end); if (err) goto out; evcn1 = evcn + 1; attr->nres.svcn = cpu_to_le64(next_svcn); err = mi_pack_runs(mi, attr, run, evcn1 - next_svcn); if (err) goto out; le->vcn = cpu_to_le64(next_svcn); ni->attr_list.dirty = true; mi->dirty = true; next_svcn = le64_to_cpu(attr->nres.evcn) + 1; } ins_ext: if (evcn1 > next_svcn) { err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run, next_svcn, evcn1 - next_svcn, attr_b->flags, &attr, &mi, NULL); if (err) goto out; } ok: run_truncate_around(run, vcn); out: if (attr_b) { if (new_valid > data_size) new_valid = data_size; valid_size = le64_to_cpu(attr_b->nres.valid_size); if (new_valid != valid_size) { attr_b->nres.valid_size = cpu_to_le64(valid_size); mi_b->dirty = true; } } return err; } /* * attr_collapse_range - Collapse range in file. */ int attr_collapse_range(struct ntfs_inode *ni, u64 vbo, u64 bytes) { int err = 0; struct runs_tree *run = &ni->file.run; struct ntfs_sb_info *sbi = ni->mi.sbi; struct ATTRIB *attr = NULL, *attr_b; struct ATTR_LIST_ENTRY *le, *le_b; struct mft_inode *mi, *mi_b; CLST svcn, evcn1, len, dealloc, alen; CLST vcn, end; u64 valid_size, data_size, alloc_size, total_size; u32 mask; __le16 a_flags; if (!bytes) return 0; le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) return -ENOENT; if (!attr_b->non_res) { /* Attribute is resident. Nothing to do? */ return 0; } data_size = le64_to_cpu(attr_b->nres.data_size); alloc_size = le64_to_cpu(attr_b->nres.alloc_size); a_flags = attr_b->flags; if (is_attr_ext(attr_b)) { total_size = le64_to_cpu(attr_b->nres.total_size); mask = (sbi->cluster_size << attr_b->nres.c_unit) - 1; } else { total_size = alloc_size; mask = sbi->cluster_mask; } if ((vbo & mask) || (bytes & mask)) { /* Allow to collapse only cluster aligned ranges. */ return -EINVAL; } if (vbo > data_size) return -EINVAL; down_write(&ni->file.run_lock); if (vbo + bytes >= data_size) { u64 new_valid = min(ni->i_valid, vbo); /* Simple truncate file at 'vbo'. 
*/ truncate_setsize(&ni->vfs_inode, vbo); err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, vbo, &new_valid, true, NULL); if (!err && new_valid < ni->i_valid) ni->i_valid = new_valid; goto out; } /* * Enumerate all attribute segments and collapse. */ alen = alloc_size >> sbi->cluster_bits; vcn = vbo >> sbi->cluster_bits; len = bytes >> sbi->cluster_bits; end = vcn + len; dealloc = 0; svcn = le64_to_cpu(attr_b->nres.svcn); evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1; if (svcn <= vcn && vcn < evcn1) { attr = attr_b; le = le_b; mi = mi_b; } else if (!le_b) { err = -EINVAL; goto out; } else { le = le_b; attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn, &mi); if (!attr) { err = -EINVAL; goto out; } svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } for (;;) { if (svcn >= end) { /* Shift VCN- */ attr->nres.svcn = cpu_to_le64(svcn - len); attr->nres.evcn = cpu_to_le64(evcn1 - 1 - len); if (le) { le->vcn = attr->nres.svcn; ni->attr_list.dirty = true; } mi->dirty = true; } else if (svcn < vcn || end < evcn1) { CLST vcn1, eat, next_svcn; /* Collapse a part of this attribute segment. */ err = attr_load_runs(attr, ni, run, &svcn); if (err) goto out; vcn1 = max(vcn, svcn); eat = min(end, evcn1) - vcn1; err = run_deallocate_ex(sbi, run, vcn1, eat, &dealloc, true); if (err) goto out; if (!run_collapse_range(run, vcn1, eat)) { err = -ENOMEM; goto out; } if (svcn >= vcn) { /* Shift VCN */ attr->nres.svcn = cpu_to_le64(vcn); if (le) { le->vcn = attr->nres.svcn; ni->attr_list.dirty = true; } } err = mi_pack_runs(mi, attr, run, evcn1 - svcn - eat); if (err) goto out; next_svcn = le64_to_cpu(attr->nres.evcn) + 1; if (next_svcn + eat < evcn1) { err = ni_insert_nonresident( ni, ATTR_DATA, NULL, 0, run, next_svcn, evcn1 - eat - next_svcn, a_flags, &attr, &mi, &le); if (err) goto out; /* Layout of records maybe changed. */ attr_b = NULL; } /* Free all allocated memory. */ run_truncate(run, 0); } else { u16 le_sz; u16 roff = le16_to_cpu(attr->nres.run_off); if (roff > le32_to_cpu(attr->size)) { err = -EINVAL; goto out; } run_unpack_ex(RUN_DEALLOCATE, sbi, ni->mi.rno, svcn, evcn1 - 1, svcn, Add2Ptr(attr, roff), le32_to_cpu(attr->size) - roff); /* Delete this attribute segment. */ mi_remove_attr(NULL, mi, attr); if (!le) break; le_sz = le16_to_cpu(le->size); if (!al_remove_le(ni, le)) { err = -EINVAL; goto out; } if (evcn1 >= alen) break; if (!svcn) { /* Load next record that contains this attribute. */ if (ni_load_mi(ni, le, &mi)) { err = -EINVAL; goto out; } /* Look for required attribute. 
*/ attr = mi_find_attr(ni, mi, NULL, ATTR_DATA, NULL, 0, &le->id); if (!attr) { err = -EINVAL; goto out; } goto next_attr; } le = (struct ATTR_LIST_ENTRY *)((u8 *)le - le_sz); } if (evcn1 >= alen) break; attr = ni_enum_attr_ex(ni, attr, &le, &mi); if (!attr) { err = -EINVAL; goto out; } next_attr: svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } if (!attr_b) { le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { err = -ENOENT; goto out; } } data_size -= bytes; valid_size = ni->i_valid; if (vbo + bytes <= valid_size) valid_size -= bytes; else if (vbo < valid_size) valid_size = vbo; attr_b->nres.alloc_size = cpu_to_le64(alloc_size - bytes); attr_b->nres.data_size = cpu_to_le64(data_size); attr_b->nres.valid_size = cpu_to_le64(min(valid_size, data_size)); total_size -= (u64)dealloc << sbi->cluster_bits; if (is_attr_ext(attr_b)) attr_b->nres.total_size = cpu_to_le64(total_size); mi_b->dirty = true; /* Update inode size. */ ni->i_valid = valid_size; i_size_write(&ni->vfs_inode, data_size); inode_set_bytes(&ni->vfs_inode, total_size); ni->ni_flags |= NI_FLAG_UPDATE_PARENT; mark_inode_dirty(&ni->vfs_inode); out: up_write(&ni->file.run_lock); if (err) _ntfs_bad_inode(&ni->vfs_inode); return err; } /* * attr_punch_hole * * Not for normal files. */ int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u64 bytes, u32 *frame_size) { int err = 0; struct runs_tree *run = &ni->file.run; struct ntfs_sb_info *sbi = ni->mi.sbi; struct ATTRIB *attr = NULL, *attr_b; struct ATTR_LIST_ENTRY *le, *le_b; struct mft_inode *mi, *mi_b; CLST svcn, evcn1, vcn, len, end, alen, hole, next_svcn; u64 total_size, alloc_size; u32 mask; __le16 a_flags; struct runs_tree run2; if (!bytes) return 0; le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) return -ENOENT; if (!attr_b->non_res) { u32 data_size = le32_to_cpu(attr_b->res.data_size); u32 from, to; if (vbo > data_size) return 0; from = vbo; to = min_t(u64, vbo + bytes, data_size); memset(Add2Ptr(resident_data(attr_b), from), 0, to - from); return 0; } if (!is_attr_ext(attr_b)) return -EOPNOTSUPP; alloc_size = le64_to_cpu(attr_b->nres.alloc_size); total_size = le64_to_cpu(attr_b->nres.total_size); if (vbo >= alloc_size) { /* NOTE: It is allowed. */ return 0; } mask = (sbi->cluster_size << attr_b->nres.c_unit) - 1; bytes += vbo; if (bytes > alloc_size) bytes = alloc_size; bytes -= vbo; if ((vbo & mask) || (bytes & mask)) { /* We have to zero a range(s). */ if (frame_size == NULL) { /* Caller insists range is aligned. */ return -EINVAL; } *frame_size = mask + 1; return E_NTFS_NOTALIGNED; } down_write(&ni->file.run_lock); run_init(&run2); run_truncate(run, 0); /* * Enumerate all attribute segments and punch hole where necessary. 
*/ alen = alloc_size >> sbi->cluster_bits; vcn = vbo >> sbi->cluster_bits; len = bytes >> sbi->cluster_bits; end = vcn + len; hole = 0; svcn = le64_to_cpu(attr_b->nres.svcn); evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1; a_flags = attr_b->flags; if (svcn <= vcn && vcn < evcn1) { attr = attr_b; le = le_b; mi = mi_b; } else if (!le_b) { err = -EINVAL; goto bad_inode; } else { le = le_b; attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn, &mi); if (!attr) { err = -EINVAL; goto bad_inode; } svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } while (svcn < end) { CLST vcn1, zero, hole2 = hole; err = attr_load_runs(attr, ni, run, &svcn); if (err) goto done; vcn1 = max(vcn, svcn); zero = min(end, evcn1) - vcn1; /* * Check range [vcn1 + zero). * Calculate how many clusters there are. * Don't do any destructive actions. */ err = run_deallocate_ex(NULL, run, vcn1, zero, &hole2, false); if (err) goto done; /* Check if required range is already hole. */ if (hole2 == hole) goto next_attr; /* Make a clone of run to undo. */ err = run_clone(run, &run2); if (err) goto done; /* Make a hole range (sparse) [vcn1 + zero). */ if (!run_add_entry(run, vcn1, SPARSE_LCN, zero, false)) { err = -ENOMEM; goto done; } /* Update run in attribute segment. */ err = mi_pack_runs(mi, attr, run, evcn1 - svcn); if (err) goto done; next_svcn = le64_to_cpu(attr->nres.evcn) + 1; if (next_svcn < evcn1) { /* Insert new attribute segment. */ err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run, next_svcn, evcn1 - next_svcn, a_flags, &attr, &mi, &le); if (err) goto undo_punch; /* Layout of records maybe changed. */ attr_b = NULL; } /* Real deallocate. Should not fail. */ run_deallocate_ex(sbi, &run2, vcn1, zero, &hole, true); next_attr: /* Free all allocated memory. */ run_truncate(run, 0); if (evcn1 >= alen) break; /* Get next attribute segment. */ attr = ni_enum_attr_ex(ni, attr, &le, &mi); if (!attr) { err = -EINVAL; goto bad_inode; } svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } done: if (!hole) goto out; if (!attr_b) { attr_b = ni_find_attr(ni, NULL, NULL, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { err = -EINVAL; goto bad_inode; } } total_size -= (u64)hole << sbi->cluster_bits; attr_b->nres.total_size = cpu_to_le64(total_size); mi_b->dirty = true; /* Update inode size. */ inode_set_bytes(&ni->vfs_inode, total_size); ni->ni_flags |= NI_FLAG_UPDATE_PARENT; mark_inode_dirty(&ni->vfs_inode); out: run_close(&run2); up_write(&ni->file.run_lock); return err; bad_inode: _ntfs_bad_inode(&ni->vfs_inode); goto out; undo_punch: /* * Restore packed runs. * 'mi_pack_runs' should not fail, cause we restore original. */ if (mi_pack_runs(mi, attr, &run2, evcn1 - svcn)) goto bad_inode; goto done; } /* * attr_insert_range - Insert range (hole) in file. * Not for normal files. */ int attr_insert_range(struct ntfs_inode *ni, u64 vbo, u64 bytes) { int err = 0; struct runs_tree *run = &ni->file.run; struct ntfs_sb_info *sbi = ni->mi.sbi; struct ATTRIB *attr = NULL, *attr_b; struct ATTR_LIST_ENTRY *le, *le_b; struct mft_inode *mi, *mi_b; CLST vcn, svcn, evcn1, len, next_svcn; u64 data_size, alloc_size; u32 mask; __le16 a_flags; if (!bytes) return 0; le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) return -ENOENT; if (!is_attr_ext(attr_b)) { /* It was checked above. See fallocate. 
*/ return -EOPNOTSUPP; } if (!attr_b->non_res) { data_size = le32_to_cpu(attr_b->res.data_size); alloc_size = data_size; mask = sbi->cluster_mask; /* cluster_size - 1 */ } else { data_size = le64_to_cpu(attr_b->nres.data_size); alloc_size = le64_to_cpu(attr_b->nres.alloc_size); mask = (sbi->cluster_size << attr_b->nres.c_unit) - 1; } if (vbo >= data_size) { /* * Insert range after the file size is not allowed. * If the offset is equal to or greater than the end of * file, an error is returned. For such operations (i.e., inserting * a hole at the end of file), ftruncate(2) should be used. */ return -EINVAL; } if ((vbo & mask) || (bytes & mask)) { /* Allow to insert only frame aligned ranges. */ return -EINVAL; } /* * valid_size <= data_size <= alloc_size * Check alloc_size for maximum possible. */ if (bytes > sbi->maxbytes_sparse - alloc_size) return -EFBIG; vcn = vbo >> sbi->cluster_bits; len = bytes >> sbi->cluster_bits; down_write(&ni->file.run_lock); if (!attr_b->non_res) { err = attr_set_size(ni, ATTR_DATA, NULL, 0, run, data_size + bytes, NULL, false, NULL); le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { err = -EINVAL; goto bad_inode; } if (err) goto out; if (!attr_b->non_res) { /* Still resident. */ char *data = Add2Ptr(attr_b, le16_to_cpu(attr_b->res.data_off)); memmove(data + bytes, data, bytes); memset(data, 0, bytes); goto done; } /* Resident files becomes nonresident. */ data_size = le64_to_cpu(attr_b->nres.data_size); alloc_size = le64_to_cpu(attr_b->nres.alloc_size); } /* * Enumerate all attribute segments and shift start vcn. */ a_flags = attr_b->flags; svcn = le64_to_cpu(attr_b->nres.svcn); evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1; if (svcn <= vcn && vcn < evcn1) { attr = attr_b; le = le_b; mi = mi_b; } else if (!le_b) { err = -EINVAL; goto bad_inode; } else { le = le_b; attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn, &mi); if (!attr) { err = -EINVAL; goto bad_inode; } svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } run_truncate(run, 0); /* clear cached values. */ err = attr_load_runs(attr, ni, run, NULL); if (err) goto out; if (!run_insert_range(run, vcn, len)) { err = -ENOMEM; goto out; } /* Try to pack in current record as much as possible. */ err = mi_pack_runs(mi, attr, run, evcn1 + len - svcn); if (err) goto out; next_svcn = le64_to_cpu(attr->nres.evcn) + 1; while ((attr = ni_enum_attr_ex(ni, attr, &le, &mi)) && attr->type == ATTR_DATA && !attr->name_len) { le64_add_cpu(&attr->nres.svcn, len); le64_add_cpu(&attr->nres.evcn, len); if (le) { le->vcn = attr->nres.svcn; ni->attr_list.dirty = true; } mi->dirty = true; } if (next_svcn < evcn1 + len) { err = ni_insert_nonresident(ni, ATTR_DATA, NULL, 0, run, next_svcn, evcn1 + len - next_svcn, a_flags, NULL, NULL, NULL); le_b = NULL; attr_b = ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { err = -EINVAL; goto bad_inode; } if (err) { /* ni_insert_nonresident failed. Try to undo. */ goto undo_insert_range; } } /* * Update primary attribute segment. */ if (vbo <= ni->i_valid) ni->i_valid += bytes; attr_b->nres.data_size = cpu_to_le64(data_size + bytes); attr_b->nres.alloc_size = cpu_to_le64(alloc_size + bytes); /* ni->valid may be not equal valid_size (temporary). 
*/ if (ni->i_valid > data_size + bytes) attr_b->nres.valid_size = attr_b->nres.data_size; else attr_b->nres.valid_size = cpu_to_le64(ni->i_valid); mi_b->dirty = true; done: i_size_write(&ni->vfs_inode, ni->vfs_inode.i_size + bytes); ni->ni_flags |= NI_FLAG_UPDATE_PARENT; mark_inode_dirty(&ni->vfs_inode); out: run_truncate(run, 0); /* clear cached values. */ up_write(&ni->file.run_lock); return err; bad_inode: _ntfs_bad_inode(&ni->vfs_inode); goto out; undo_insert_range: svcn = le64_to_cpu(attr_b->nres.svcn); evcn1 = le64_to_cpu(attr_b->nres.evcn) + 1; if (svcn <= vcn && vcn < evcn1) { attr = attr_b; le = le_b; mi = mi_b; } else if (!le_b) { goto bad_inode; } else { le = le_b; attr = ni_find_attr(ni, attr_b, &le, ATTR_DATA, NULL, 0, &vcn, &mi); if (!attr) { goto bad_inode; } svcn = le64_to_cpu(attr->nres.svcn); evcn1 = le64_to_cpu(attr->nres.evcn) + 1; } if (attr_load_runs(attr, ni, run, NULL)) goto bad_inode; if (!run_collapse_range(run, vcn, len)) goto bad_inode; if (mi_pack_runs(mi, attr, run, evcn1 + len - svcn)) goto bad_inode; while ((attr = ni_enum_attr_ex(ni, attr, &le, &mi)) && attr->type == ATTR_DATA && !attr->name_len) { le64_sub_cpu(&attr->nres.svcn, len); le64_sub_cpu(&attr->nres.evcn, len); if (le) { le->vcn = attr->nres.svcn; ni->attr_list.dirty = true; } mi->dirty = true; } goto out; } /* * attr_force_nonresident * * Convert default data attribute into non resident form. */ int attr_force_nonresident(struct ntfs_inode *ni) { int err; struct ATTRIB *attr; struct ATTR_LIST_ENTRY *le = NULL; struct mft_inode *mi; attr = ni_find_attr(ni, NULL, &le, ATTR_DATA, NULL, 0, NULL, &mi); if (!attr) { _ntfs_bad_inode(&ni->vfs_inode); return -ENOENT; } if (attr->non_res) { /* Already non resident. */ return 0; } down_write(&ni->file.run_lock); err = attr_make_nonresident(ni, attr, le, mi, le32_to_cpu(attr->res.data_size), &ni->file.run, &attr, NULL); up_write(&ni->file.run_lock); return err; } |
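/*
 * Illustrative sketch (not part of the ntfs3 sources above): how a caller
 * might use attr_data_get_block() per its documented contract. The helper
 * name demo_map_cluster() and the pr_info() reporting are hypothetical; the
 * sketch only shows the documented outcomes for a VCN: a real LCN, SPARSE_LCN
 * for an untouched hole (when @new == NULL), or a fresh allocation (when
 * @new != NULL, with *new reporting whether clusters were actually added).
 * RESIDENT_LCN / COMPRESSED_LCN results are possible too and are not handled
 * here. CLST is assumed to be a 32-bit cluster index.
 */
static int demo_map_cluster(struct ntfs_inode *ni, CLST vcn, bool want_alloc)
{
	CLST lcn, len;
	bool new = false;
	int err;

	/* Pass &new only when holes should be allocated; NULL just queries. */
	err = attr_data_get_block(ni, vcn, 1, &lcn, &len,
				  want_alloc ? &new : NULL,
				  true /* zero newly allocated clusters */);
	if (err)
		return err;

	if (lcn == SPARSE_LCN)
		pr_info("vcn %u is a hole (%u clusters)\n",
			(unsigned int)vcn, (unsigned int)len);
	else
		pr_info("vcn %u -> lcn %u (%u clusters)%s\n",
			(unsigned int)vcn, (unsigned int)lcn,
			(unsigned int)len, new ? ", newly allocated" : "");
	return 0;
}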
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _BCACHEFS_BBPOS_H
#define _BCACHEFS_BBPOS_H

#include "bbpos_types.h"
#include "bkey_methods.h"
#include "btree_cache.h"

static inline int bbpos_cmp(struct bbpos l, struct bbpos r)
{
	return cmp_int(l.btree, r.btree) ?: bpos_cmp(l.pos, r.pos);
}

static inline struct bbpos bbpos_successor(struct bbpos pos)
{
	if (bpos_cmp(pos.pos, SPOS_MAX)) {
		pos.pos = bpos_successor(pos.pos);
		return pos;
	}

	if (pos.btree != BTREE_ID_NR) {
		pos.btree++;
		pos.pos = POS_MIN;
		return pos;
	}

	BUG();
}

static inline void bch2_bbpos_to_text(struct printbuf *out, struct bbpos pos)
{
	bch2_btree_id_to_text(out, pos.btree);
	prt_char(out, ':');
	bch2_bpos_to_text(out, pos.pos);
}

#endif /* _BCACHEFS_BBPOS_H */
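/*
 * Illustrative sketch (not part of bbpos.h above): walking every
 * btree/position pair in order with the helpers defined above. The start and
 * end values and the visit() callback are hypothetical; the loop only
 * demonstrates the bbpos_cmp()/bbpos_successor() iteration pattern, where
 * bbpos_successor() advances within a btree until SPOS_MAX and then steps to
 * POS_MIN of the next btree id.
 */
static inline void bbpos_walk_demo(struct bbpos start, struct bbpos end,
				   void (*visit)(struct bbpos))
{
	struct bbpos pos = start;

	while (bbpos_cmp(pos, end) < 0) {
		visit(pos);
		/* BUG()s if advanced past the last position of the last btree. */
		pos = bbpos_successor(pos);
	}
}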
// SPDX-License-Identifier: GPL-2.0
/* net/atm/proc.c - ATM /proc interface
 *
 * Written 1995-2000 by Werner Almesberger, EPFL LRC/ICA
 *
 * seq_file api usage by romieu@fr.zoreil.com
 *
 * Evaluating the efficiency of the whole thing if left as an exercise to
 * the reader.
 */

#include <linux/module.h> /* for EXPORT_SYMBOL */
#include <linux/string.h>
#include <linux/types.h>
#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/stat.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/errno.h>
#include <linux/atm.h>
#include <linux/atmdev.h>
#include <linux/netdevice.h>
#include <linux/atmclip.h>
#include <linux/init.h> /* for __init */
#include <linux/slab.h>
#include <net/net_namespace.h>
#include <net/atmclip.h>
#include <linux/uaccess.h>
#include <linux/param.h> /* for HZ */
#include <linux/atomic.h>
#include "resources.h"
#include "common.h" /* atm_proc_init prototype */
#include "signaling.h" /* to get sigd - ugly too */

static ssize_t proc_dev_atm_read(struct file *file, char __user *buf,
				 size_t count, loff_t *pos);

static const struct proc_ops atm_dev_proc_ops = {
	.proc_read	= proc_dev_atm_read,
	.proc_lseek	= noop_llseek,
};

static void add_stats(struct seq_file *seq, const char *aal,
		      const struct k_atm_aal_stats *stats)
{
	seq_printf(seq, "%s ( %d %d %d %d %d )", aal,
		   atomic_read(&stats->tx), atomic_read(&stats->tx_err),
		   atomic_read(&stats->rx), atomic_read(&stats->rx_err),
		   atomic_read(&stats->rx_drop));
}

static void atm_dev_info(struct seq_file *seq, const struct atm_dev *dev)
{
	int i;

	seq_printf(seq, "%3d %-8s", dev->number, dev->type);
	for (i = 0; i < ESI_LEN; i++)
		seq_printf(seq, "%02x", dev->esi[i]);
	seq_puts(seq, " ");
	add_stats(seq, "0", &dev->stats.aal0);
	seq_puts(seq, " ");
	add_stats(seq, "5", &dev->stats.aal5);
	seq_printf(seq, "\t[%d]", refcount_read(&dev->refcnt));
	seq_putc(seq, '\n');
}

struct vcc_state {
	int bucket;
	struct sock *sk;
};

static inline int compare_family(struct sock *sk, int family)
{
	return !family || (sk->sk_family == family);
}

static int
__vcc_walk(struct sock **sock, int family, int *bucket, loff_t l) { struct sock *sk = *sock; if (sk == SEQ_START_TOKEN) { for (*bucket = 0; *bucket < VCC_HTABLE_SIZE; ++*bucket) { struct hlist_head *head = &vcc_hash[*bucket]; sk = hlist_empty(head) ? NULL : __sk_head(head); if (sk) break; } l--; } try_again: for (; sk; sk = sk_next(sk)) { l -= compare_family(sk, family); if (l < 0) goto out; } if (!sk && ++*bucket < VCC_HTABLE_SIZE) { sk = sk_head(&vcc_hash[*bucket]); goto try_again; } sk = SEQ_START_TOKEN; out: *sock = sk; return (l < 0); } static inline void *vcc_walk(struct seq_file *seq, loff_t l) { struct vcc_state *state = seq->private; int family = (uintptr_t)(pde_data(file_inode(seq->file))); return __vcc_walk(&state->sk, family, &state->bucket, l) ? state : NULL; } static void *vcc_seq_start(struct seq_file *seq, loff_t *pos) __acquires(vcc_sklist_lock) { struct vcc_state *state = seq->private; loff_t left = *pos; read_lock(&vcc_sklist_lock); state->sk = SEQ_START_TOKEN; return left ? vcc_walk(seq, left) : SEQ_START_TOKEN; } static void vcc_seq_stop(struct seq_file *seq, void *v) __releases(vcc_sklist_lock) { read_unlock(&vcc_sklist_lock); } static void *vcc_seq_next(struct seq_file *seq, void *v, loff_t *pos) { v = vcc_walk(seq, 1); (*pos)++; return v; } static void pvc_info(struct seq_file *seq, struct atm_vcc *vcc) { static const char *const class_name[] = { "off", "UBR", "CBR", "VBR", "ABR"}; static const char *const aal_name[] = { "---", "1", "2", "3/4", /* 0- 3 */ "???", "5", "???", "???", /* 4- 7 */ "???", "???", "???", "???", /* 8-11 */ "???", "0", "???", "???"}; /* 12-15 */ seq_printf(seq, "%3d %3d %5d %-3s %7d %-5s %7d %-6s", vcc->dev->number, vcc->vpi, vcc->vci, vcc->qos.aal >= ARRAY_SIZE(aal_name) ? "err" : aal_name[vcc->qos.aal], vcc->qos.rxtp.min_pcr, class_name[vcc->qos.rxtp.traffic_class], vcc->qos.txtp.min_pcr, class_name[vcc->qos.txtp.traffic_class]); if (test_bit(ATM_VF_IS_CLIP, &vcc->flags)) { struct clip_vcc *clip_vcc = CLIP_VCC(vcc); struct net_device *dev; dev = clip_vcc->entry ? clip_vcc->entry->neigh->dev : NULL; seq_printf(seq, "CLIP, Itf:%s, Encap:", dev ? dev->name : "none?"); seq_printf(seq, "%s", clip_vcc->encap ? "LLC/SNAP" : "None"); } seq_putc(seq, '\n'); } static const char *vcc_state(struct atm_vcc *vcc) { static const char *const map[] = { ATM_VS2TXT_MAP }; return map[ATM_VF2VS(vcc->flags)]; } static void vcc_info(struct seq_file *seq, struct atm_vcc *vcc) { struct sock *sk = sk_atm(vcc); seq_printf(seq, "%pK ", vcc); if (!vcc->dev) seq_printf(seq, "Unassigned "); else seq_printf(seq, "%3d %3d %5d ", vcc->dev->number, vcc->vpi, vcc->vci); switch (sk->sk_family) { case AF_ATMPVC: seq_printf(seq, "PVC"); break; case AF_ATMSVC: seq_printf(seq, "SVC"); break; default: seq_printf(seq, "%3d", sk->sk_family); } seq_printf(seq, " %04lx %5d %7d/%7d %7d/%7d [%d]\n", vcc->flags, sk->sk_err, sk_wmem_alloc_get(sk), sk->sk_sndbuf, sk_rmem_alloc_get(sk), sk->sk_rcvbuf, refcount_read(&sk->sk_refcnt)); } static void svc_info(struct seq_file *seq, struct atm_vcc *vcc) { if (!vcc->dev) seq_printf(seq, sizeof(void *) == 4 ? "N/A@%pK%10s" : "N/A@%pK%2s", vcc, ""); else seq_printf(seq, "%3d %3d %5d ", vcc->dev->number, vcc->vpi, vcc->vci); seq_printf(seq, "%-10s ", vcc_state(vcc)); seq_printf(seq, "%s%s", vcc->remote.sas_addr.pub, *vcc->remote.sas_addr.pub && *vcc->remote.sas_addr.prv ? 
"+" : ""); if (*vcc->remote.sas_addr.prv) { int i; for (i = 0; i < ATM_ESA_LEN; i++) seq_printf(seq, "%02x", vcc->remote.sas_addr.prv[i]); } seq_putc(seq, '\n'); } static int atm_dev_seq_show(struct seq_file *seq, void *v) { static char atm_dev_banner[] = "Itf Type ESI/\"MAC\"addr " "AAL(TX,err,RX,err,drop) ... [refcnt]\n"; if (v == &atm_devs) seq_puts(seq, atm_dev_banner); else { struct atm_dev *dev = list_entry(v, struct atm_dev, dev_list); atm_dev_info(seq, dev); } return 0; } static const struct seq_operations atm_dev_seq_ops = { .start = atm_dev_seq_start, .next = atm_dev_seq_next, .stop = atm_dev_seq_stop, .show = atm_dev_seq_show, }; static int pvc_seq_show(struct seq_file *seq, void *v) { static char atm_pvc_banner[] = "Itf VPI VCI AAL RX(PCR,Class) TX(PCR,Class)\n"; if (v == SEQ_START_TOKEN) seq_puts(seq, atm_pvc_banner); else { struct vcc_state *state = seq->private; struct atm_vcc *vcc = atm_sk(state->sk); pvc_info(seq, vcc); } return 0; } static const struct seq_operations pvc_seq_ops = { .start = vcc_seq_start, .next = vcc_seq_next, .stop = vcc_seq_stop, .show = pvc_seq_show, }; static int vcc_seq_show(struct seq_file *seq, void *v) { if (v == SEQ_START_TOKEN) { seq_printf(seq, sizeof(void *) == 4 ? "%-8s%s" : "%-16s%s", "Address ", "Itf VPI VCI Fam Flags Reply " "Send buffer Recv buffer [refcnt]\n"); } else { struct vcc_state *state = seq->private; struct atm_vcc *vcc = atm_sk(state->sk); vcc_info(seq, vcc); } return 0; } static const struct seq_operations vcc_seq_ops = { .start = vcc_seq_start, .next = vcc_seq_next, .stop = vcc_seq_stop, .show = vcc_seq_show, }; static int svc_seq_show(struct seq_file *seq, void *v) { static const char atm_svc_banner[] = "Itf VPI VCI State Remote\n"; if (v == SEQ_START_TOKEN) seq_puts(seq, atm_svc_banner); else { struct vcc_state *state = seq->private; struct atm_vcc *vcc = atm_sk(state->sk); svc_info(seq, vcc); } return 0; } static const struct seq_operations svc_seq_ops = { .start = vcc_seq_start, .next = vcc_seq_next, .stop = vcc_seq_stop, .show = svc_seq_show, }; static ssize_t proc_dev_atm_read(struct file *file, char __user *buf, size_t count, loff_t *pos) { struct atm_dev *dev; unsigned long page; int length; if (count == 0) return 0; page = get_zeroed_page(GFP_KERNEL); if (!page) return -ENOMEM; dev = pde_data(file_inode(file)); if (!dev->ops->proc_read) length = -EINVAL; else { length = dev->ops->proc_read(dev, pos, (char *)page); if (length > count) length = -EINVAL; } if (length >= 0) { if (copy_to_user(buf, (char *)page, length)) length = -EFAULT; (*pos)++; } free_page(page); return length; } struct proc_dir_entry *atm_proc_root; EXPORT_SYMBOL(atm_proc_root); int atm_proc_dev_register(struct atm_dev *dev) { int error; /* No proc info */ if (!dev->ops->proc_read) return 0; error = -ENOMEM; dev->proc_name = kasprintf(GFP_KERNEL, "%s:%d", dev->type, dev->number); if (!dev->proc_name) goto err_out; dev->proc_entry = proc_create_data(dev->proc_name, 0, atm_proc_root, &atm_dev_proc_ops, dev); if (!dev->proc_entry) goto err_free_name; return 0; err_free_name: kfree(dev->proc_name); err_out: return error; } void atm_proc_dev_deregister(struct atm_dev *dev) { if (!dev->ops->proc_read) return; remove_proc_entry(dev->proc_name, atm_proc_root); kfree(dev->proc_name); } int __init atm_proc_init(void) { atm_proc_root = proc_net_mkdir(&init_net, "atm", init_net.proc_net); if (!atm_proc_root) return -ENOMEM; proc_create_seq("devices", 0444, atm_proc_root, &atm_dev_seq_ops); proc_create_seq_private("pvc", 0444, atm_proc_root, &pvc_seq_ops, 
				 sizeof(struct vcc_state),
				 (void *)(uintptr_t)PF_ATMPVC);
	proc_create_seq_private("svc", 0444, atm_proc_root, &svc_seq_ops,
				 sizeof(struct vcc_state),
				 (void *)(uintptr_t)PF_ATMSVC);
	proc_create_seq_private("vc", 0444, atm_proc_root, &vcc_seq_ops,
				 sizeof(struct vcc_state), NULL);
	return 0;
}

void atm_proc_exit(void)
{
	remove_proc_subtree("atm", init_net.proc_net);
}
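/*
 * Illustrative sketch (not part of proc.c above): the shape of a driver's
 * ->proc_read() callback that atm_proc_dev_register() wires up under
 * /proc/net/atm/<type>:<number>. The function name demo_proc_read() and the
 * text it emits are hypothetical. proc_dev_atm_read() above hands the
 * callback a zeroed page and one position per call, so a driver typically
 * writes one chunk per *pos value, returns the number of bytes written, and
 * returns 0 to signal end of output.
 */
static int demo_proc_read(struct atm_dev *dev, loff_t *pos, char *page)
{
	if (*pos)
		return 0;	/* single-shot output: EOF on the second call */

	return sprintf(page, "%s unit %d: demo statistics\n",
		       dev->type, dev->number);
}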
// SPDX-License-Identifier: GPL-2.0-only
/*
 * Packet matching code.
 *
 * Copyright (C) 1999 Paul `Rusty' Russell & Michael J. Neuling
 * Copyright (C) 2000-2005 Netfilter Core Team <coreteam@netfilter.org>
 * Copyright (c) 2006-2010 Patrick McHardy <kaber@trash.net>
 */

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/kernel.h>
#include <linux/capability.h>
#include <linux/in.h>
#include <linux/skbuff.h>
#include <linux/kmod.h>
#include <linux/vmalloc.h>
#include <linux/netdevice.h>
#include <linux/module.h>
#include <linux/poison.h>
#include <net/ipv6.h>
#include <net/compat.h>
#include <linux/uaccess.h>
#include <linux/mutex.h>
#include <linux/proc_fs.h>
#include <linux/err.h>
#include <linux/cpumask.h>

#include <linux/netfilter_ipv6/ip6_tables.h>
#include <linux/netfilter/x_tables.h>
#include <net/netfilter/nf_log.h>
#include "../../netfilter/xt_repldata.h"

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Netfilter Core Team <coreteam@netfilter.org>");
MODULE_DESCRIPTION("IPv6 packet filter");

void *ip6t_alloc_initial_table(const struct xt_table *info)
{
	return xt_alloc_initial_table(ip6t, IP6T);
}
EXPORT_SYMBOL_GPL(ip6t_alloc_initial_table);

/* Returns whether matches rule or not.
*/ /* Performance critical - called for every packet */ static inline bool ip6_packet_match(const struct sk_buff *skb, const char *indev, const char *outdev, const struct ip6t_ip6 *ip6info, unsigned int *protoff, u16 *fragoff, bool *hotdrop) { unsigned long ret; const struct ipv6hdr *ipv6 = ipv6_hdr(skb); if (NF_INVF(ip6info, IP6T_INV_SRCIP, ipv6_masked_addr_cmp(&ipv6->saddr, &ip6info->smsk, &ip6info->src)) || NF_INVF(ip6info, IP6T_INV_DSTIP, ipv6_masked_addr_cmp(&ipv6->daddr, &ip6info->dmsk, &ip6info->dst))) return false; ret = ifname_compare_aligned(indev, ip6info->iniface, ip6info->iniface_mask); if (NF_INVF(ip6info, IP6T_INV_VIA_IN, ret != 0)) return false; ret = ifname_compare_aligned(outdev, ip6info->outiface, ip6info->outiface_mask); if (NF_INVF(ip6info, IP6T_INV_VIA_OUT, ret != 0)) return false; /* ... might want to do something with class and flowlabel here ... */ /* look for the desired protocol header */ if (ip6info->flags & IP6T_F_PROTO) { int protohdr; unsigned short _frag_off; protohdr = ipv6_find_hdr(skb, protoff, -1, &_frag_off, NULL); if (protohdr < 0) { if (_frag_off == 0) *hotdrop = true; return false; } *fragoff = _frag_off; if (ip6info->proto == protohdr) { if (ip6info->invflags & IP6T_INV_PROTO) return false; return true; } /* We need match for the '-p all', too! */ if ((ip6info->proto != 0) && !(ip6info->invflags & IP6T_INV_PROTO)) return false; } return true; } /* should be ip6 safe */ static bool ip6_checkentry(const struct ip6t_ip6 *ipv6) { if (ipv6->flags & ~IP6T_F_MASK) return false; if (ipv6->invflags & ~IP6T_INV_MASK) return false; return true; } static unsigned int ip6t_error(struct sk_buff *skb, const struct xt_action_param *par) { net_info_ratelimited("error: `%s'\n", (const char *)par->targinfo); return NF_DROP; } static inline struct ip6t_entry * get_entry(const void *base, unsigned int offset) { return (struct ip6t_entry *)(base + offset); } /* All zeroes == unconditional rule. */ /* Mildly perf critical (only if packet tracing is on) */ static inline bool unconditional(const struct ip6t_entry *e) { static const struct ip6t_ip6 uncond; return e->target_offset == sizeof(struct ip6t_entry) && memcmp(&e->ipv6, &uncond, sizeof(uncond)) == 0; } static inline const struct xt_entry_target * ip6t_get_target_c(const struct ip6t_entry *e) { return ip6t_get_target((struct ip6t_entry *)e); } #if IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TRACE) /* This cries for unification! 
*/ static const char *const hooknames[] = { [NF_INET_PRE_ROUTING] = "PREROUTING", [NF_INET_LOCAL_IN] = "INPUT", [NF_INET_FORWARD] = "FORWARD", [NF_INET_LOCAL_OUT] = "OUTPUT", [NF_INET_POST_ROUTING] = "POSTROUTING", }; enum nf_ip_trace_comments { NF_IP6_TRACE_COMMENT_RULE, NF_IP6_TRACE_COMMENT_RETURN, NF_IP6_TRACE_COMMENT_POLICY, }; static const char *const comments[] = { [NF_IP6_TRACE_COMMENT_RULE] = "rule", [NF_IP6_TRACE_COMMENT_RETURN] = "return", [NF_IP6_TRACE_COMMENT_POLICY] = "policy", }; static const struct nf_loginfo trace_loginfo = { .type = NF_LOG_TYPE_LOG, .u = { .log = { .level = LOGLEVEL_WARNING, .logflags = NF_LOG_DEFAULT_MASK, }, }, }; /* Mildly perf critical (only if packet tracing is on) */ static inline int get_chainname_rulenum(const struct ip6t_entry *s, const struct ip6t_entry *e, const char *hookname, const char **chainname, const char **comment, unsigned int *rulenum) { const struct xt_standard_target *t = (void *)ip6t_get_target_c(s); if (strcmp(t->target.u.kernel.target->name, XT_ERROR_TARGET) == 0) { /* Head of user chain: ERROR target with chainname */ *chainname = t->target.data; (*rulenum) = 0; } else if (s == e) { (*rulenum)++; if (unconditional(s) && strcmp(t->target.u.kernel.target->name, XT_STANDARD_TARGET) == 0 && t->verdict < 0) { /* Tail of chains: STANDARD target (return/policy) */ *comment = *chainname == hookname ? comments[NF_IP6_TRACE_COMMENT_POLICY] : comments[NF_IP6_TRACE_COMMENT_RETURN]; } return 1; } else (*rulenum)++; return 0; } static void trace_packet(struct net *net, const struct sk_buff *skb, unsigned int hook, const struct net_device *in, const struct net_device *out, const char *tablename, const struct xt_table_info *private, const struct ip6t_entry *e) { const struct ip6t_entry *root; const char *hookname, *chainname, *comment; const struct ip6t_entry *iter; unsigned int rulenum = 0; root = get_entry(private->entries, private->hook_entry[hook]); hookname = chainname = hooknames[hook]; comment = comments[NF_IP6_TRACE_COMMENT_RULE]; xt_entry_foreach(iter, root, private->size - private->hook_entry[hook]) if (get_chainname_rulenum(iter, e, hookname, &chainname, &comment, &rulenum) != 0) break; nf_log_trace(net, AF_INET6, hook, skb, in, out, &trace_loginfo, "TRACE: %s:%s:%s:%u ", tablename, chainname, comment, rulenum); } #endif static inline struct ip6t_entry * ip6t_next_entry(const struct ip6t_entry *entry) { return (void *)entry + entry->next_offset; } /* Returns one of the generic firewall policies, like NF_ACCEPT. */ unsigned int ip6t_do_table(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) { const struct xt_table *table = priv; unsigned int hook = state->hook; static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long)))); /* Initializing verdict to NF_DROP keeps gcc happy. */ unsigned int verdict = NF_DROP; const char *indev, *outdev; const void *table_base; struct ip6t_entry *e, **jumpstack; unsigned int stackidx, cpu; const struct xt_table_info *private; struct xt_action_param acpar; unsigned int addend; /* Initialization */ stackidx = 0; indev = state->in ? state->in->name : nulldevname; outdev = state->out ? state->out->name : nulldevname; /* We handle fragments by dealing with the first fragment as * if it was a normal packet. All other fragments are treated * normally, except that they will NEVER match rules that ask * things we don't know, ie. tcp syn flag or ports). If the * rule is also a fragment-specific rule, non-fragments won't * match it. 
*/ acpar.fragoff = 0; acpar.hotdrop = false; acpar.state = state; WARN_ON(!(table->valid_hooks & (1 << hook))); local_bh_disable(); addend = xt_write_recseq_begin(); private = READ_ONCE(table->private); /* Address dependency. */ cpu = smp_processor_id(); table_base = private->entries; jumpstack = (struct ip6t_entry **)private->jumpstack[cpu]; /* Switch to alternate jumpstack if we're being invoked via TEE. * TEE issues XT_CONTINUE verdict on original skb so we must not * clobber the jumpstack. * * For recursion via REJECT or SYNPROXY the stack will be clobbered * but it is no problem since absolute verdict is issued by these. */ if (static_key_false(&xt_tee_enabled)) jumpstack += private->stacksize * current->in_nf_duplicate; e = get_entry(table_base, private->hook_entry[hook]); do { const struct xt_entry_target *t; const struct xt_entry_match *ematch; struct xt_counters *counter; WARN_ON(!e); acpar.thoff = 0; if (!ip6_packet_match(skb, indev, outdev, &e->ipv6, &acpar.thoff, &acpar.fragoff, &acpar.hotdrop)) { no_match: e = ip6t_next_entry(e); continue; } xt_ematch_foreach(ematch, e) { acpar.match = ematch->u.kernel.match; acpar.matchinfo = ematch->data; if (!acpar.match->match(skb, &acpar)) goto no_match; } counter = xt_get_this_cpu_counter(&e->counters); ADD_COUNTER(*counter, skb->len, 1); t = ip6t_get_target_c(e); WARN_ON(!t->u.kernel.target); #if IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TRACE) /* The packet is traced: log it */ if (unlikely(skb->nf_trace)) trace_packet(state->net, skb, hook, state->in, state->out, table->name, private, e); #endif /* Standard target? */ if (!t->u.kernel.target->target) { int v; v = ((struct xt_standard_target *)t)->verdict; if (v < 0) { /* Pop from stack? */ if (v != XT_RETURN) { verdict = (unsigned int)(-v) - 1; break; } if (stackidx == 0) e = get_entry(table_base, private->underflow[hook]); else e = ip6t_next_entry(jumpstack[--stackidx]); continue; } if (table_base + v != ip6t_next_entry(e) && !(e->ipv6.flags & IP6T_F_GOTO)) { if (unlikely(stackidx >= private->stacksize)) { verdict = NF_DROP; break; } jumpstack[stackidx++] = e; } e = get_entry(table_base, v); continue; } acpar.target = t->u.kernel.target; acpar.targinfo = t->data; verdict = t->u.kernel.target->target(skb, &acpar); if (verdict == XT_CONTINUE) e = ip6t_next_entry(e); else /* Verdict */ break; } while (!acpar.hotdrop); xt_write_recseq_end(addend); local_bh_enable(); if (acpar.hotdrop) return NF_DROP; else return verdict; } /* Figures out from what hook each rule can be called: returns 0 if there are loops. Puts hook bitmask in comefrom. */ static int mark_source_chains(const struct xt_table_info *newinfo, unsigned int valid_hooks, void *entry0, unsigned int *offsets) { unsigned int hook; /* No recursion; use packet counter to save back ptrs (reset to 0 as we leave), and comefrom to save source hook bitmask */ for (hook = 0; hook < NF_INET_NUMHOOKS; hook++) { unsigned int pos = newinfo->hook_entry[hook]; struct ip6t_entry *e = entry0 + pos; if (!(valid_hooks & (1 << hook))) continue; /* Set initial back pointer. */ e->counters.pcnt = pos; for (;;) { const struct xt_standard_target *t = (void *)ip6t_get_target_c(e); int visited = e->comefrom & (1 << hook); if (e->comefrom & (1 << NF_INET_NUMHOOKS)) return 0; e->comefrom |= ((1 << hook) | (1 << NF_INET_NUMHOOKS)); /* Unconditional return/END. */ if ((unconditional(e) && (strcmp(t->target.u.user.name, XT_STANDARD_TARGET) == 0) && t->verdict < 0) || visited) { unsigned int oldpos, size; /* Return: backtrack through the last big jump. 
*/ do { e->comefrom ^= (1<<NF_INET_NUMHOOKS); oldpos = pos; pos = e->counters.pcnt; e->counters.pcnt = 0; /* We're at the start. */ if (pos == oldpos) goto next; e = entry0 + pos; } while (oldpos == pos + e->next_offset); /* Move along one */ size = e->next_offset; e = entry0 + pos + size; if (pos + size >= newinfo->size) return 0; e->counters.pcnt = pos; pos += size; } else { int newpos = t->verdict; if (strcmp(t->target.u.user.name, XT_STANDARD_TARGET) == 0 && newpos >= 0) { /* This a jump; chase it. */ if (!xt_find_jump_offset(offsets, newpos, newinfo->number)) return 0; } else { /* ... this is a fallthru */ newpos = pos + e->next_offset; if (newpos >= newinfo->size) return 0; } e = entry0 + newpos; e->counters.pcnt = pos; pos = newpos; } } next: ; } return 1; } static void cleanup_match(struct xt_entry_match *m, struct net *net) { struct xt_mtdtor_param par; par.net = net; par.match = m->u.kernel.match; par.matchinfo = m->data; par.family = NFPROTO_IPV6; if (par.match->destroy != NULL) par.match->destroy(&par); module_put(par.match->me); } static int check_match(struct xt_entry_match *m, struct xt_mtchk_param *par) { const struct ip6t_ip6 *ipv6 = par->entryinfo; par->match = m->u.kernel.match; par->matchinfo = m->data; return xt_check_match(par, m->u.match_size - sizeof(*m), ipv6->proto, ipv6->invflags & IP6T_INV_PROTO); } static int find_check_match(struct xt_entry_match *m, struct xt_mtchk_param *par) { struct xt_match *match; int ret; match = xt_request_find_match(NFPROTO_IPV6, m->u.user.name, m->u.user.revision); if (IS_ERR(match)) return PTR_ERR(match); m->u.kernel.match = match; ret = check_match(m, par); if (ret) goto err; return 0; err: module_put(m->u.kernel.match->me); return ret; } static int check_target(struct ip6t_entry *e, struct net *net, const char *name) { struct xt_entry_target *t = ip6t_get_target(e); struct xt_tgchk_param par = { .net = net, .table = name, .entryinfo = e, .target = t->u.kernel.target, .targinfo = t->data, .hook_mask = e->comefrom, .family = NFPROTO_IPV6, }; return xt_check_target(&par, t->u.target_size - sizeof(*t), e->ipv6.proto, e->ipv6.invflags & IP6T_INV_PROTO); } static int find_check_entry(struct ip6t_entry *e, struct net *net, const char *name, unsigned int size, struct xt_percpu_counter_alloc_state *alloc_state) { struct xt_entry_target *t; struct xt_target *target; int ret; unsigned int j; struct xt_mtchk_param mtpar; struct xt_entry_match *ematch; if (!xt_percpu_counter_alloc(alloc_state, &e->counters)) return -ENOMEM; j = 0; memset(&mtpar, 0, sizeof(mtpar)); mtpar.net = net; mtpar.table = name; mtpar.entryinfo = &e->ipv6; mtpar.hook_mask = e->comefrom; mtpar.family = NFPROTO_IPV6; xt_ematch_foreach(ematch, e) { ret = find_check_match(ematch, &mtpar); if (ret != 0) goto cleanup_matches; ++j; } t = ip6t_get_target(e); target = xt_request_find_target(NFPROTO_IPV6, t->u.user.name, t->u.user.revision); if (IS_ERR(target)) { ret = PTR_ERR(target); goto cleanup_matches; } t->u.kernel.target = target; ret = check_target(e, net, name); if (ret) goto err; return 0; err: module_put(t->u.kernel.target->me); cleanup_matches: xt_ematch_foreach(ematch, e) { if (j-- == 0) break; cleanup_match(ematch, net); } xt_percpu_counter_free(&e->counters); return ret; } static bool check_underflow(const struct ip6t_entry *e) { const struct xt_entry_target *t; unsigned int verdict; if (!unconditional(e)) return false; t = ip6t_get_target_c(e); if (strcmp(t->u.user.name, XT_STANDARD_TARGET) != 0) return false; verdict = ((struct xt_standard_target *)t)->verdict; 
verdict = -verdict - 1; return verdict == NF_DROP || verdict == NF_ACCEPT; } static int check_entry_size_and_hooks(struct ip6t_entry *e, struct xt_table_info *newinfo, const unsigned char *base, const unsigned char *limit, const unsigned int *hook_entries, const unsigned int *underflows, unsigned int valid_hooks) { unsigned int h; int err; if ((unsigned long)e % __alignof__(struct ip6t_entry) != 0 || (unsigned char *)e + sizeof(struct ip6t_entry) >= limit || (unsigned char *)e + e->next_offset > limit) return -EINVAL; if (e->next_offset < sizeof(struct ip6t_entry) + sizeof(struct xt_entry_target)) return -EINVAL; if (!ip6_checkentry(&e->ipv6)) return -EINVAL; err = xt_check_entry_offsets(e, e->elems, e->target_offset, e->next_offset); if (err) return err; /* Check hooks & underflows */ for (h = 0; h < NF_INET_NUMHOOKS; h++) { if (!(valid_hooks & (1 << h))) continue; if ((unsigned char *)e - base == hook_entries[h]) newinfo->hook_entry[h] = hook_entries[h]; if ((unsigned char *)e - base == underflows[h]) { if (!check_underflow(e)) return -EINVAL; newinfo->underflow[h] = underflows[h]; } } /* Clear counters and comefrom */ e->counters = ((struct xt_counters) { 0, 0 }); e->comefrom = 0; return 0; } static void cleanup_entry(struct ip6t_entry *e, struct net *net) { struct xt_tgdtor_param par; struct xt_entry_target *t; struct xt_entry_match *ematch; /* Cleanup all matches */ xt_ematch_foreach(ematch, e) cleanup_match(ematch, net); t = ip6t_get_target(e); par.net = net; par.target = t->u.kernel.target; par.targinfo = t->data; par.family = NFPROTO_IPV6; if (par.target->destroy != NULL) par.target->destroy(&par); module_put(par.target->me); xt_percpu_counter_free(&e->counters); } /* Checks and translates the user-supplied table segment (held in newinfo) */ static int translate_table(struct net *net, struct xt_table_info *newinfo, void *entry0, const struct ip6t_replace *repl) { struct xt_percpu_counter_alloc_state alloc_state = { 0 }; struct ip6t_entry *iter; unsigned int *offsets; unsigned int i; int ret = 0; newinfo->size = repl->size; newinfo->number = repl->num_entries; /* Init all hooks to impossible value. */ for (i = 0; i < NF_INET_NUMHOOKS; i++) { newinfo->hook_entry[i] = 0xFFFFFFFF; newinfo->underflow[i] = 0xFFFFFFFF; } offsets = xt_alloc_entry_offsets(newinfo->number); if (!offsets) return -ENOMEM; i = 0; /* Walk through entries, checking offsets. 
*/ xt_entry_foreach(iter, entry0, newinfo->size) { ret = check_entry_size_and_hooks(iter, newinfo, entry0, entry0 + repl->size, repl->hook_entry, repl->underflow, repl->valid_hooks); if (ret != 0) goto out_free; if (i < repl->num_entries) offsets[i] = (void *)iter - entry0; ++i; if (strcmp(ip6t_get_target(iter)->u.user.name, XT_ERROR_TARGET) == 0) ++newinfo->stacksize; } ret = -EINVAL; if (i != repl->num_entries) goto out_free; ret = xt_check_table_hooks(newinfo, repl->valid_hooks); if (ret) goto out_free; if (!mark_source_chains(newinfo, repl->valid_hooks, entry0, offsets)) { ret = -ELOOP; goto out_free; } kvfree(offsets); /* Finally, each sanity check must pass */ i = 0; xt_entry_foreach(iter, entry0, newinfo->size) { ret = find_check_entry(iter, net, repl->name, repl->size, &alloc_state); if (ret != 0) break; ++i; } if (ret != 0) { xt_entry_foreach(iter, entry0, newinfo->size) { if (i-- == 0) break; cleanup_entry(iter, net); } return ret; } return ret; out_free: kvfree(offsets); return ret; } static void get_counters(const struct xt_table_info *t, struct xt_counters counters[]) { struct ip6t_entry *iter; unsigned int cpu; unsigned int i; for_each_possible_cpu(cpu) { seqcount_t *s = &per_cpu(xt_recseq, cpu); i = 0; xt_entry_foreach(iter, t->entries, t->size) { struct xt_counters *tmp; u64 bcnt, pcnt; unsigned int start; tmp = xt_get_per_cpu_counter(&iter->counters, cpu); do { start = read_seqcount_begin(s); bcnt = tmp->bcnt; pcnt = tmp->pcnt; } while (read_seqcount_retry(s, start)); ADD_COUNTER(counters[i], bcnt, pcnt); ++i; cond_resched(); } } } static void get_old_counters(const struct xt_table_info *t, struct xt_counters counters[]) { struct ip6t_entry *iter; unsigned int cpu, i; for_each_possible_cpu(cpu) { i = 0; xt_entry_foreach(iter, t->entries, t->size) { const struct xt_counters *tmp; tmp = xt_get_per_cpu_counter(&iter->counters, cpu); ADD_COUNTER(counters[i], tmp->bcnt, tmp->pcnt); ++i; } cond_resched(); } } static struct xt_counters *alloc_counters(const struct xt_table *table) { unsigned int countersize; struct xt_counters *counters; const struct xt_table_info *private = table->private; /* We need atomic snapshot of counters: rest doesn't change (other than comefrom, which userspace doesn't care about). */ countersize = sizeof(struct xt_counters) * private->number; counters = vzalloc(countersize); if (counters == NULL) return ERR_PTR(-ENOMEM); get_counters(private, counters); return counters; } static int copy_entries_to_user(unsigned int total_size, const struct xt_table *table, void __user *userptr) { unsigned int off, num; const struct ip6t_entry *e; struct xt_counters *counters; const struct xt_table_info *private = table->private; int ret = 0; const void *loc_cpu_entry; counters = alloc_counters(table); if (IS_ERR(counters)) return PTR_ERR(counters); loc_cpu_entry = private->entries; /* FIXME: use iterator macros --RR */ /* ... 
then go back and fix counters and names */ for (off = 0, num = 0; off < total_size; off += e->next_offset, num++){ unsigned int i; const struct xt_entry_match *m; const struct xt_entry_target *t; e = loc_cpu_entry + off; if (copy_to_user(userptr + off, e, sizeof(*e))) { ret = -EFAULT; goto free_counters; } if (copy_to_user(userptr + off + offsetof(struct ip6t_entry, counters), &counters[num], sizeof(counters[num])) != 0) { ret = -EFAULT; goto free_counters; } for (i = sizeof(struct ip6t_entry); i < e->target_offset; i += m->u.match_size) { m = (void *)e + i; if (xt_match_to_user(m, userptr + off + i)) { ret = -EFAULT; goto free_counters; } } t = ip6t_get_target_c(e); if (xt_target_to_user(t, userptr + off + e->target_offset)) { ret = -EFAULT; goto free_counters; } } free_counters: vfree(counters); return ret; } #ifdef CONFIG_NETFILTER_XTABLES_COMPAT static void compat_standard_from_user(void *dst, const void *src) { int v = *(compat_int_t *)src; if (v > 0) v += xt_compat_calc_jump(AF_INET6, v); memcpy(dst, &v, sizeof(v)); } static int compat_standard_to_user(void __user *dst, const void *src) { compat_int_t cv = *(int *)src; if (cv > 0) cv -= xt_compat_calc_jump(AF_INET6, cv); return copy_to_user(dst, &cv, sizeof(cv)) ? -EFAULT : 0; } static int compat_calc_entry(const struct ip6t_entry *e, const struct xt_table_info *info, const void *base, struct xt_table_info *newinfo) { const struct xt_entry_match *ematch; const struct xt_entry_target *t; unsigned int entry_offset; int off, i, ret; off = sizeof(struct ip6t_entry) - sizeof(struct compat_ip6t_entry); entry_offset = (void *)e - base; xt_ematch_foreach(ematch, e) off += xt_compat_match_offset(ematch->u.kernel.match); t = ip6t_get_target_c(e); off += xt_compat_target_offset(t->u.kernel.target); newinfo->size -= off; ret = xt_compat_add_offset(AF_INET6, entry_offset, off); if (ret) return ret; for (i = 0; i < NF_INET_NUMHOOKS; i++) { if (info->hook_entry[i] && (e < (struct ip6t_entry *)(base + info->hook_entry[i]))) newinfo->hook_entry[i] -= off; if (info->underflow[i] && (e < (struct ip6t_entry *)(base + info->underflow[i]))) newinfo->underflow[i] -= off; } return 0; } static int compat_table_info(const struct xt_table_info *info, struct xt_table_info *newinfo) { struct ip6t_entry *iter; const void *loc_cpu_entry; int ret; if (!newinfo || !info) return -EINVAL; /* we dont care about newinfo->entries */ memcpy(newinfo, info, offsetof(struct xt_table_info, entries)); newinfo->initial_entries = 0; loc_cpu_entry = info->entries; ret = xt_compat_init_offsets(AF_INET6, info->number); if (ret) return ret; xt_entry_foreach(iter, loc_cpu_entry, info->size) { ret = compat_calc_entry(iter, info, loc_cpu_entry, newinfo); if (ret != 0) return ret; } return 0; } #endif static int get_info(struct net *net, void __user *user, const int *len) { char name[XT_TABLE_MAXNAMELEN]; struct xt_table *t; int ret; if (*len != sizeof(struct ip6t_getinfo)) return -EINVAL; if (copy_from_user(name, user, sizeof(name)) != 0) return -EFAULT; name[XT_TABLE_MAXNAMELEN-1] = '\0'; #ifdef CONFIG_NETFILTER_XTABLES_COMPAT if (in_compat_syscall()) xt_compat_lock(AF_INET6); #endif t = xt_request_find_table_lock(net, AF_INET6, name); if (!IS_ERR(t)) { struct ip6t_getinfo info; const struct xt_table_info *private = t->private; #ifdef CONFIG_NETFILTER_XTABLES_COMPAT struct xt_table_info tmp; if (in_compat_syscall()) { ret = compat_table_info(private, &tmp); xt_compat_flush_offsets(AF_INET6); private = &tmp; } #endif memset(&info, 0, sizeof(info)); info.valid_hooks = t->valid_hooks; 
memcpy(info.hook_entry, private->hook_entry, sizeof(info.hook_entry)); memcpy(info.underflow, private->underflow, sizeof(info.underflow)); info.num_entries = private->number; info.size = private->size; strcpy(info.name, name); if (copy_to_user(user, &info, *len) != 0) ret = -EFAULT; else ret = 0; xt_table_unlock(t); module_put(t->me); } else ret = PTR_ERR(t); #ifdef CONFIG_NETFILTER_XTABLES_COMPAT if (in_compat_syscall()) xt_compat_unlock(AF_INET6); #endif return ret; } static int get_entries(struct net *net, struct ip6t_get_entries __user *uptr, const int *len) { int ret; struct ip6t_get_entries get; struct xt_table *t; if (*len < sizeof(get)) return -EINVAL; if (copy_from_user(&get, uptr, sizeof(get)) != 0) return -EFAULT; if (*len != sizeof(struct ip6t_get_entries) + get.size) return -EINVAL; get.name[sizeof(get.name) - 1] = '\0'; t = xt_find_table_lock(net, AF_INET6, get.name); if (!IS_ERR(t)) { struct xt_table_info *private = t->private; if (get.size == private->size) ret = copy_entries_to_user(private->size, t, uptr->entrytable); else ret = -EAGAIN; module_put(t->me); xt_table_unlock(t); } else ret = PTR_ERR(t); return ret; } static int __do_replace(struct net *net, const char *name, unsigned int valid_hooks, struct xt_table_info *newinfo, unsigned int num_counters, void __user *counters_ptr) { int ret; struct xt_table *t; struct xt_table_info *oldinfo; struct xt_counters *counters; struct ip6t_entry *iter; counters = xt_counters_alloc(num_counters); if (!counters) { ret = -ENOMEM; goto out; } t = xt_request_find_table_lock(net, AF_INET6, name); if (IS_ERR(t)) { ret = PTR_ERR(t); goto free_newinfo_counters_untrans; } /* You lied! */ if (valid_hooks != t->valid_hooks) { ret = -EINVAL; goto put_module; } oldinfo = xt_replace_table(t, num_counters, newinfo, &ret); if (!oldinfo) goto put_module; /* Update module usage count based on number of rules */ if ((oldinfo->number > oldinfo->initial_entries) || (newinfo->number <= oldinfo->initial_entries)) module_put(t->me); if ((oldinfo->number > oldinfo->initial_entries) && (newinfo->number <= oldinfo->initial_entries)) module_put(t->me); xt_table_unlock(t); get_old_counters(oldinfo, counters); /* Decrease module usage counts and free resource */ xt_entry_foreach(iter, oldinfo->entries, oldinfo->size) cleanup_entry(iter, net); xt_free_table_info(oldinfo); if (copy_to_user(counters_ptr, counters, sizeof(struct xt_counters) * num_counters) != 0) { /* Silent error, can't fail, new table is already in place */ net_warn_ratelimited("ip6tables: counters copy to user failed while replacing table\n"); } vfree(counters); return 0; put_module: module_put(t->me); xt_table_unlock(t); free_newinfo_counters_untrans: vfree(counters); out: return ret; } static int do_replace(struct net *net, sockptr_t arg, unsigned int len) { int ret; struct ip6t_replace tmp; struct xt_table_info *newinfo; void *loc_cpu_entry; struct ip6t_entry *iter; if (len < sizeof(tmp)) return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT; /* overflow check */ if (tmp.num_counters >= INT_MAX / sizeof(struct xt_counters)) return -ENOMEM; if (tmp.num_counters == 0) return -EINVAL; if ((u64)len < (u64)tmp.size + sizeof(tmp)) return -EINVAL; tmp.name[sizeof(tmp.name)-1] = 0; newinfo = xt_alloc_table_info(tmp.size); if (!newinfo) return -ENOMEM; loc_cpu_entry = newinfo->entries; if (copy_from_sockptr_offset(loc_cpu_entry, arg, sizeof(tmp), tmp.size) != 0) { ret = -EFAULT; goto free_newinfo; } ret = translate_table(net, newinfo, loc_cpu_entry, &tmp); if (ret != 0) 
goto free_newinfo; ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo, tmp.num_counters, tmp.counters); if (ret) goto free_newinfo_untrans; return 0; free_newinfo_untrans: xt_entry_foreach(iter, loc_cpu_entry, newinfo->size) cleanup_entry(iter, net); free_newinfo: xt_free_table_info(newinfo); return ret; } static int do_add_counters(struct net *net, sockptr_t arg, unsigned int len) { unsigned int i; struct xt_counters_info tmp; struct xt_counters *paddc; struct xt_table *t; const struct xt_table_info *private; int ret = 0; struct ip6t_entry *iter; unsigned int addend; paddc = xt_copy_counters(arg, len, &tmp); if (IS_ERR(paddc)) return PTR_ERR(paddc); t = xt_find_table_lock(net, AF_INET6, tmp.name); if (IS_ERR(t)) { ret = PTR_ERR(t); goto free; } local_bh_disable(); private = t->private; if (private->number != tmp.num_counters) { ret = -EINVAL; goto unlock_up_free; } i = 0; addend = xt_write_recseq_begin(); xt_entry_foreach(iter, private->entries, private->size) { struct xt_counters *tmp; tmp = xt_get_this_cpu_counter(&iter->counters); ADD_COUNTER(*tmp, paddc[i].bcnt, paddc[i].pcnt); ++i; } xt_write_recseq_end(addend); unlock_up_free: local_bh_enable(); xt_table_unlock(t); module_put(t->me); free: vfree(paddc); return ret; } #ifdef CONFIG_NETFILTER_XTABLES_COMPAT struct compat_ip6t_replace { char name[XT_TABLE_MAXNAMELEN]; u32 valid_hooks; u32 num_entries; u32 size; u32 hook_entry[NF_INET_NUMHOOKS]; u32 underflow[NF_INET_NUMHOOKS]; u32 num_counters; compat_uptr_t counters; /* struct xt_counters * */ struct compat_ip6t_entry entries[]; }; static int compat_copy_entry_to_user(struct ip6t_entry *e, void __user **dstptr, unsigned int *size, struct xt_counters *counters, unsigned int i) { struct xt_entry_target *t; struct compat_ip6t_entry __user *ce; u_int16_t target_offset, next_offset; compat_uint_t origsize; const struct xt_entry_match *ematch; int ret = 0; origsize = *size; ce = *dstptr; if (copy_to_user(ce, e, sizeof(struct ip6t_entry)) != 0 || copy_to_user(&ce->counters, &counters[i], sizeof(counters[i])) != 0) return -EFAULT; *dstptr += sizeof(struct compat_ip6t_entry); *size -= sizeof(struct ip6t_entry) - sizeof(struct compat_ip6t_entry); xt_ematch_foreach(ematch, e) { ret = xt_compat_match_to_user(ematch, dstptr, size); if (ret != 0) return ret; } target_offset = e->target_offset - (origsize - *size); t = ip6t_get_target(e); ret = xt_compat_target_to_user(t, dstptr, size); if (ret) return ret; next_offset = e->next_offset - (origsize - *size); if (put_user(target_offset, &ce->target_offset) != 0 || put_user(next_offset, &ce->next_offset) != 0) return -EFAULT; return 0; } static int compat_find_calc_match(struct xt_entry_match *m, const struct ip6t_ip6 *ipv6, int *size) { struct xt_match *match; match = xt_request_find_match(NFPROTO_IPV6, m->u.user.name, m->u.user.revision); if (IS_ERR(match)) return PTR_ERR(match); m->u.kernel.match = match; *size += xt_compat_match_offset(match); return 0; } static void compat_release_entry(struct compat_ip6t_entry *e) { struct xt_entry_target *t; struct xt_entry_match *ematch; /* Cleanup all matches */ xt_ematch_foreach(ematch, e) module_put(ematch->u.kernel.match->me); t = compat_ip6t_get_target(e); module_put(t->u.kernel.target->me); } static int check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, struct xt_table_info *newinfo, unsigned int *size, const unsigned char *base, const unsigned char *limit) { struct xt_entry_match *ematch; struct xt_entry_target *t; struct xt_target *target; unsigned int entry_offset; unsigned int j; 
int ret, off; if ((unsigned long)e % __alignof__(struct compat_ip6t_entry) != 0 || (unsigned char *)e + sizeof(struct compat_ip6t_entry) >= limit || (unsigned char *)e + e->next_offset > limit) return -EINVAL; if (e->next_offset < sizeof(struct compat_ip6t_entry) + sizeof(struct compat_xt_entry_target)) return -EINVAL; if (!ip6_checkentry(&e->ipv6)) return -EINVAL; ret = xt_compat_check_entry_offsets(e, e->elems, e->target_offset, e->next_offset); if (ret) return ret; off = sizeof(struct ip6t_entry) - sizeof(struct compat_ip6t_entry); entry_offset = (void *)e - (void *)base; j = 0; xt_ematch_foreach(ematch, e) { ret = compat_find_calc_match(ematch, &e->ipv6, &off); if (ret != 0) goto release_matches; ++j; } t = compat_ip6t_get_target(e); target = xt_request_find_target(NFPROTO_IPV6, t->u.user.name, t->u.user.revision); if (IS_ERR(target)) { ret = PTR_ERR(target); goto release_matches; } t->u.kernel.target = target; off += xt_compat_target_offset(target); *size += off; ret = xt_compat_add_offset(AF_INET6, entry_offset, off); if (ret) goto out; return 0; out: module_put(t->u.kernel.target->me); release_matches: xt_ematch_foreach(ematch, e) { if (j-- == 0) break; module_put(ematch->u.kernel.match->me); } return ret; } static void compat_copy_entry_from_user(struct compat_ip6t_entry *e, void **dstptr, unsigned int *size, struct xt_table_info *newinfo, unsigned char *base) { struct xt_entry_target *t; struct ip6t_entry *de; unsigned int origsize; int h; struct xt_entry_match *ematch; origsize = *size; de = *dstptr; memcpy(de, e, sizeof(struct ip6t_entry)); memcpy(&de->counters, &e->counters, sizeof(e->counters)); *dstptr += sizeof(struct ip6t_entry); *size += sizeof(struct ip6t_entry) - sizeof(struct compat_ip6t_entry); xt_ematch_foreach(ematch, e) xt_compat_match_from_user(ematch, dstptr, size); de->target_offset = e->target_offset - (origsize - *size); t = compat_ip6t_get_target(e); xt_compat_target_from_user(t, dstptr, size); de->next_offset = e->next_offset - (origsize - *size); for (h = 0; h < NF_INET_NUMHOOKS; h++) { if ((unsigned char *)de - base < newinfo->hook_entry[h]) newinfo->hook_entry[h] -= origsize - *size; if ((unsigned char *)de - base < newinfo->underflow[h]) newinfo->underflow[h] -= origsize - *size; } } static int translate_compat_table(struct net *net, struct xt_table_info **pinfo, void **pentry0, const struct compat_ip6t_replace *compatr) { unsigned int i, j; struct xt_table_info *newinfo, *info; void *pos, *entry0, *entry1; struct compat_ip6t_entry *iter0; struct ip6t_replace repl; unsigned int size; int ret; info = *pinfo; entry0 = *pentry0; size = compatr->size; info->number = compatr->num_entries; j = 0; xt_compat_lock(AF_INET6); ret = xt_compat_init_offsets(AF_INET6, compatr->num_entries); if (ret) goto out_unlock; /* Walk through entries, checking offsets. 
*/ xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, entry0, entry0 + compatr->size); if (ret != 0) goto out_unlock; ++j; } ret = -EINVAL; if (j != compatr->num_entries) goto out_unlock; ret = -ENOMEM; newinfo = xt_alloc_table_info(size); if (!newinfo) goto out_unlock; memset(newinfo->entries, 0, size); newinfo->number = compatr->num_entries; for (i = 0; i < NF_INET_NUMHOOKS; i++) { newinfo->hook_entry[i] = compatr->hook_entry[i]; newinfo->underflow[i] = compatr->underflow[i]; } entry1 = newinfo->entries; pos = entry1; size = compatr->size; xt_entry_foreach(iter0, entry0, compatr->size) compat_copy_entry_from_user(iter0, &pos, &size, newinfo, entry1); /* all module references in entry0 are now gone. */ xt_compat_flush_offsets(AF_INET6); xt_compat_unlock(AF_INET6); memcpy(&repl, compatr, sizeof(*compatr)); for (i = 0; i < NF_INET_NUMHOOKS; i++) { repl.hook_entry[i] = newinfo->hook_entry[i]; repl.underflow[i] = newinfo->underflow[i]; } repl.num_counters = 0; repl.counters = NULL; repl.size = newinfo->size; ret = translate_table(net, newinfo, entry1, &repl); if (ret) goto free_newinfo; *pinfo = newinfo; *pentry0 = entry1; xt_free_table_info(info); return 0; free_newinfo: xt_free_table_info(newinfo); return ret; out_unlock: xt_compat_flush_offsets(AF_INET6); xt_compat_unlock(AF_INET6); xt_entry_foreach(iter0, entry0, compatr->size) { if (j-- == 0) break; compat_release_entry(iter0); } return ret; } static int compat_do_replace(struct net *net, sockptr_t arg, unsigned int len) { int ret; struct compat_ip6t_replace tmp; struct xt_table_info *newinfo; void *loc_cpu_entry; struct ip6t_entry *iter; if (len < sizeof(tmp)) return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT; /* overflow check */ if (tmp.num_counters >= INT_MAX / sizeof(struct xt_counters)) return -ENOMEM; if (tmp.num_counters == 0) return -EINVAL; if ((u64)len < (u64)tmp.size + sizeof(tmp)) return -EINVAL; tmp.name[sizeof(tmp.name)-1] = 0; newinfo = xt_alloc_table_info(tmp.size); if (!newinfo) return -ENOMEM; loc_cpu_entry = newinfo->entries; if (copy_from_sockptr_offset(loc_cpu_entry, arg, sizeof(tmp), tmp.size) != 0) { ret = -EFAULT; goto free_newinfo; } ret = translate_compat_table(net, &newinfo, &loc_cpu_entry, &tmp); if (ret != 0) goto free_newinfo; ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo, tmp.num_counters, compat_ptr(tmp.counters)); if (ret) goto free_newinfo_untrans; return 0; free_newinfo_untrans: xt_entry_foreach(iter, loc_cpu_entry, newinfo->size) cleanup_entry(iter, net); free_newinfo: xt_free_table_info(newinfo); return ret; } struct compat_ip6t_get_entries { char name[XT_TABLE_MAXNAMELEN]; compat_uint_t size; struct compat_ip6t_entry entrytable[]; }; static int compat_copy_entries_to_user(unsigned int total_size, struct xt_table *table, void __user *userptr) { struct xt_counters *counters; const struct xt_table_info *private = table->private; void __user *pos; unsigned int size; int ret = 0; unsigned int i = 0; struct ip6t_entry *iter; counters = alloc_counters(table); if (IS_ERR(counters)) return PTR_ERR(counters); pos = userptr; size = total_size; xt_entry_foreach(iter, private->entries, total_size) { ret = compat_copy_entry_to_user(iter, &pos, &size, counters, i++); if (ret != 0) break; } vfree(counters); return ret; } static int compat_get_entries(struct net *net, struct compat_ip6t_get_entries __user *uptr, int *len) { int ret; struct compat_ip6t_get_entries get; struct xt_table *t; if (*len < sizeof(get)) 
return -EINVAL; if (copy_from_user(&get, uptr, sizeof(get)) != 0) return -EFAULT; if (*len != sizeof(struct compat_ip6t_get_entries) + get.size) return -EINVAL; get.name[sizeof(get.name) - 1] = '\0'; xt_compat_lock(AF_INET6); t = xt_find_table_lock(net, AF_INET6, get.name); if (!IS_ERR(t)) { const struct xt_table_info *private = t->private; struct xt_table_info info; ret = compat_table_info(private, &info); if (!ret && get.size == info.size) ret = compat_copy_entries_to_user(private->size, t, uptr->entrytable); else if (!ret) ret = -EAGAIN; xt_compat_flush_offsets(AF_INET6); module_put(t->me); xt_table_unlock(t); } else ret = PTR_ERR(t); xt_compat_unlock(AF_INET6); return ret; } #endif static int do_ip6t_set_ctl(struct sock *sk, int cmd, sockptr_t arg, unsigned int len) { int ret; if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) return -EPERM; switch (cmd) { case IP6T_SO_SET_REPLACE: #ifdef CONFIG_NETFILTER_XTABLES_COMPAT if (in_compat_syscall()) ret = compat_do_replace(sock_net(sk), arg, len); else #endif ret = do_replace(sock_net(sk), arg, len); break; case IP6T_SO_SET_ADD_COUNTERS: ret = do_add_counters(sock_net(sk), arg, len); break; default: ret = -EINVAL; } return ret; } static int do_ip6t_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) { int ret; if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) return -EPERM; switch (cmd) { case IP6T_SO_GET_INFO: ret = get_info(sock_net(sk), user, len); break; case IP6T_SO_GET_ENTRIES: #ifdef CONFIG_NETFILTER_XTABLES_COMPAT if (in_compat_syscall()) ret = compat_get_entries(sock_net(sk), user, len); else #endif ret = get_entries(sock_net(sk), user, len); break; case IP6T_SO_GET_REVISION_MATCH: case IP6T_SO_GET_REVISION_TARGET: { struct xt_get_revision rev; int target; if (*len != sizeof(rev)) { ret = -EINVAL; break; } if (copy_from_user(&rev, user, sizeof(rev)) != 0) { ret = -EFAULT; break; } rev.name[sizeof(rev.name)-1] = 0; if (cmd == IP6T_SO_GET_REVISION_TARGET) target = 1; else target = 0; try_then_request_module(xt_find_revision(AF_INET6, rev.name, rev.revision, target, &ret), "ip6t_%s", rev.name); break; } default: ret = -EINVAL; } return ret; } static void __ip6t_unregister_table(struct net *net, struct xt_table *table) { struct xt_table_info *private; void *loc_cpu_entry; struct module *table_owner = table->me; struct ip6t_entry *iter; private = xt_unregister_table(table); /* Decrease module usage counts and free resources */ loc_cpu_entry = private->entries; xt_entry_foreach(iter, loc_cpu_entry, private->size) cleanup_entry(iter, net); if (private->number > private->initial_entries) module_put(table_owner); xt_free_table_info(private); } int ip6t_register_table(struct net *net, const struct xt_table *table, const struct ip6t_replace *repl, const struct nf_hook_ops *template_ops) { struct nf_hook_ops *ops; unsigned int num_ops; int ret, i; struct xt_table_info *newinfo; struct xt_table_info bootstrap = {0}; void *loc_cpu_entry; struct xt_table *new_table; newinfo = xt_alloc_table_info(repl->size); if (!newinfo) return -ENOMEM; loc_cpu_entry = newinfo->entries; memcpy(loc_cpu_entry, repl->entries, repl->size); ret = translate_table(net, newinfo, loc_cpu_entry, repl); if (ret != 0) { xt_free_table_info(newinfo); return ret; } new_table = xt_register_table(net, table, &bootstrap, newinfo); if (IS_ERR(new_table)) { struct ip6t_entry *iter; xt_entry_foreach(iter, loc_cpu_entry, newinfo->size) cleanup_entry(iter, net); xt_free_table_info(newinfo); return PTR_ERR(new_table); } if (!template_ops) return 0; num_ops = 
hweight32(table->valid_hooks); if (num_ops == 0) { ret = -EINVAL; goto out_free; } ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); if (!ops) { ret = -ENOMEM; goto out_free; } for (i = 0; i < num_ops; i++) ops[i].priv = new_table; new_table->ops = ops; ret = nf_register_net_hooks(net, ops, num_ops); if (ret != 0) goto out_free; return ret; out_free: __ip6t_unregister_table(net, new_table); return ret; } void ip6t_unregister_table_pre_exit(struct net *net, const char *name) { struct xt_table *table = xt_find_table(net, NFPROTO_IPV6, name); if (table) nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } void ip6t_unregister_table_exit(struct net *net, const char *name) { struct xt_table *table = xt_find_table(net, NFPROTO_IPV6, name); if (table) __ip6t_unregister_table(net, table); } /* The built-in targets: standard (NULL) and error. */ static struct xt_target ip6t_builtin_tg[] __read_mostly = { { .name = XT_STANDARD_TARGET, .targetsize = sizeof(int), .family = NFPROTO_IPV6, #ifdef CONFIG_NETFILTER_XTABLES_COMPAT .compatsize = sizeof(compat_int_t), .compat_from_user = compat_standard_from_user, .compat_to_user = compat_standard_to_user, #endif }, { .name = XT_ERROR_TARGET, .target = ip6t_error, .targetsize = XT_FUNCTION_MAXNAMELEN, .family = NFPROTO_IPV6, }, }; static struct nf_sockopt_ops ip6t_sockopts = { .pf = PF_INET6, .set_optmin = IP6T_BASE_CTL, .set_optmax = IP6T_SO_SET_MAX+1, .set = do_ip6t_set_ctl, .get_optmin = IP6T_BASE_CTL, .get_optmax = IP6T_SO_GET_MAX+1, .get = do_ip6t_get_ctl, .owner = THIS_MODULE, }; static int __net_init ip6_tables_net_init(struct net *net) { return xt_proto_init(net, NFPROTO_IPV6); } static void __net_exit ip6_tables_net_exit(struct net *net) { xt_proto_fini(net, NFPROTO_IPV6); } static struct pernet_operations ip6_tables_net_ops = { .init = ip6_tables_net_init, .exit = ip6_tables_net_exit, }; static int __init ip6_tables_init(void) { int ret; ret = register_pernet_subsys(&ip6_tables_net_ops); if (ret < 0) goto err1; /* No one else will be downing sem now, so we won't sleep */ ret = xt_register_targets(ip6t_builtin_tg, ARRAY_SIZE(ip6t_builtin_tg)); if (ret < 0) goto err2; /* Register setsockopt */ ret = nf_register_sockopt(&ip6t_sockopts); if (ret < 0) goto err4; return 0; err4: xt_unregister_targets(ip6t_builtin_tg, ARRAY_SIZE(ip6t_builtin_tg)); err2: unregister_pernet_subsys(&ip6_tables_net_ops); err1: return ret; } static void __exit ip6_tables_fini(void) { nf_unregister_sockopt(&ip6t_sockopts); xt_unregister_targets(ip6t_builtin_tg, ARRAY_SIZE(ip6t_builtin_tg)); unregister_pernet_subsys(&ip6_tables_net_ops); } EXPORT_SYMBOL(ip6t_register_table); EXPORT_SYMBOL(ip6t_unregister_table_pre_exit); EXPORT_SYMBOL(ip6t_unregister_table_exit); EXPORT_SYMBOL(ip6t_do_table); module_init(ip6_tables_init); module_exit(ip6_tables_fini); |
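/*
 * Editor's illustration (not part of ip6_tables.c): a minimal, self-contained
 * userspace model of the rule walk performed by ip6t_do_table() above. All
 * types, names and verdict encodings here are hypothetical stand-ins; the
 * point is only to show how a matching rule either jumps to another chain
 * (pushing a return point on the jumpstack), returns from a chain (popping
 * it, or falling through to the policy when the stack is empty), or ends the
 * walk with an absolute verdict.
 */
#include <stdio.h>

enum { R_ACCEPT, R_DROP, R_JUMP, R_RETURN };	/* simplified rule kinds */

struct toy_rule {
	int match;	/* does the packet match this rule? */
	int kind;	/* R_ACCEPT / R_DROP / R_JUMP / R_RETURN */
	int target;	/* rule index jumped to when kind == R_JUMP */
};

static int toy_walk(const struct toy_rule *r, int nrules, int entry)
{
	int jumpstack[16], sp = 0, i = entry;

	while (i < nrules) {
		if (!r[i].match) {			/* like the no_match: path */
			i++;
			continue;
		}
		switch (r[i].kind) {
		case R_JUMP:				/* enter a user chain */
			jumpstack[sp++] = i + 1;
			i = r[i].target;
			break;
		case R_RETURN:				/* leave it again */
			i = sp ? jumpstack[--sp] : nrules;
			break;
		default:				/* absolute verdict ends the walk */
			return r[i].kind;
		}
	}
	return R_DROP;		/* stand-in for the hook's policy/underflow rule */
}

int main(void)
{
	/* rule 0 jumps to a "chain" at index 2; it returns, then rule 1 accepts */
	const struct toy_rule rules[] = {
		{ 1, R_JUMP,   2 },
		{ 1, R_ACCEPT, 0 },
		{ 0, R_DROP,   0 },	/* not matched by this packet */
		{ 1, R_RETURN, 0 },
	};

	printf("verdict: %d\n", toy_walk(rules, 4, 0));	/* prints 0 (R_ACCEPT) */
	return 0;
}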
| 1 1 4 9 4 22 22 20 4 8 6 5 1 32 33 33 23 22 22 22 22 22 22 22 4 5 1 1 1 1 1 1 10 22 22 22 33 10 12 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 | // SPDX-License-Identifier: GPL-2.0 #include <linux/kernel.h> #include <linux/syscalls.h> #include <linux/fdtable.h> #include <linux/string.h> #include <linux/random.h> #include <linux/module.h> #include <linux/ptrace.h> #include <linux/init.h> #include <linux/errno.h> #include <linux/cache.h> #include <linux/bug.h> #include <linux/err.h> #include <linux/kcmp.h> #include <linux/capability.h> #include <linux/list.h> #include <linux/eventpoll.h> #include <linux/file.h> #include <asm/unistd.h> /* * We don't expose the real in-memory order of objects for security reasons. * But still the comparison results should be suitable for sorting. So we * obfuscate kernel pointers values and compare the production instead. * * The obfuscation is done in two steps. First we xor the kernel pointer with * a random value, which puts pointer into a new position in a reordered space. * Secondly we multiply the xor production with a large odd random number to * permute its bits even more (the odd multiplier guarantees that the product * is unique ever after the high bits are truncated, since any odd number is * relative prime to 2^n). * * Note also that the obfuscation itself is invisible to userspace and if needed * it can be changed to an alternate scheme. */ static unsigned long cookies[KCMP_TYPES][2] __read_mostly; static long kptr_obfuscate(long v, int type) { return (v ^ cookies[type][0]) * cookies[type][1]; } /* * 0 - equal, i.e. v1 = v2 * 1 - less than, i.e. v1 < v2 * 2 - greater than, i.e. 
v1 > v2 * 3 - not equal but ordering unavailable (reserved for future) */ static int kcmp_ptr(void *v1, void *v2, enum kcmp_type type) { long t1, t2; t1 = kptr_obfuscate((long)v1, type); t2 = kptr_obfuscate((long)v2, type); return (t1 < t2) | ((t1 > t2) << 1); } /* The caller must have pinned the task */ static struct file * get_file_raw_ptr(struct task_struct *task, unsigned int idx) { struct file *file; file = fget_task(task, idx); if (file) fput(file); return file; } static void kcmp_unlock(struct rw_semaphore *l1, struct rw_semaphore *l2) { if (likely(l2 != l1)) up_read(l2); up_read(l1); } static int kcmp_lock(struct rw_semaphore *l1, struct rw_semaphore *l2) { int err; if (l2 > l1) swap(l1, l2); err = down_read_killable(l1); if (!err && likely(l1 != l2)) { err = down_read_killable_nested(l2, SINGLE_DEPTH_NESTING); if (err) up_read(l1); } return err; } #ifdef CONFIG_EPOLL static int kcmp_epoll_target(struct task_struct *task1, struct task_struct *task2, unsigned long idx1, struct kcmp_epoll_slot __user *uslot) { struct file *filp, *filp_epoll, *filp_tgt; struct kcmp_epoll_slot slot; if (copy_from_user(&slot, uslot, sizeof(slot))) return -EFAULT; filp = get_file_raw_ptr(task1, idx1); if (!filp) return -EBADF; filp_epoll = fget_task(task2, slot.efd); if (!filp_epoll) return -EBADF; filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); fput(filp_epoll); if (IS_ERR(filp_tgt)) return PTR_ERR(filp_tgt); return kcmp_ptr(filp, filp_tgt, KCMP_FILE); } #else static int kcmp_epoll_target(struct task_struct *task1, struct task_struct *task2, unsigned long idx1, struct kcmp_epoll_slot __user *uslot) { return -EOPNOTSUPP; } #endif SYSCALL_DEFINE5(kcmp, pid_t, pid1, pid_t, pid2, int, type, unsigned long, idx1, unsigned long, idx2) { struct task_struct *task1, *task2; int ret; rcu_read_lock(); /* * Tasks are looked up in caller's PID namespace only. */ task1 = find_task_by_vpid(pid1); task2 = find_task_by_vpid(pid2); if (unlikely(!task1 || !task2)) goto err_no_task; get_task_struct(task1); get_task_struct(task2); rcu_read_unlock(); /* * One should have enough rights to inspect task details. 
*/ ret = kcmp_lock(&task1->signal->exec_update_lock, &task2->signal->exec_update_lock); if (ret) goto err; if (!ptrace_may_access(task1, PTRACE_MODE_READ_REALCREDS) || !ptrace_may_access(task2, PTRACE_MODE_READ_REALCREDS)) { ret = -EPERM; goto err_unlock; } switch (type) { case KCMP_FILE: { struct file *filp1, *filp2; filp1 = get_file_raw_ptr(task1, idx1); filp2 = get_file_raw_ptr(task2, idx2); if (filp1 && filp2) ret = kcmp_ptr(filp1, filp2, KCMP_FILE); else ret = -EBADF; break; } case KCMP_VM: ret = kcmp_ptr(task1->mm, task2->mm, KCMP_VM); break; case KCMP_FILES: ret = kcmp_ptr(task1->files, task2->files, KCMP_FILES); break; case KCMP_FS: ret = kcmp_ptr(task1->fs, task2->fs, KCMP_FS); break; case KCMP_SIGHAND: ret = kcmp_ptr(task1->sighand, task2->sighand, KCMP_SIGHAND); break; case KCMP_IO: ret = kcmp_ptr(task1->io_context, task2->io_context, KCMP_IO); break; case KCMP_SYSVSEM: #ifdef CONFIG_SYSVIPC ret = kcmp_ptr(task1->sysvsem.undo_list, task2->sysvsem.undo_list, KCMP_SYSVSEM); #else ret = -EOPNOTSUPP; #endif break; case KCMP_EPOLL_TFD: ret = kcmp_epoll_target(task1, task2, idx1, (void *)idx2); break; default: ret = -EINVAL; break; } err_unlock: kcmp_unlock(&task1->signal->exec_update_lock, &task2->signal->exec_update_lock); err: put_task_struct(task1); put_task_struct(task2); return ret; err_no_task: rcu_read_unlock(); return -ESRCH; } static __init int kcmp_cookies_init(void) { int i; get_random_bytes(cookies, sizeof(cookies)); for (i = 0; i < KCMP_TYPES; i++) cookies[i][1] |= (~(~0UL >> 1) | 1); return 0; } arch_initcall(kcmp_cookies_init); |
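/*
 * Editor's illustration (not part of kcmp.c): a small userspace example of
 * the syscall defined above, comparing file descriptors within the current
 * process. KCMP_FILE and SYS_kcmp come from <linux/kcmp.h> and
 * <sys/syscall.h>; per kcmp(2), a return value of 0 means both descriptors
 * refer to the same open file description, while 1/2 only give the
 * obfuscated ordering implemented by kptr_obfuscate().
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <linux/kcmp.h>

int main(void)
{
	pid_t pid = getpid();
	int fd1 = open("/dev/null", O_RDONLY);
	int fd2 = dup(fd1);			/* shares fd1's open file description */
	int fd3 = open("/dev/null", O_RDONLY);	/* a distinct open file description */

	printf("fd1 vs fd2: %ld\n", syscall(SYS_kcmp, pid, pid, KCMP_FILE, fd1, fd2)); /* 0 */
	printf("fd1 vs fd3: %ld\n", syscall(SYS_kcmp, pid, pid, KCMP_FILE, fd1, fd3)); /* non-zero */
	return 0;
}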
// SPDX-License-Identifier: GPL-2.0-only
/*
 * Copyright (C) 2016 Parav Pandit <pandit.parav@gmail.com>
 */

#include "core_priv.h"

/**
 * ib_device_register_rdmacg - register with rdma cgroup.
 * @device: device to register to participate in resource
 *	    accounting by rdma cgroup.
 *
 * Register with the rdma cgroup. Should be called before
 * exposing rdma device to user space applications to avoid
 * resource accounting leak.
 */
void ib_device_register_rdmacg(struct ib_device *device)
{
	device->cg_device.name = device->name;
	rdmacg_register_device(&device->cg_device);
}

/**
 * ib_device_unregister_rdmacg - unregister with rdma cgroup.
 * @device: device to unregister.
 *
 * Unregister with the rdma cgroup. Should be called after
 * all the resources are deallocated, and after a stage when any
 * other resource allocation by user application cannot be done
 * for this device to avoid any leak in accounting.
 */
void ib_device_unregister_rdmacg(struct ib_device *device)
{
	rdmacg_unregister_device(&device->cg_device);
}

int ib_rdmacg_try_charge(struct ib_rdmacg_object *cg_obj,
			 struct ib_device *device,
			 enum rdmacg_resource_type resource_index)
{
	return rdmacg_try_charge(&cg_obj->cg, &device->cg_device,
				 resource_index);
}
EXPORT_SYMBOL(ib_rdmacg_try_charge);

void ib_rdmacg_uncharge(struct ib_rdmacg_object *cg_obj,
			struct ib_device *device,
			enum rdmacg_resource_type resource_index)
{
	rdmacg_uncharge(cg_obj->cg, &device->cg_device,
			resource_index);
}
EXPORT_SYMBOL(ib_rdmacg_uncharge);
/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * V9FS FID Management
 *
 * Copyright (C) 2005 by Eric Van Hensbergen <ericvh@gmail.com>
 */
#ifndef FS_9P_FID_H
#define FS_9P_FID_H
#include <linux/list.h>
#include "v9fs.h"

struct p9_fid *v9fs_fid_find_inode(struct inode *inode, bool want_writeable,
				   kuid_t uid, bool any);
struct p9_fid *v9fs_fid_lookup(struct dentry *dentry);

static inline struct p9_fid *v9fs_parent_fid(struct dentry *dentry)
{
	return v9fs_fid_lookup(dentry->d_parent);
}

void v9fs_fid_add(struct dentry *dentry, struct p9_fid **fid);
void v9fs_open_fid_add(struct inode *inode, struct p9_fid **fid);

static inline struct p9_fid *clone_fid(struct p9_fid *fid)
{
	return IS_ERR(fid) ? fid : p9_client_walk(fid, 0, NULL, 1);
}

static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry)
{
	struct p9_fid *fid, *nfid;

	fid = v9fs_fid_lookup(dentry);
	if (!fid || IS_ERR(fid))
		return fid;

	nfid = clone_fid(fid);
	p9_fid_put(fid);
	return nfid;
}

/**
 * v9fs_fid_add_modes - add cache flags to fid mode (for client use only)
 * @fid: fid to augment
 * @s_flags: session info mount flags
 * @s_cache: session info cache flags
 * @f_flags: unix open flags
 *
 * make sure mode reflects flags of underlying mounts
 * also qid.version == 0 reflects a synthetic or legacy file system
 * NOTE: these are set after open so only reflect 9p client not
 * underlying file system on server.
 */
static inline void v9fs_fid_add_modes(struct p9_fid *fid, unsigned int s_flags,
	unsigned int s_cache, unsigned int f_flags)
{
	if ((!s_cache) ||
	    ((fid->qid.version == 0) && !(s_flags & V9FS_IGNORE_QV)) ||
	    (s_flags & V9FS_DIRECT_IO) || (f_flags & O_DIRECT)) {
		fid->mode |= P9L_DIRECT; /* no read or write cache */
	} else if ((!(s_cache & CACHE_WRITEBACK)) ||
		   (f_flags & O_DSYNC) || (s_flags & V9FS_SYNC)) {
		fid->mode |= P9L_NOWRITECACHE;
	}
}
#endif
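/*
 * Editor's illustration (not part of fid.h): the precedence in
 * v9fs_fid_add_modes() above, restated as a tiny userspace model with
 * hypothetical flag parameters. Direct I/O conditions (caching disabled, a
 * synthetic/legacy qid, V9FS_DIRECT_IO or O_DIRECT) win outright; otherwise
 * the weaker "no write cache" mode is chosen when writeback caching is
 * unavailable or a sync mount / O_DSYNC open asks for it.
 */
#include <stdio.h>

enum { MODE_CACHED, MODE_NOWRITECACHE, MODE_DIRECT };	/* stand-in results */

static int pick_mode(int cache_enabled, int writeback, int synthetic_qid,
		     int direct_io, int sync_requested)
{
	if (!cache_enabled || synthetic_qid || direct_io)
		return MODE_DIRECT;		/* no read or write cache */
	if (!writeback || sync_requested)
		return MODE_NOWRITECACHE;	/* read cache only */
	return MODE_CACHED;
}

int main(void)
{
	printf("%d\n", pick_mode(1, 1, 0, 1, 0));	/* O_DIRECT open: 2 (MODE_DIRECT) */
	printf("%d\n", pick_mode(1, 0, 0, 0, 0));	/* no writeback cache: 1 */
	printf("%d\n", pick_mode(1, 1, 0, 0, 0));	/* fully cached: 0 */
	return 0;
}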
| 16 16 16 15 14 1 14 13 13 13 13 6 13 13 13 5 5 5 5 5 5 5 5 5 3 3 2 5 5 5 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 | // SPDX-License-Identifier: GPL-2.0-or-later /* * IPV6 GSO/GRO offload support * Linux INET6 implementation * * UDPv6 GSO support */ #include <linux/skbuff.h> #include <linux/netdevice.h> #include <linux/indirect_call_wrapper.h> #include <net/protocol.h> #include <net/ipv6.h> #include <net/udp.h> #include <net/ip6_checksum.h> #include "ip6_offload.h" #include <net/gro.h> #include <net/gso.h> static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb, netdev_features_t features) { struct sk_buff *segs = ERR_PTR(-EINVAL); unsigned int mss; unsigned int unfrag_ip6hlen, unfrag_len; struct frag_hdr *fptr; u8 *packet_start, *prevhdr; u8 nexthdr; u8 frag_hdr_sz = sizeof(struct frag_hdr); __wsum csum; int tnl_hlen; int err; if (skb->encapsulation && skb_shinfo(skb)->gso_type & (SKB_GSO_UDP_TUNNEL|SKB_GSO_UDP_TUNNEL_CSUM)) segs = skb_udp_tunnel_segment(skb, features, true); else { const struct ipv6hdr *ipv6h; struct udphdr *uh; if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_UDP | SKB_GSO_UDP_L4))) goto out; if (!pskb_may_pull(skb, sizeof(struct udphdr))) goto out; if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) return __udp_gso_segment(skb, features, true); mss = skb_shinfo(skb)->gso_size; if (unlikely(skb->len <= mss)) goto out; /* Do software UFO. Complete and fill in the UDP checksum as HW cannot * do checksum of UDP packets sent as multiple IP fragments. */ uh = udp_hdr(skb); ipv6h = ipv6_hdr(skb); uh->check = 0; csum = skb_checksum(skb, 0, skb->len, 0); uh->check = udp_v6_check(skb->len, &ipv6h->saddr, &ipv6h->daddr, csum); if (uh->check == 0) uh->check = CSUM_MANGLED_0; skb->ip_summed = CHECKSUM_UNNECESSARY; /* If there is no outer header we can fake a checksum offload * due to the fact that we have already done the checksum in * software prior to segmenting the frame. */ if (!skb->encap_hdr_csum) features |= NETIF_F_HW_CSUM; /* Check if there is enough headroom to insert fragment header. */ tnl_hlen = skb_tnl_header_len(skb); if (skb->mac_header < (tnl_hlen + frag_hdr_sz)) { if (gso_pskb_expand_head(skb, tnl_hlen + frag_hdr_sz)) goto out; } /* Find the unfragmentable header and shift it left by frag_hdr_sz * bytes to insert fragment header. 
*/ err = ip6_find_1stfragopt(skb, &prevhdr); if (err < 0) return ERR_PTR(err); unfrag_ip6hlen = err; nexthdr = *prevhdr; *prevhdr = NEXTHDR_FRAGMENT; unfrag_len = (skb_network_header(skb) - skb_mac_header(skb)) + unfrag_ip6hlen + tnl_hlen; packet_start = (u8 *) skb->head + SKB_GSO_CB(skb)->mac_offset; memmove(packet_start-frag_hdr_sz, packet_start, unfrag_len); SKB_GSO_CB(skb)->mac_offset -= frag_hdr_sz; skb->mac_header -= frag_hdr_sz; skb->network_header -= frag_hdr_sz; fptr = (struct frag_hdr *)(skb_network_header(skb) + unfrag_ip6hlen); fptr->nexthdr = nexthdr; fptr->reserved = 0; fptr->identification = ipv6_proxy_select_ident(dev_net(skb->dev), skb); /* Fragment the skb. ipv6 header and the remaining fields of the * fragment header are updated in ipv6_gso_segment() */ segs = skb_segment(skb, features); } out: return segs; } static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport, __be16 dport) { const struct ipv6hdr *iph = skb_gro_network_header(skb); struct net *net = dev_net_rcu(skb->dev); struct sock *sk; int iif, sdif; sk = udp_tunnel_sk(net, true); if (sk && dport == htons(sk->sk_num)) return sk; inet6_get_iif_sdif(skb, &iif, &sdif); return __udp6_lib_lookup(net, &iph->saddr, sport, &iph->daddr, dport, iif, sdif, net->ipv4.udp_table, NULL); } INDIRECT_CALLABLE_SCOPE struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb) { struct udphdr *uh = udp_gro_udphdr(skb); struct sock *sk = NULL; struct sk_buff *pp; if (unlikely(!uh)) goto flush; /* Don't bother verifying checksum if we're going to flush anyway. */ if (NAPI_GRO_CB(skb)->flush) goto skip; if (skb_gro_checksum_validate_zero_check(skb, IPPROTO_UDP, uh->check, ip6_gro_compute_pseudo)) goto flush; else if (uh->check) skb_gro_checksum_try_convert(skb, IPPROTO_UDP, ip6_gro_compute_pseudo); skip: NAPI_GRO_CB(skb)->is_ipv6 = 1; if (static_branch_unlikely(&udpv6_encap_needed_key)) sk = udp6_gro_lookup_skb(skb, uh->source, uh->dest); pp = udp_gro_receive(head, skb, uh, sk); return pp; flush: NAPI_GRO_CB(skb)->flush = 1; return NULL; } INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff) { const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation]; const struct ipv6hdr *ipv6h = (struct ipv6hdr *)(skb->data + offset); struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); /* do fraglist only if there is no outer UDP encap (or we already processed it) */ if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { uh->len = htons(skb->len - nhoff); skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count; __skb_incr_checksum_unnecessary(skb); return 0; } if (uh->check) uh->check = ~udp_v6_check(skb->len - nhoff, &ipv6h->saddr, &ipv6h->daddr, 0); return udp_gro_complete(skb, nhoff, udp6_lib_lookup_skb); } int __init udpv6_offload_init(void) { net_hotdata.udpv6_offload = (struct net_offload) { .callbacks = { .gso_segment = udp6_ufo_fragment, .gro_receive = udp6_gro_receive, .gro_complete = udp6_gro_complete, }, }; return inet6_add_offload(&net_hotdata.udpv6_offload, IPPROTO_UDP); } int udpv6_offload_exit(void) { return inet6_del_offload(&net_hotdata.udpv6_offload, IPPROTO_UDP); } |
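/*
 * Editor's illustration (not part of udp_offload.c): why the UFO path above
 * replaces a computed checksum of 0 with CSUM_MANGLED_0. A transmitted UDP
 * checksum field of 0 means "no checksum", which is not allowed over IPv6,
 * so a sum that folds to 0 is sent as 0xffff instead (the two are the same
 * value in one's complement arithmetic). The helper below is a plain 16-bit
 * one's complement sum for illustration, not the kernel's csum routines.
 */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

static uint16_t ones_complement_sum(const uint8_t *data, size_t len, uint32_t sum)
{
	for (size_t i = 0; i < len; i += 2) {
		uint16_t word = data[i] << 8;

		if (i + 1 < len)
			word |= data[i + 1];
		sum += word;
	}
	while (sum >> 16)			/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

int main(void)
{
	/* a contrived payload whose checksum folds to 0 */
	const uint8_t payload[] = { 0xff, 0xff };
	uint16_t check = ones_complement_sum(payload, sizeof(payload), 0);

	if (check == 0)
		check = 0xffff;			/* the CSUM_MANGLED_0 substitution */
	printf("transmitted checksum: 0x%04x\n", check);
	return 0;
}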
| 7 7 7 7 7 7 7 7 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 | // SPDX-License-Identifier: GPL-2.0-or-later /* * SHA-256, as specified in * http://csrc.nist.gov/groups/STM/cavp/documents/shs/sha256-384-512.pdf * * SHA-256 code by Jean-Luc Cooke <jlcooke@certainkey.com>. * * Copyright (c) Jean-Luc Cooke <jlcooke@certainkey.com> * Copyright (c) Andrew McDonald <andrew@mcdonald.org.uk> * Copyright (c) 2002 James Morris <jmorris@intercode.com.au> * Copyright (c) 2014 Red Hat Inc. */ #include <crypto/internal/sha2.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/string.h> #include <linux/unaligned.h> static const u32 SHA256_K[] = { 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2, }; static inline u32 Ch(u32 x, u32 y, u32 z) { return z ^ (x & (y ^ z)); } static inline u32 Maj(u32 x, u32 y, u32 z) { return (x & y) | (z & (x | y)); } #define e0(x) (ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22)) #define e1(x) (ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25)) #define s0(x) (ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3)) #define s1(x) (ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10)) static inline void LOAD_OP(int I, u32 *W, const u8 *input) { W[I] = get_unaligned_be32((__u32 *)input + I); } static inline void BLEND_OP(int I, u32 *W) { W[I] = s1(W[I-2]) + W[I-7] + s0(W[I-15]) + W[I-16]; } #define SHA256_ROUND(i, a, b, c, d, e, f, g, h) do { \ u32 t1, t2; \ t1 = h + e1(e) + Ch(e, f, g) + SHA256_K[i] + W[i]; \ t2 = e0(a) + Maj(a, b, c); \ d += t1; \ h = t1 + t2; \ } while (0) static void sha256_block_generic(u32 state[SHA256_STATE_WORDS], const u8 *input, u32 W[64]) { u32 a, b, c, d, e, f, g, h; int i; /* load the input */ for (i = 0; i < 16; i += 8) { LOAD_OP(i + 0, W, input); LOAD_OP(i + 1, W, input); LOAD_OP(i + 2, W, input); LOAD_OP(i + 3, W, input); LOAD_OP(i + 4, W, input); LOAD_OP(i + 5, W, input); LOAD_OP(i + 6, W, input); LOAD_OP(i + 7, W, input); } /* now blend */ for (i = 16; i < 64; i += 8) { BLEND_OP(i + 0, W); BLEND_OP(i + 1, W); BLEND_OP(i + 2, W); BLEND_OP(i + 3, W); BLEND_OP(i + 4, W); BLEND_OP(i + 5, W); BLEND_OP(i + 6, W); BLEND_OP(i + 7, W); } /* load the state into our registers */ a = state[0]; b = state[1]; c = state[2]; d = state[3]; e = state[4]; f = state[5]; g = state[6]; h = state[7]; /* now iterate */ for (i = 0; i < 64; i += 8) { SHA256_ROUND(i + 0, a, b, c, d, e, f, g, h); SHA256_ROUND(i + 1, h, a, b, c, d, e, f, g); SHA256_ROUND(i + 2, g, h, a, b, c, d, e, f); SHA256_ROUND(i + 
3, f, g, h, a, b, c, d, e); SHA256_ROUND(i + 4, e, f, g, h, a, b, c, d); SHA256_ROUND(i + 5, d, e, f, g, h, a, b, c); SHA256_ROUND(i + 6, c, d, e, f, g, h, a, b); SHA256_ROUND(i + 7, b, c, d, e, f, g, h, a); } state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e; state[5] += f; state[6] += g; state[7] += h; } void sha256_blocks_generic(u32 state[SHA256_STATE_WORDS], const u8 *data, size_t nblocks) { u32 W[64]; do { sha256_block_generic(state, data, W); data += SHA256_BLOCK_SIZE; } while (--nblocks); memzero_explicit(W, sizeof(W)); } EXPORT_SYMBOL_GPL(sha256_blocks_generic); MODULE_DESCRIPTION("SHA-256 Algorithm (generic implementation)"); MODULE_LICENSE("GPL"); |
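/*
 * Editor's illustration (not part of the generic SHA-256 code above): a
 * hedged sketch of how a caller is expected to drive sha256_blocks_generic().
 * The block function only compresses whole 64-byte blocks, so the caller
 * seeds the state with the SHA-256 initial hash values and supplies the
 * standard padding (0x80, zeroes, 64-bit big-endian bit length) itself. For
 * the empty message that is a single pad-only block, and the resulting
 * digest is the well-known e3b0c44298fc1c14... value. The helper name is
 * hypothetical.
 */
static void sha256_empty_digest_sketch(u8 out[32])
{
	u32 state[SHA256_STATE_WORDS] = {
		0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
		0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
	};
	u8 block[SHA256_BLOCK_SIZE] = { 0x80 };	/* 0 data bits: padding only */
	int i;

	/* the last 8 bytes carry the bit length (0 here), already zeroed */
	sha256_blocks_generic(state, block, 1);

	for (i = 0; i < 8; i++)
		put_unaligned_be32(state[i], out + 4 * i);
}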
| 10 10 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 | // SPDX-License-Identifier: GPL-2.0-only /* * * Authors: * Alexander Aring <aar@pengutronix.de> * * Based on: net/wireless/sysfs.c */ #include <linux/device.h> #include <linux/rtnetlink.h> #include <net/cfg802154.h> #include "core.h" #include "sysfs.h" #include "rdev-ops.h" static inline struct cfg802154_registered_device * dev_to_rdev(struct device *dev) { return container_of(dev, struct cfg802154_registered_device, wpan_phy.dev); } #define SHOW_FMT(name, fmt, member) \ static ssize_t name ## _show(struct device *dev, \ struct device_attribute *attr, \ char *buf) \ { \ return sprintf(buf, fmt "\n", dev_to_rdev(dev)->member); \ } \ static DEVICE_ATTR_RO(name) SHOW_FMT(index, "%d", wpan_phy_idx); static ssize_t name_show(struct device *dev, struct device_attribute *attr, char *buf) { struct wpan_phy *wpan_phy = &dev_to_rdev(dev)->wpan_phy; return sprintf(buf, "%s\n", dev_name(&wpan_phy->dev)); } static DEVICE_ATTR_RO(name); static void wpan_phy_release(struct device *dev) { struct cfg802154_registered_device *rdev = dev_to_rdev(dev); cfg802154_dev_free(rdev); } static struct attribute *pmib_attrs[] = { &dev_attr_index.attr, &dev_attr_name.attr, NULL, }; ATTRIBUTE_GROUPS(pmib); #ifdef CONFIG_PM_SLEEP static int wpan_phy_suspend(struct device *dev) { struct cfg802154_registered_device *rdev = dev_to_rdev(dev); int ret = 0; if (rdev->ops->suspend) { rtnl_lock(); ret = rdev_suspend(rdev); rtnl_unlock(); } return ret; } static int wpan_phy_resume(struct device *dev) { struct cfg802154_registered_device *rdev = dev_to_rdev(dev); int ret = 0; if (rdev->ops->resume) { rtnl_lock(); ret = rdev_resume(rdev); rtnl_unlock(); } return ret; } static SIMPLE_DEV_PM_OPS(wpan_phy_pm_ops, wpan_phy_suspend, wpan_phy_resume); #define WPAN_PHY_PM_OPS (&wpan_phy_pm_ops) #else #define WPAN_PHY_PM_OPS NULL #endif const struct class wpan_phy_class = { .name = "ieee802154", .dev_release = wpan_phy_release, .dev_groups = pmib_groups, .pm = WPAN_PHY_PM_OPS, }; int wpan_phy_sysfs_init(void) { return class_register(&wpan_phy_class); } void wpan_phy_sysfs_exit(void) { class_unregister(&wpan_phy_class); } |
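/*
 * Editor's illustration (not part of the cfg802154 sysfs code above): the
 * two attributes registered there ("index" and "name") appear under the
 * "ieee802154" class in sysfs and can be read like any other text file.
 * The phy0 name below assumes a first registered WPAN phy; adjust the path
 * for your system.
 */
#include <stdio.h>

int main(void)
{
	char buf[64];
	FILE *f = fopen("/sys/class/ieee802154/phy0/index", "r");

	if (!f) {
		perror("no ieee802154 phy registered?");
		return 1;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("phy0 index: %s", buf);
	fclose(f);
	return 0;
}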
| 4 1 4 3 1 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 | // SPDX-License-Identifier: GPL-2.0 /* * Shared Memory Communications over RDMA (SMC-R) and RoCE * * Manage send buffer. * Producer: * Copy user space data into send buffer, if send buffer space available. * Consumer: * Trigger RDMA write into RMBE of peer and send CDC, if RMBE space available. * * Copyright IBM Corp. 
2016 * * Author(s): Ursula Braun <ubraun@linux.vnet.ibm.com> */ #include <linux/net.h> #include <linux/rcupdate.h> #include <linux/workqueue.h> #include <linux/sched/signal.h> #include <net/sock.h> #include <net/tcp.h> #include "smc.h" #include "smc_wr.h" #include "smc_cdc.h" #include "smc_close.h" #include "smc_ism.h" #include "smc_tx.h" #include "smc_stats.h" #include "smc_tracepoint.h" #define SMC_TX_WORK_DELAY 0 /***************************** sndbuf producer *******************************/ /* callback implementation for sk.sk_write_space() * to wakeup sndbuf producers that blocked with smc_tx_wait(). * called under sk_socket lock. */ static void smc_tx_write_space(struct sock *sk) { struct socket *sock = sk->sk_socket; struct smc_sock *smc = smc_sk(sk); struct socket_wq *wq; /* similar to sk_stream_write_space */ if (atomic_read(&smc->conn.sndbuf_space) && sock) { if (test_bit(SOCK_NOSPACE, &sock->flags)) SMC_STAT_RMB_TX_FULL(smc, !smc->conn.lnk); clear_bit(SOCK_NOSPACE, &sock->flags); rcu_read_lock(); wq = rcu_dereference(sk->sk_wq); if (skwq_has_sleeper(wq)) wake_up_interruptible_poll(&wq->wait, EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND); if (wq && wq->fasync_list && !(sk->sk_shutdown & SEND_SHUTDOWN)) sock_wake_async(wq, SOCK_WAKE_SPACE, POLL_OUT); rcu_read_unlock(); } } /* Wakeup sndbuf producers that blocked with smc_tx_wait(). * Cf. tcp_data_snd_check()=>tcp_check_space()=>tcp_new_space(). */ void smc_tx_sndbuf_nonfull(struct smc_sock *smc) { if (smc->sk.sk_socket && test_bit(SOCK_NOSPACE, &smc->sk.sk_socket->flags)) smc->sk.sk_write_space(&smc->sk); } /* blocks sndbuf producer until at least one byte of free space available * or urgent Byte was consumed */ static int smc_tx_wait(struct smc_sock *smc, int flags) { DEFINE_WAIT_FUNC(wait, woken_wake_function); struct smc_connection *conn = &smc->conn; struct sock *sk = &smc->sk; long timeo; int rc = 0; /* similar to sk_stream_wait_memory */ timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); add_wait_queue(sk_sleep(sk), &wait); while (1) { sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk); if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN) || conn->killed || conn->local_tx_ctrl.conn_state_flags.peer_done_writing) { rc = -EPIPE; break; } if (smc_cdc_rxed_any_close(conn)) { rc = -ECONNRESET; break; } if (!timeo) { /* ensure EPOLLOUT is subsequently generated */ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); rc = -EAGAIN; break; } if (signal_pending(current)) { rc = sock_intr_errno(timeo); break; } sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); if (atomic_read(&conn->sndbuf_space) && !conn->urg_tx_pend) break; /* at least 1 byte of free & no urgent data */ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); sk_wait_event(sk, &timeo, READ_ONCE(sk->sk_err) || (READ_ONCE(sk->sk_shutdown) & SEND_SHUTDOWN) || smc_cdc_rxed_any_close(conn) || (atomic_read(&conn->sndbuf_space) && !conn->urg_tx_pend), &wait); } remove_wait_queue(sk_sleep(sk), &wait); return rc; } static bool smc_tx_is_corked(struct smc_sock *smc) { struct tcp_sock *tp = tcp_sk(smc->clcsock->sk); return (tp->nonagle & TCP_NAGLE_CORK) ? true : false; } /* If we have pending CDC messages, do not send: * Because CQE of this CDC message will happen shortly, it gives * a chance to coalesce future sendmsg() payload in to one RDMA Write, * without need for a timer, and with no latency trade off. * Algorithm here: * 1. First message should never cork * 2. If we have pending Tx CDC messages, wait for the first CDC * message's completion * 3. 
Don't cork to much data in a single RDMA Write to prevent burst * traffic, total corked message should not exceed sendbuf/2 */ static bool smc_should_autocork(struct smc_sock *smc) { struct smc_connection *conn = &smc->conn; int corking_size; corking_size = min_t(unsigned int, conn->sndbuf_desc->len >> 1, sock_net(&smc->sk)->smc.sysctl_autocorking_size); if (atomic_read(&conn->cdc_pend_tx_wr) == 0 || smc_tx_prepared_sends(conn) > corking_size) return false; return true; } static bool smc_tx_should_cork(struct smc_sock *smc, struct msghdr *msg) { struct smc_connection *conn = &smc->conn; if (smc_should_autocork(smc)) return true; /* for a corked socket defer the RDMA writes if * sndbuf_space is still available. The applications * should known how/when to uncork it. */ if ((msg->msg_flags & MSG_MORE || smc_tx_is_corked(smc)) && atomic_read(&conn->sndbuf_space)) return true; return false; } /* sndbuf producer: main API called by socket layer. * called under sock lock. */ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len) { size_t copylen, send_done = 0, send_remaining = len; size_t chunk_len, chunk_off, chunk_len_sum; struct smc_connection *conn = &smc->conn; union smc_host_cursor prep; struct sock *sk = &smc->sk; char *sndbuf_base; int tx_cnt_prep; int writespace; int rc, chunk; /* This should be in poll */ sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) { rc = -EPIPE; goto out_err; } if (sk->sk_state == SMC_INIT) return -ENOTCONN; if (len > conn->sndbuf_desc->len) SMC_STAT_RMB_TX_SIZE_SMALL(smc, !conn->lnk); if (len > conn->peer_rmbe_size) SMC_STAT_RMB_TX_PEER_SIZE_SMALL(smc, !conn->lnk); if (msg->msg_flags & MSG_OOB) SMC_STAT_INC(smc, urg_data_cnt); while (msg_data_left(msg)) { if (smc->sk.sk_shutdown & SEND_SHUTDOWN || (smc->sk.sk_err == ECONNABORTED) || conn->killed) return -EPIPE; if (smc_cdc_rxed_any_close(conn)) return send_done ?: -ECONNRESET; if (msg->msg_flags & MSG_OOB) conn->local_tx_ctrl.prod_flags.urg_data_pending = 1; if (!atomic_read(&conn->sndbuf_space) || conn->urg_tx_pend) { if (send_done) return send_done; rc = smc_tx_wait(smc, msg->msg_flags); if (rc) goto out_err; continue; } /* initialize variables for 1st iteration of subsequent loop */ /* could be just 1 byte, even after smc_tx_wait above */ writespace = atomic_read(&conn->sndbuf_space); /* not more than what user space asked for */ copylen = min_t(size_t, send_remaining, writespace); /* determine start of sndbuf */ sndbuf_base = conn->sndbuf_desc->cpu_addr; smc_curs_copy(&prep, &conn->tx_curs_prep, conn); tx_cnt_prep = prep.count; /* determine chunks where to write into sndbuf */ /* either unwrapped case, or 1st chunk of wrapped case */ chunk_len = min_t(size_t, copylen, conn->sndbuf_desc->len - tx_cnt_prep); chunk_len_sum = chunk_len; chunk_off = tx_cnt_prep; for (chunk = 0; chunk < 2; chunk++) { rc = memcpy_from_msg(sndbuf_base + chunk_off, msg, chunk_len); if (rc) { smc_sndbuf_sync_sg_for_device(conn); if (send_done) return send_done; goto out_err; } send_done += chunk_len; send_remaining -= chunk_len; if (chunk_len_sum == copylen) break; /* either on 1st or 2nd iteration */ /* prepare next (== 2nd) iteration */ chunk_len = copylen - chunk_len; /* remainder */ chunk_len_sum += chunk_len; chunk_off = 0; /* modulo offset in send ring buffer */ } smc_sndbuf_sync_sg_for_device(conn); /* update cursors */ smc_curs_add(conn->sndbuf_desc->len, &prep, copylen); smc_curs_copy(&conn->tx_curs_prep, &prep, conn); /* increased in send tasklet 
smc_cdc_tx_handler() */ smp_mb__before_atomic(); atomic_sub(copylen, &conn->sndbuf_space); /* guarantee 0 <= sndbuf_space <= sndbuf_desc->len */ smp_mb__after_atomic(); /* since we just produced more new data into sndbuf, * trigger sndbuf consumer: RDMA write into peer RMBE and CDC */ if ((msg->msg_flags & MSG_OOB) && !send_remaining) conn->urg_tx_pend = true; /* If we need to cork, do nothing and wait for the next * sendmsg() call or push on tx completion */ if (!smc_tx_should_cork(smc, msg)) smc_tx_sndbuf_nonempty(conn); trace_smc_tx_sendmsg(smc, copylen); } /* while (msg_data_left(msg)) */ return send_done; out_err: rc = sk_stream_error(sk, msg->msg_flags, rc); /* make sure we wake any epoll edge trigger waiter */ if (unlikely(rc == -EAGAIN)) sk->sk_write_space(sk); return rc; } /***************************** sndbuf consumer *******************************/ /* sndbuf consumer: actual data transfer of one target chunk with ISM write */ int smcd_tx_ism_write(struct smc_connection *conn, void *data, size_t len, u32 offset, int signal) { int rc; rc = smc_ism_write(conn->lgr->smcd, conn->peer_token, conn->peer_rmbe_idx, signal, conn->tx_off + offset, data, len); if (rc) conn->local_tx_ctrl.conn_state_flags.peer_conn_abort = 1; return rc; } /* sndbuf consumer: actual data transfer of one target chunk with RDMA write */ static int smc_tx_rdma_write(struct smc_connection *conn, int peer_rmbe_offset, int num_sges, struct ib_rdma_wr *rdma_wr) { struct smc_link_group *lgr = conn->lgr; struct smc_link *link = conn->lnk; int rc; rdma_wr->wr.wr_id = smc_wr_tx_get_next_wr_id(link); rdma_wr->wr.num_sge = num_sges; rdma_wr->remote_addr = lgr->rtokens[conn->rtoken_idx][link->link_idx].dma_addr + /* RMBE within RMB */ conn->tx_off + /* offset within RMBE */ peer_rmbe_offset; rdma_wr->rkey = lgr->rtokens[conn->rtoken_idx][link->link_idx].rkey; rc = ib_post_send(link->roce_qp, &rdma_wr->wr, NULL); if (rc) smcr_link_down_cond_sched(link); return rc; } /* sndbuf consumer */ static inline void smc_tx_advance_cursors(struct smc_connection *conn, union smc_host_cursor *prod, union smc_host_cursor *sent, size_t len) { smc_curs_add(conn->peer_rmbe_size, prod, len); /* increased in recv tasklet smc_cdc_msg_rcv() */ smp_mb__before_atomic(); /* data in flight reduces usable snd_wnd */ atomic_sub(len, &conn->peer_rmbe_space); /* guarantee 0 <= peer_rmbe_space <= peer_rmbe_size */ smp_mb__after_atomic(); smc_curs_add(conn->sndbuf_desc->len, sent, len); } /* SMC-R helper for smc_tx_rdma_writes() */ static int smcr_tx_rdma_writes(struct smc_connection *conn, size_t len, size_t src_off, size_t src_len, size_t dst_off, size_t dst_len, struct smc_rdma_wr *wr_rdma_buf) { struct smc_link *link = conn->lnk; dma_addr_t dma_addr = sg_dma_address(conn->sndbuf_desc->sgt[link->link_idx].sgl); u64 virt_addr = (uintptr_t)conn->sndbuf_desc->cpu_addr; int src_len_sum = src_len, dst_len_sum = dst_len; int sent_count = src_off; int srcchunk, dstchunk; int num_sges; int rc; for (dstchunk = 0; dstchunk < 2; dstchunk++) { struct ib_rdma_wr *wr = &wr_rdma_buf->wr_tx_rdma[dstchunk]; struct ib_sge *sge = wr->wr.sg_list; u64 base_addr = dma_addr; if (dst_len < link->qp_attr.cap.max_inline_data) { base_addr = virt_addr; wr->wr.send_flags |= IB_SEND_INLINE; } else { wr->wr.send_flags &= ~IB_SEND_INLINE; } num_sges = 0; for (srcchunk = 0; srcchunk < 2; srcchunk++) { sge[srcchunk].addr = conn->sndbuf_desc->is_vm ? 
(virt_addr + src_off) : (base_addr + src_off); sge[srcchunk].length = src_len; if (conn->sndbuf_desc->is_vm) sge[srcchunk].lkey = conn->sndbuf_desc->mr[link->link_idx]->lkey; num_sges++; src_off += src_len; if (src_off >= conn->sndbuf_desc->len) src_off -= conn->sndbuf_desc->len; /* modulo in send ring */ if (src_len_sum == dst_len) break; /* either on 1st or 2nd iteration */ /* prepare next (== 2nd) iteration */ src_len = dst_len - src_len; /* remainder */ src_len_sum += src_len; } rc = smc_tx_rdma_write(conn, dst_off, num_sges, wr); if (rc) return rc; if (dst_len_sum == len) break; /* either on 1st or 2nd iteration */ /* prepare next (== 2nd) iteration */ dst_off = 0; /* modulo offset in RMBE ring buffer */ dst_len = len - dst_len; /* remainder */ dst_len_sum += dst_len; src_len = min_t(int, dst_len, conn->sndbuf_desc->len - sent_count); src_len_sum = src_len; } return 0; } /* SMC-D helper for smc_tx_rdma_writes() */ static int smcd_tx_rdma_writes(struct smc_connection *conn, size_t len, size_t src_off, size_t src_len, size_t dst_off, size_t dst_len) { int src_len_sum = src_len, dst_len_sum = dst_len; int srcchunk, dstchunk; int rc; for (dstchunk = 0; dstchunk < 2; dstchunk++) { for (srcchunk = 0; srcchunk < 2; srcchunk++) { void *data = conn->sndbuf_desc->cpu_addr + src_off; rc = smcd_tx_ism_write(conn, data, src_len, dst_off + sizeof(struct smcd_cdc_msg), 0); if (rc) return rc; dst_off += src_len; src_off += src_len; if (src_off >= conn->sndbuf_desc->len) src_off -= conn->sndbuf_desc->len; /* modulo in send ring */ if (src_len_sum == dst_len) break; /* either on 1st or 2nd iteration */ /* prepare next (== 2nd) iteration */ src_len = dst_len - src_len; /* remainder */ src_len_sum += src_len; } if (dst_len_sum == len) break; /* either on 1st or 2nd iteration */ /* prepare next (== 2nd) iteration */ dst_off = 0; /* modulo offset in RMBE ring buffer */ dst_len = len - dst_len; /* remainder */ dst_len_sum += dst_len; src_len = min_t(int, dst_len, conn->sndbuf_desc->len - src_off); src_len_sum = src_len; } return 0; } /* sndbuf consumer: prepare all necessary (src&dst) chunks of data transmit; * usable snd_wnd as max transmit */ static int smc_tx_rdma_writes(struct smc_connection *conn, struct smc_rdma_wr *wr_rdma_buf) { size_t len, src_len, dst_off, dst_len; /* current chunk values */ union smc_host_cursor sent, prep, prod, cons; struct smc_cdc_producer_flags *pflags; int to_send, rmbespace; int rc; /* source: sndbuf */ smc_curs_copy(&sent, &conn->tx_curs_sent, conn); smc_curs_copy(&prep, &conn->tx_curs_prep, conn); /* cf. wmem_alloc - (snd_max - snd_una) */ to_send = smc_curs_diff(conn->sndbuf_desc->len, &sent, &prep); if (to_send <= 0) return 0; /* destination: RMBE */ /* cf. snd_wnd */ rmbespace = atomic_read(&conn->peer_rmbe_space); if (rmbespace <= 0) { struct smc_sock *smc = container_of(conn, struct smc_sock, conn); SMC_STAT_RMB_TX_PEER_FULL(smc, !conn->lnk); return 0; } smc_curs_copy(&prod, &conn->local_tx_ctrl.prod, conn); smc_curs_copy(&cons, &conn->local_rx_ctrl.cons, conn); /* if usable snd_wnd closes ask peer to advertise once it opens again */ pflags = &conn->local_tx_ctrl.prod_flags; pflags->write_blocked = (to_send >= rmbespace); /* cf. 
usable snd_wnd */ len = min(to_send, rmbespace); /* initialize variables for first iteration of subsequent nested loop */ dst_off = prod.count; if (prod.wrap == cons.wrap) { /* the filled destination area is unwrapped, * hence the available free destination space is wrapped * and we need 2 destination chunks of sum len; start with 1st * which is limited by what's available in sndbuf */ dst_len = min_t(size_t, conn->peer_rmbe_size - prod.count, len); } else { /* the filled destination area is wrapped, * hence the available free destination space is unwrapped * and we need a single destination chunk of entire len */ dst_len = len; } /* dst_len determines the maximum src_len */ if (sent.count + dst_len <= conn->sndbuf_desc->len) { /* unwrapped src case: single chunk of entire dst_len */ src_len = dst_len; } else { /* wrapped src case: 2 chunks of sum dst_len; start with 1st: */ src_len = conn->sndbuf_desc->len - sent.count; } if (conn->lgr->is_smcd) rc = smcd_tx_rdma_writes(conn, len, sent.count, src_len, dst_off, dst_len); else rc = smcr_tx_rdma_writes(conn, len, sent.count, src_len, dst_off, dst_len, wr_rdma_buf); if (rc) return rc; if (conn->urg_tx_pend && len == to_send) pflags->urg_data_present = 1; smc_tx_advance_cursors(conn, &prod, &sent, len); /* update connection's cursors with advanced local cursors */ smc_curs_copy(&conn->local_tx_ctrl.prod, &prod, conn); /* dst: peer RMBE */ smc_curs_copy(&conn->tx_curs_sent, &sent, conn);/* src: local sndbuf */ return 0; } /* Wakeup sndbuf consumers from any context (IRQ or process) * since there is more data to transmit; usable snd_wnd as max transmit */ static int smcr_tx_sndbuf_nonempty(struct smc_connection *conn) { struct smc_cdc_producer_flags *pflags = &conn->local_tx_ctrl.prod_flags; struct smc_link *link = conn->lnk; struct smc_rdma_wr *wr_rdma_buf; struct smc_cdc_tx_pend *pend; struct smc_wr_buf *wr_buf; int rc; if (!link || !smc_wr_tx_link_hold(link)) return -ENOLINK; rc = smc_cdc_get_free_slot(conn, link, &wr_buf, &wr_rdma_buf, &pend); if (rc < 0) { smc_wr_tx_link_put(link); if (rc == -EBUSY) { struct smc_sock *smc = container_of(conn, struct smc_sock, conn); if (smc->sk.sk_err == ECONNABORTED) return sock_error(&smc->sk); if (conn->killed) return -EPIPE; rc = 0; mod_delayed_work(conn->lgr->tx_wq, &conn->tx_work, SMC_TX_WORK_DELAY); } return rc; } spin_lock_bh(&conn->send_lock); if (link != conn->lnk) { /* link of connection changed, tx_work will restart */ smc_wr_tx_put_slot(link, (struct smc_wr_tx_pend_priv *)pend); rc = -ENOLINK; goto out_unlock; } if (!pflags->urg_data_present) { rc = smc_tx_rdma_writes(conn, wr_rdma_buf); if (rc) { smc_wr_tx_put_slot(link, (struct smc_wr_tx_pend_priv *)pend); goto out_unlock; } } rc = smc_cdc_msg_send(conn, wr_buf, pend); if (!rc && pflags->urg_data_present) { pflags->urg_data_pending = 0; pflags->urg_data_present = 0; } out_unlock: spin_unlock_bh(&conn->send_lock); smc_wr_tx_link_put(link); return rc; } static int smcd_tx_sndbuf_nonempty(struct smc_connection *conn) { struct smc_cdc_producer_flags *pflags = &conn->local_tx_ctrl.prod_flags; int rc = 0; spin_lock_bh(&conn->send_lock); if (!pflags->urg_data_present) rc = smc_tx_rdma_writes(conn, NULL); if (!rc) rc = smcd_cdc_msg_send(conn); if (!rc && pflags->urg_data_present) { pflags->urg_data_pending = 0; pflags->urg_data_present = 0; } spin_unlock_bh(&conn->send_lock); return rc; } int smc_tx_sndbuf_nonempty(struct smc_connection *conn) { struct smc_sock *smc = container_of(conn, struct smc_sock, conn); int rc = 0; /* No data in the send queue 
*/ if (unlikely(smc_tx_prepared_sends(conn) <= 0)) goto out; /* Peer don't have RMBE space */ if (unlikely(atomic_read(&conn->peer_rmbe_space) <= 0)) { SMC_STAT_RMB_TX_PEER_FULL(smc, !conn->lnk); goto out; } if (conn->killed || conn->local_rx_ctrl.conn_state_flags.peer_conn_abort) { rc = -EPIPE; /* connection being aborted */ goto out; } if (conn->lgr->is_smcd) rc = smcd_tx_sndbuf_nonempty(conn); else rc = smcr_tx_sndbuf_nonempty(conn); if (!rc) { /* trigger socket release if connection is closing */ smc_close_wake_tx_prepared(smc); } out: return rc; } /* Wakeup sndbuf consumers from process context * since there is more data to transmit. The caller * must hold sock lock. */ void smc_tx_pending(struct smc_connection *conn) { struct smc_sock *smc = container_of(conn, struct smc_sock, conn); int rc; if (smc->sk.sk_err) return; rc = smc_tx_sndbuf_nonempty(conn); if (!rc && conn->local_rx_ctrl.prod_flags.write_blocked && !atomic_read(&conn->bytes_to_rcv)) conn->local_rx_ctrl.prod_flags.write_blocked = 0; } /* Wakeup sndbuf consumers from process context * since there is more data to transmit in locked * sock. */ void smc_tx_work(struct work_struct *work) { struct smc_connection *conn = container_of(to_delayed_work(work), struct smc_connection, tx_work); struct smc_sock *smc = container_of(conn, struct smc_sock, conn); lock_sock(&smc->sk); smc_tx_pending(conn); release_sock(&smc->sk); } void smc_tx_consumer_update(struct smc_connection *conn, bool force) { union smc_host_cursor cfed, cons, prod; int sender_free = conn->rmb_desc->len; int to_confirm; smc_curs_copy(&cons, &conn->local_tx_ctrl.cons, conn); smc_curs_copy(&cfed, &conn->rx_curs_confirmed, conn); to_confirm = smc_curs_diff(conn->rmb_desc->len, &cfed, &cons); if (to_confirm > conn->rmbe_update_limit) { smc_curs_copy(&prod, &conn->local_rx_ctrl.prod, conn); sender_free = conn->rmb_desc->len - smc_curs_diff_large(conn->rmb_desc->len, &cfed, &prod); } if (conn->local_rx_ctrl.prod_flags.cons_curs_upd_req || force || ((to_confirm > conn->rmbe_update_limit) && ((sender_free <= (conn->rmb_desc->len / 2)) || conn->local_rx_ctrl.prod_flags.write_blocked))) { if (conn->killed || conn->local_rx_ctrl.conn_state_flags.peer_conn_abort) return; if ((smc_cdc_get_slot_and_msg_send(conn) < 0) && !conn->killed) { queue_delayed_work(conn->lgr->tx_wq, &conn->tx_work, SMC_TX_WORK_DELAY); return; } } if (conn->local_rx_ctrl.prod_flags.write_blocked && !atomic_read(&conn->bytes_to_rcv)) conn->local_rx_ctrl.prod_flags.write_blocked = 0; } /***************************** send initialize *******************************/ /* Initialize send properties on connection establishment. NB: not __init! */ void smc_tx_init(struct smc_sock *smc) { smc->sk.sk_write_space = smc_tx_write_space; } |
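/*
 * Editor's illustration (not part of the SMC code above; the helper name is
 * hypothetical): the two-iteration "chunk" loops in smc_tx_sendmsg() and in
 * smcr_tx_rdma_writes()/smcd_tx_rdma_writes() all rely on the same ring
 * buffer property: one logically contiguous copy needs at most two physical
 * chunks, the tail from the current offset to the end of the ring and, if
 * the copy wraps, a second chunk starting again at offset 0. A minimal
 * standalone sketch of that pattern:
 */
static void __maybe_unused ring_copy_in(char *ring, size_t ring_len,
					size_t prod_off, const char *src,
					size_t len)
{
	/* caller must guarantee len <= free space and prod_off < ring_len */
	size_t first = min(len, ring_len - prod_off);

	memcpy(ring + prod_off, src, first);	/* chunk 1: up to ring end */
	if (len > first)
		memcpy(ring, src + first, len - first);	/* chunk 2: wrapped part */
}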
// SPDX-License-Identifier: GPL-2.0
//
// Register map access API
//
// Copyright 2011 Wolfson Microelectronics plc
//
// Author: Mark Brown <broonie@opensource.wolfsonmicro.com>

#include <linux/device.h>
#include <linux/slab.h>
#include <linux/export.h>
#include <linux/mutex.h>
#include <linux/err.h>
#include <linux/property.h>
#include <linux/rbtree.h>
#include <linux/sched.h>
#include <linux/delay.h>
#include <linux/log2.h>
#include <linux/hwspinlock.h>
#include <linux/unaligned.h>

#define CREATE_TRACE_POINTS
#include "trace.h"

#include "internal.h"

/*
 * Sometimes for failures during very early init the trace
 * infrastructure isn't available early enough to be used.  For this
 * sort of problem defining LOG_DEVICE will add printks for basic
 * register I/O on a specific device.
*/ #undef LOG_DEVICE #ifdef LOG_DEVICE static inline bool regmap_should_log(struct regmap *map) { return (map->dev && strcmp(dev_name(map->dev), LOG_DEVICE) == 0); } #else static inline bool regmap_should_log(struct regmap *map) { return false; } #endif static int _regmap_update_bits(struct regmap *map, unsigned int reg, unsigned int mask, unsigned int val, bool *change, bool force_write); static int _regmap_bus_reg_read(void *context, unsigned int reg, unsigned int *val); static int _regmap_bus_read(void *context, unsigned int reg, unsigned int *val); static int _regmap_bus_formatted_write(void *context, unsigned int reg, unsigned int val); static int _regmap_bus_reg_write(void *context, unsigned int reg, unsigned int val); static int _regmap_bus_raw_write(void *context, unsigned int reg, unsigned int val); bool regmap_reg_in_ranges(unsigned int reg, const struct regmap_range *ranges, unsigned int nranges) { const struct regmap_range *r; int i; for (i = 0, r = ranges; i < nranges; i++, r++) if (regmap_reg_in_range(reg, r)) return true; return false; } EXPORT_SYMBOL_GPL(regmap_reg_in_ranges); bool regmap_check_range_table(struct regmap *map, unsigned int reg, const struct regmap_access_table *table) { /* Check "no ranges" first */ if (regmap_reg_in_ranges(reg, table->no_ranges, table->n_no_ranges)) return false; /* In case zero "yes ranges" are supplied, any reg is OK */ if (!table->n_yes_ranges) return true; return regmap_reg_in_ranges(reg, table->yes_ranges, table->n_yes_ranges); } EXPORT_SYMBOL_GPL(regmap_check_range_table); bool regmap_writeable(struct regmap *map, unsigned int reg) { if (map->max_register_is_set && reg > map->max_register) return false; if (map->writeable_reg) return map->writeable_reg(map->dev, reg); if (map->wr_table) return regmap_check_range_table(map, reg, map->wr_table); return true; } bool regmap_cached(struct regmap *map, unsigned int reg) { int ret; unsigned int val; if (map->cache_type == REGCACHE_NONE) return false; if (!map->cache_ops) return false; if (map->max_register_is_set && reg > map->max_register) return false; map->lock(map->lock_arg); ret = regcache_read(map, reg, &val); map->unlock(map->lock_arg); if (ret) return false; return true; } bool regmap_readable(struct regmap *map, unsigned int reg) { if (!map->reg_read) return false; if (map->max_register_is_set && reg > map->max_register) return false; if (map->format.format_write) return false; if (map->readable_reg) return map->readable_reg(map->dev, reg); if (map->rd_table) return regmap_check_range_table(map, reg, map->rd_table); return true; } bool regmap_volatile(struct regmap *map, unsigned int reg) { if (!map->format.format_write && !regmap_readable(map, reg)) return false; if (map->volatile_reg) return map->volatile_reg(map->dev, reg); if (map->volatile_table) return regmap_check_range_table(map, reg, map->volatile_table); if (map->cache_ops) return false; else return true; } bool regmap_precious(struct regmap *map, unsigned int reg) { if (!regmap_readable(map, reg)) return false; if (map->precious_reg) return map->precious_reg(map->dev, reg); if (map->precious_table) return regmap_check_range_table(map, reg, map->precious_table); return false; } bool regmap_writeable_noinc(struct regmap *map, unsigned int reg) { if (map->writeable_noinc_reg) return map->writeable_noinc_reg(map->dev, reg); if (map->wr_noinc_table) return regmap_check_range_table(map, reg, map->wr_noinc_table); return true; } bool regmap_readable_noinc(struct regmap *map, unsigned int reg) { if (map->readable_noinc_reg) 
return map->readable_noinc_reg(map->dev, reg); if (map->rd_noinc_table) return regmap_check_range_table(map, reg, map->rd_noinc_table); return true; } static bool regmap_volatile_range(struct regmap *map, unsigned int reg, size_t num) { unsigned int i; for (i = 0; i < num; i++) if (!regmap_volatile(map, reg + regmap_get_offset(map, i))) return false; return true; } static void regmap_format_12_20_write(struct regmap *map, unsigned int reg, unsigned int val) { u8 *out = map->work_buf; out[0] = reg >> 4; out[1] = (reg << 4) | (val >> 16); out[2] = val >> 8; out[3] = val; } static void regmap_format_2_6_write(struct regmap *map, unsigned int reg, unsigned int val) { u8 *out = map->work_buf; *out = (reg << 6) | val; } static void regmap_format_4_12_write(struct regmap *map, unsigned int reg, unsigned int val) { __be16 *out = map->work_buf; *out = cpu_to_be16((reg << 12) | val); } static void regmap_format_7_9_write(struct regmap *map, unsigned int reg, unsigned int val) { __be16 *out = map->work_buf; *out = cpu_to_be16((reg << 9) | val); } static void regmap_format_7_17_write(struct regmap *map, unsigned int reg, unsigned int val) { u8 *out = map->work_buf; out[2] = val; out[1] = val >> 8; out[0] = (val >> 16) | (reg << 1); } static void regmap_format_10_14_write(struct regmap *map, unsigned int reg, unsigned int val) { u8 *out = map->work_buf; out[2] = val; out[1] = (val >> 8) | (reg << 6); out[0] = reg >> 2; } static void regmap_format_8(void *buf, unsigned int val, unsigned int shift) { u8 *b = buf; b[0] = val << shift; } static void regmap_format_16_be(void *buf, unsigned int val, unsigned int shift) { put_unaligned_be16(val << shift, buf); } static void regmap_format_16_le(void *buf, unsigned int val, unsigned int shift) { put_unaligned_le16(val << shift, buf); } static void regmap_format_16_native(void *buf, unsigned int val, unsigned int shift) { u16 v = val << shift; memcpy(buf, &v, sizeof(v)); } static void regmap_format_24_be(void *buf, unsigned int val, unsigned int shift) { put_unaligned_be24(val << shift, buf); } static void regmap_format_32_be(void *buf, unsigned int val, unsigned int shift) { put_unaligned_be32(val << shift, buf); } static void regmap_format_32_le(void *buf, unsigned int val, unsigned int shift) { put_unaligned_le32(val << shift, buf); } static void regmap_format_32_native(void *buf, unsigned int val, unsigned int shift) { u32 v = val << shift; memcpy(buf, &v, sizeof(v)); } static void regmap_parse_inplace_noop(void *buf) { } static unsigned int regmap_parse_8(const void *buf) { const u8 *b = buf; return b[0]; } static unsigned int regmap_parse_16_be(const void *buf) { return get_unaligned_be16(buf); } static unsigned int regmap_parse_16_le(const void *buf) { return get_unaligned_le16(buf); } static void regmap_parse_16_be_inplace(void *buf) { u16 v = get_unaligned_be16(buf); memcpy(buf, &v, sizeof(v)); } static void regmap_parse_16_le_inplace(void *buf) { u16 v = get_unaligned_le16(buf); memcpy(buf, &v, sizeof(v)); } static unsigned int regmap_parse_16_native(const void *buf) { u16 v; memcpy(&v, buf, sizeof(v)); return v; } static unsigned int regmap_parse_24_be(const void *buf) { return get_unaligned_be24(buf); } static unsigned int regmap_parse_32_be(const void *buf) { return get_unaligned_be32(buf); } static unsigned int regmap_parse_32_le(const void *buf) { return get_unaligned_le32(buf); } static void regmap_parse_32_be_inplace(void *buf) { u32 v = get_unaligned_be32(buf); memcpy(buf, &v, sizeof(v)); } static void regmap_parse_32_le_inplace(void *buf) { 
u32 v = get_unaligned_le32(buf); memcpy(buf, &v, sizeof(v)); } static unsigned int regmap_parse_32_native(const void *buf) { u32 v; memcpy(&v, buf, sizeof(v)); return v; } static void regmap_lock_hwlock(void *__map) { struct regmap *map = __map; hwspin_lock_timeout(map->hwlock, UINT_MAX); } static void regmap_lock_hwlock_irq(void *__map) { struct regmap *map = __map; hwspin_lock_timeout_irq(map->hwlock, UINT_MAX); } static void regmap_lock_hwlock_irqsave(void *__map) { struct regmap *map = __map; hwspin_lock_timeout_irqsave(map->hwlock, UINT_MAX, &map->spinlock_flags); } static void regmap_unlock_hwlock(void *__map) { struct regmap *map = __map; hwspin_unlock(map->hwlock); } static void regmap_unlock_hwlock_irq(void *__map) { struct regmap *map = __map; hwspin_unlock_irq(map->hwlock); } static void regmap_unlock_hwlock_irqrestore(void *__map) { struct regmap *map = __map; hwspin_unlock_irqrestore(map->hwlock, &map->spinlock_flags); } static void regmap_lock_unlock_none(void *__map) { } static void regmap_lock_mutex(void *__map) { struct regmap *map = __map; mutex_lock(&map->mutex); } static void regmap_unlock_mutex(void *__map) { struct regmap *map = __map; mutex_unlock(&map->mutex); } static void regmap_lock_spinlock(void *__map) __acquires(&map->spinlock) { struct regmap *map = __map; unsigned long flags; spin_lock_irqsave(&map->spinlock, flags); map->spinlock_flags = flags; } static void regmap_unlock_spinlock(void *__map) __releases(&map->spinlock) { struct regmap *map = __map; spin_unlock_irqrestore(&map->spinlock, map->spinlock_flags); } static void regmap_lock_raw_spinlock(void *__map) __acquires(&map->raw_spinlock) { struct regmap *map = __map; unsigned long flags; raw_spin_lock_irqsave(&map->raw_spinlock, flags); map->raw_spinlock_flags = flags; } static void regmap_unlock_raw_spinlock(void *__map) __releases(&map->raw_spinlock) { struct regmap *map = __map; raw_spin_unlock_irqrestore(&map->raw_spinlock, map->raw_spinlock_flags); } static void dev_get_regmap_release(struct device *dev, void *res) { /* * We don't actually have anything to do here; the goal here * is not to manage the regmap but to provide a simple way to * get the regmap back given a struct device. 
*/ } static bool _regmap_range_add(struct regmap *map, struct regmap_range_node *data) { struct rb_root *root = &map->range_tree; struct rb_node **new = &(root->rb_node), *parent = NULL; while (*new) { struct regmap_range_node *this = rb_entry(*new, struct regmap_range_node, node); parent = *new; if (data->range_max < this->range_min) new = &((*new)->rb_left); else if (data->range_min > this->range_max) new = &((*new)->rb_right); else return false; } rb_link_node(&data->node, parent, new); rb_insert_color(&data->node, root); return true; } static struct regmap_range_node *_regmap_range_lookup(struct regmap *map, unsigned int reg) { struct rb_node *node = map->range_tree.rb_node; while (node) { struct regmap_range_node *this = rb_entry(node, struct regmap_range_node, node); if (reg < this->range_min) node = node->rb_left; else if (reg > this->range_max) node = node->rb_right; else return this; } return NULL; } static void regmap_range_exit(struct regmap *map) { struct rb_node *next; struct regmap_range_node *range_node; next = rb_first(&map->range_tree); while (next) { range_node = rb_entry(next, struct regmap_range_node, node); next = rb_next(&range_node->node); rb_erase(&range_node->node, &map->range_tree); kfree(range_node); } kfree(map->selector_work_buf); } static int regmap_set_name(struct regmap *map, const struct regmap_config *config) { if (config->name) { const char *name = kstrdup_const(config->name, GFP_KERNEL); if (!name) return -ENOMEM; kfree_const(map->name); map->name = name; } return 0; } int regmap_attach_dev(struct device *dev, struct regmap *map, const struct regmap_config *config) { struct regmap **m; int ret; map->dev = dev; ret = regmap_set_name(map, config); if (ret) return ret; regmap_debugfs_exit(map); regmap_debugfs_init(map); /* Add a devres resource for dev_get_regmap() */ m = devres_alloc(dev_get_regmap_release, sizeof(*m), GFP_KERNEL); if (!m) { regmap_debugfs_exit(map); return -ENOMEM; } *m = map; devres_add(dev, m); return 0; } EXPORT_SYMBOL_GPL(regmap_attach_dev); static int dev_get_regmap_match(struct device *dev, void *res, void *data); static int regmap_detach_dev(struct device *dev, struct regmap *map) { if (!dev) return 0; return devres_release(dev, dev_get_regmap_release, dev_get_regmap_match, (void *)map->name); } static enum regmap_endian regmap_get_reg_endian(const struct regmap_bus *bus, const struct regmap_config *config) { enum regmap_endian endian; /* Retrieve the endianness specification from the regmap config */ endian = config->reg_format_endian; /* If the regmap config specified a non-default value, use that */ if (endian != REGMAP_ENDIAN_DEFAULT) return endian; /* Retrieve the endianness specification from the bus config */ if (bus && bus->reg_format_endian_default) endian = bus->reg_format_endian_default; /* If the bus specified a non-default value, use that */ if (endian != REGMAP_ENDIAN_DEFAULT) return endian; /* Use this if no other value was found */ return REGMAP_ENDIAN_BIG; } enum regmap_endian regmap_get_val_endian(struct device *dev, const struct regmap_bus *bus, const struct regmap_config *config) { struct fwnode_handle *fwnode = dev ? 
dev_fwnode(dev) : NULL; enum regmap_endian endian; /* Retrieve the endianness specification from the regmap config */ endian = config->val_format_endian; /* If the regmap config specified a non-default value, use that */ if (endian != REGMAP_ENDIAN_DEFAULT) return endian; /* If the firmware node exist try to get endianness from it */ if (fwnode_property_read_bool(fwnode, "big-endian")) endian = REGMAP_ENDIAN_BIG; else if (fwnode_property_read_bool(fwnode, "little-endian")) endian = REGMAP_ENDIAN_LITTLE; else if (fwnode_property_read_bool(fwnode, "native-endian")) endian = REGMAP_ENDIAN_NATIVE; /* If the endianness was specified in fwnode, use that */ if (endian != REGMAP_ENDIAN_DEFAULT) return endian; /* Retrieve the endianness specification from the bus config */ if (bus && bus->val_format_endian_default) endian = bus->val_format_endian_default; /* If the bus specified a non-default value, use that */ if (endian != REGMAP_ENDIAN_DEFAULT) return endian; /* Use this if no other value was found */ return REGMAP_ENDIAN_BIG; } EXPORT_SYMBOL_GPL(regmap_get_val_endian); struct regmap *__regmap_init(struct device *dev, const struct regmap_bus *bus, void *bus_context, const struct regmap_config *config, struct lock_class_key *lock_key, const char *lock_name) { struct regmap *map; int ret = -EINVAL; enum regmap_endian reg_endian, val_endian; int i, j; if (!config) goto err; map = kzalloc(sizeof(*map), GFP_KERNEL); if (map == NULL) { ret = -ENOMEM; goto err; } ret = regmap_set_name(map, config); if (ret) goto err_map; ret = -EINVAL; /* Later error paths rely on this */ if (config->disable_locking) { map->lock = map->unlock = regmap_lock_unlock_none; map->can_sleep = config->can_sleep; regmap_debugfs_disable(map); } else if (config->lock && config->unlock) { map->lock = config->lock; map->unlock = config->unlock; map->lock_arg = config->lock_arg; map->can_sleep = config->can_sleep; } else if (config->use_hwlock) { map->hwlock = hwspin_lock_request_specific(config->hwlock_id); if (!map->hwlock) { ret = -ENXIO; goto err_name; } switch (config->hwlock_mode) { case HWLOCK_IRQSTATE: map->lock = regmap_lock_hwlock_irqsave; map->unlock = regmap_unlock_hwlock_irqrestore; break; case HWLOCK_IRQ: map->lock = regmap_lock_hwlock_irq; map->unlock = regmap_unlock_hwlock_irq; break; default: map->lock = regmap_lock_hwlock; map->unlock = regmap_unlock_hwlock; break; } map->lock_arg = map; } else { if ((bus && bus->fast_io) || config->fast_io) { if (config->use_raw_spinlock) { raw_spin_lock_init(&map->raw_spinlock); map->lock = regmap_lock_raw_spinlock; map->unlock = regmap_unlock_raw_spinlock; lockdep_set_class_and_name(&map->raw_spinlock, lock_key, lock_name); } else { spin_lock_init(&map->spinlock); map->lock = regmap_lock_spinlock; map->unlock = regmap_unlock_spinlock; lockdep_set_class_and_name(&map->spinlock, lock_key, lock_name); } } else { mutex_init(&map->mutex); map->lock = regmap_lock_mutex; map->unlock = regmap_unlock_mutex; map->can_sleep = true; lockdep_set_class_and_name(&map->mutex, lock_key, lock_name); } map->lock_arg = map; map->lock_key = lock_key; } /* * When we write in fast-paths with regmap_bulk_write() don't allocate * scratch buffers with sleeping allocations. 
*/ if ((bus && bus->fast_io) || config->fast_io) map->alloc_flags = GFP_ATOMIC; else map->alloc_flags = GFP_KERNEL; map->reg_base = config->reg_base; map->reg_shift = config->pad_bits % 8; map->format.pad_bytes = config->pad_bits / 8; map->format.reg_shift = config->reg_shift; map->format.reg_bytes = BITS_TO_BYTES(config->reg_bits); map->format.val_bytes = BITS_TO_BYTES(config->val_bits); map->format.buf_size = BITS_TO_BYTES(config->reg_bits + config->val_bits + config->pad_bits); if (config->reg_stride) map->reg_stride = config->reg_stride; else map->reg_stride = 1; if (is_power_of_2(map->reg_stride)) map->reg_stride_order = ilog2(map->reg_stride); else map->reg_stride_order = -1; map->use_single_read = config->use_single_read || !(config->read || (bus && bus->read)); map->use_single_write = config->use_single_write || !(config->write || (bus && bus->write)); map->can_multi_write = config->can_multi_write && (config->write || (bus && bus->write)); if (bus) { map->max_raw_read = bus->max_raw_read; map->max_raw_write = bus->max_raw_write; } else if (config->max_raw_read && config->max_raw_write) { map->max_raw_read = config->max_raw_read; map->max_raw_write = config->max_raw_write; } map->dev = dev; map->bus = bus; map->bus_context = bus_context; map->max_register = config->max_register; map->max_register_is_set = map->max_register ?: config->max_register_is_0; map->wr_table = config->wr_table; map->rd_table = config->rd_table; map->volatile_table = config->volatile_table; map->precious_table = config->precious_table; map->wr_noinc_table = config->wr_noinc_table; map->rd_noinc_table = config->rd_noinc_table; map->writeable_reg = config->writeable_reg; map->readable_reg = config->readable_reg; map->volatile_reg = config->volatile_reg; map->precious_reg = config->precious_reg; map->writeable_noinc_reg = config->writeable_noinc_reg; map->readable_noinc_reg = config->readable_noinc_reg; map->cache_type = config->cache_type; spin_lock_init(&map->async_lock); INIT_LIST_HEAD(&map->async_list); INIT_LIST_HEAD(&map->async_free); init_waitqueue_head(&map->async_waitq); if (config->read_flag_mask || config->write_flag_mask || config->zero_flag_mask) { map->read_flag_mask = config->read_flag_mask; map->write_flag_mask = config->write_flag_mask; } else if (bus) { map->read_flag_mask = bus->read_flag_mask; } if (config && config->read && config->write) { map->reg_read = _regmap_bus_read; if (config->reg_update_bits) map->reg_update_bits = config->reg_update_bits; /* Bulk read/write */ map->read = config->read; map->write = config->write; reg_endian = REGMAP_ENDIAN_NATIVE; val_endian = REGMAP_ENDIAN_NATIVE; } else if (!bus) { map->reg_read = config->reg_read; map->reg_write = config->reg_write; map->reg_update_bits = config->reg_update_bits; map->defer_caching = false; goto skip_format_initialization; } else if (!bus->read || !bus->write) { map->reg_read = _regmap_bus_reg_read; map->reg_write = _regmap_bus_reg_write; map->reg_update_bits = bus->reg_update_bits; map->defer_caching = false; goto skip_format_initialization; } else { map->reg_read = _regmap_bus_read; map->reg_update_bits = bus->reg_update_bits; /* Bulk read/write */ map->read = bus->read; map->write = bus->write; reg_endian = regmap_get_reg_endian(bus, config); val_endian = regmap_get_val_endian(dev, bus, config); } switch (config->reg_bits + map->reg_shift) { case 2: switch (config->val_bits) { case 6: map->format.format_write = regmap_format_2_6_write; break; default: goto err_hwlock; } break; case 4: switch (config->val_bits) { case 12: 
map->format.format_write = regmap_format_4_12_write; break; default: goto err_hwlock; } break; case 7: switch (config->val_bits) { case 9: map->format.format_write = regmap_format_7_9_write; break; case 17: map->format.format_write = regmap_format_7_17_write; break; default: goto err_hwlock; } break; case 10: switch (config->val_bits) { case 14: map->format.format_write = regmap_format_10_14_write; break; default: goto err_hwlock; } break; case 12: switch (config->val_bits) { case 20: map->format.format_write = regmap_format_12_20_write; break; default: goto err_hwlock; } break; case 8: map->format.format_reg = regmap_format_8; break; case 16: switch (reg_endian) { case REGMAP_ENDIAN_BIG: map->format.format_reg = regmap_format_16_be; break; case REGMAP_ENDIAN_LITTLE: map->format.format_reg = regmap_format_16_le; break; case REGMAP_ENDIAN_NATIVE: map->format.format_reg = regmap_format_16_native; break; default: goto err_hwlock; } break; case 24: switch (reg_endian) { case REGMAP_ENDIAN_BIG: map->format.format_reg = regmap_format_24_be; break; default: goto err_hwlock; } break; case 32: switch (reg_endian) { case REGMAP_ENDIAN_BIG: map->format.format_reg = regmap_format_32_be; break; case REGMAP_ENDIAN_LITTLE: map->format.format_reg = regmap_format_32_le; break; case REGMAP_ENDIAN_NATIVE: map->format.format_reg = regmap_format_32_native; break; default: goto err_hwlock; } break; default: goto err_hwlock; } if (val_endian == REGMAP_ENDIAN_NATIVE) map->format.parse_inplace = regmap_parse_inplace_noop; switch (config->val_bits) { case 8: map->format.format_val = regmap_format_8; map->format.parse_val = regmap_parse_8; map->format.parse_inplace = regmap_parse_inplace_noop; break; case 16: switch (val_endian) { case REGMAP_ENDIAN_BIG: map->format.format_val = regmap_format_16_be; map->format.parse_val = regmap_parse_16_be; map->format.parse_inplace = regmap_parse_16_be_inplace; break; case REGMAP_ENDIAN_LITTLE: map->format.format_val = regmap_format_16_le; map->format.parse_val = regmap_parse_16_le; map->format.parse_inplace = regmap_parse_16_le_inplace; break; case REGMAP_ENDIAN_NATIVE: map->format.format_val = regmap_format_16_native; map->format.parse_val = regmap_parse_16_native; break; default: goto err_hwlock; } break; case 24: switch (val_endian) { case REGMAP_ENDIAN_BIG: map->format.format_val = regmap_format_24_be; map->format.parse_val = regmap_parse_24_be; break; default: goto err_hwlock; } break; case 32: switch (val_endian) { case REGMAP_ENDIAN_BIG: map->format.format_val = regmap_format_32_be; map->format.parse_val = regmap_parse_32_be; map->format.parse_inplace = regmap_parse_32_be_inplace; break; case REGMAP_ENDIAN_LITTLE: map->format.format_val = regmap_format_32_le; map->format.parse_val = regmap_parse_32_le; map->format.parse_inplace = regmap_parse_32_le_inplace; break; case REGMAP_ENDIAN_NATIVE: map->format.format_val = regmap_format_32_native; map->format.parse_val = regmap_parse_32_native; break; default: goto err_hwlock; } break; } if (map->format.format_write) { if ((reg_endian != REGMAP_ENDIAN_BIG) || (val_endian != REGMAP_ENDIAN_BIG)) goto err_hwlock; map->use_single_write = true; } if (!map->format.format_write && !(map->format.format_reg && map->format.format_val)) goto err_hwlock; map->work_buf = kzalloc(map->format.buf_size, GFP_KERNEL); if (map->work_buf == NULL) { ret = -ENOMEM; goto err_hwlock; } if (map->format.format_write) { map->defer_caching = false; map->reg_write = _regmap_bus_formatted_write; } else if (map->format.format_val) { map->defer_caching = true; 
map->reg_write = _regmap_bus_raw_write; } skip_format_initialization: map->range_tree = RB_ROOT; for (i = 0; i < config->num_ranges; i++) { const struct regmap_range_cfg *range_cfg = &config->ranges[i]; struct regmap_range_node *new; /* Sanity check */ if (range_cfg->range_max < range_cfg->range_min) { dev_err(map->dev, "Invalid range %d: %u < %u\n", i, range_cfg->range_max, range_cfg->range_min); goto err_range; } if (range_cfg->range_max > map->max_register) { dev_err(map->dev, "Invalid range %d: %u > %u\n", i, range_cfg->range_max, map->max_register); goto err_range; } if (range_cfg->selector_reg > map->max_register) { dev_err(map->dev, "Invalid range %d: selector out of map\n", i); goto err_range; } if (range_cfg->window_len == 0) { dev_err(map->dev, "Invalid range %d: window_len 0\n", i); goto err_range; } /* Make sure, that this register range has no selector or data window within its boundary */ for (j = 0; j < config->num_ranges; j++) { unsigned int sel_reg = config->ranges[j].selector_reg; unsigned int win_min = config->ranges[j].window_start; unsigned int win_max = win_min + config->ranges[j].window_len - 1; /* Allow data window inside its own virtual range */ if (j == i) continue; if (range_cfg->range_min <= sel_reg && sel_reg <= range_cfg->range_max) { dev_err(map->dev, "Range %d: selector for %d in window\n", i, j); goto err_range; } if (!(win_max < range_cfg->range_min || win_min > range_cfg->range_max)) { dev_err(map->dev, "Range %d: window for %d in window\n", i, j); goto err_range; } } new = kzalloc(sizeof(*new), GFP_KERNEL); if (new == NULL) { ret = -ENOMEM; goto err_range; } new->map = map; new->name = range_cfg->name; new->range_min = range_cfg->range_min; new->range_max = range_cfg->range_max; new->selector_reg = range_cfg->selector_reg; new->selector_mask = range_cfg->selector_mask; new->selector_shift = range_cfg->selector_shift; new->window_start = range_cfg->window_start; new->window_len = range_cfg->window_len; if (!_regmap_range_add(map, new)) { dev_err(map->dev, "Failed to add range %d\n", i); kfree(new); goto err_range; } if (map->selector_work_buf == NULL) { map->selector_work_buf = kzalloc(map->format.buf_size, GFP_KERNEL); if (map->selector_work_buf == NULL) { ret = -ENOMEM; goto err_range; } } } ret = regcache_init(map, config); if (ret != 0) goto err_range; if (dev) { ret = regmap_attach_dev(dev, map, config); if (ret != 0) goto err_regcache; } else { regmap_debugfs_init(map); } return map; err_regcache: regcache_exit(map); err_range: regmap_range_exit(map); kfree(map->work_buf); err_hwlock: if (map->hwlock) hwspin_lock_free(map->hwlock); err_name: kfree_const(map->name); err_map: kfree(map); err: if (bus && bus->free_on_exit) kfree(bus); return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(__regmap_init); static void devm_regmap_release(struct device *dev, void *res) { regmap_exit(*(struct regmap **)res); } struct regmap *__devm_regmap_init(struct device *dev, const struct regmap_bus *bus, void *bus_context, const struct regmap_config *config, struct lock_class_key *lock_key, const char *lock_name) { struct regmap **ptr, *regmap; ptr = devres_alloc(devm_regmap_release, sizeof(*ptr), GFP_KERNEL); if (!ptr) return ERR_PTR(-ENOMEM); regmap = __regmap_init(dev, bus, bus_context, config, lock_key, lock_name); if (!IS_ERR(regmap)) { *ptr = regmap; devres_add(dev, ptr); } else { devres_free(ptr); } return regmap; } EXPORT_SYMBOL_GPL(__devm_regmap_init); static void regmap_field_init(struct regmap_field *rm_field, struct regmap *regmap, struct reg_field reg_field) { 
rm_field->regmap = regmap; rm_field->reg = reg_field.reg; rm_field->shift = reg_field.lsb; rm_field->mask = GENMASK(reg_field.msb, reg_field.lsb); WARN_ONCE(rm_field->mask == 0, "invalid empty mask defined\n"); rm_field->id_size = reg_field.id_size; rm_field->id_offset = reg_field.id_offset; } /** * devm_regmap_field_alloc() - Allocate and initialise a register field. * * @dev: Device that will be interacted with * @regmap: regmap bank in which this register field is located. * @reg_field: Register field with in the bank. * * The return value will be an ERR_PTR() on error or a valid pointer * to a struct regmap_field. The regmap_field will be automatically freed * by the device management code. */ struct regmap_field *devm_regmap_field_alloc(struct device *dev, struct regmap *regmap, struct reg_field reg_field) { struct regmap_field *rm_field = devm_kzalloc(dev, sizeof(*rm_field), GFP_KERNEL); if (!rm_field) return ERR_PTR(-ENOMEM); regmap_field_init(rm_field, regmap, reg_field); return rm_field; } EXPORT_SYMBOL_GPL(devm_regmap_field_alloc); /** * regmap_field_bulk_alloc() - Allocate and initialise a bulk register field. * * @regmap: regmap bank in which this register field is located. * @rm_field: regmap register fields within the bank. * @reg_field: Register fields within the bank. * @num_fields: Number of register fields. * * The return value will be an -ENOMEM on error or zero for success. * Newly allocated regmap_fields should be freed by calling * regmap_field_bulk_free() */ int regmap_field_bulk_alloc(struct regmap *regmap, struct regmap_field **rm_field, const struct reg_field *reg_field, int num_fields) { struct regmap_field *rf; int i; rf = kcalloc(num_fields, sizeof(*rf), GFP_KERNEL); if (!rf) return -ENOMEM; for (i = 0; i < num_fields; i++) { regmap_field_init(&rf[i], regmap, reg_field[i]); rm_field[i] = &rf[i]; } return 0; } EXPORT_SYMBOL_GPL(regmap_field_bulk_alloc); /** * devm_regmap_field_bulk_alloc() - Allocate and initialise a bulk register * fields. * * @dev: Device that will be interacted with * @regmap: regmap bank in which this register field is located. * @rm_field: regmap register fields within the bank. * @reg_field: Register fields within the bank. * @num_fields: Number of register fields. * * The return value will be an -ENOMEM on error or zero for success. * Newly allocated regmap_fields will be automatically freed by the * device management code. */ int devm_regmap_field_bulk_alloc(struct device *dev, struct regmap *regmap, struct regmap_field **rm_field, const struct reg_field *reg_field, int num_fields) { struct regmap_field *rf; int i; rf = devm_kcalloc(dev, num_fields, sizeof(*rf), GFP_KERNEL); if (!rf) return -ENOMEM; for (i = 0; i < num_fields; i++) { regmap_field_init(&rf[i], regmap, reg_field[i]); rm_field[i] = &rf[i]; } return 0; } EXPORT_SYMBOL_GPL(devm_regmap_field_bulk_alloc); /** * regmap_field_bulk_free() - Free register field allocated using * regmap_field_bulk_alloc. * * @field: regmap fields which should be freed. */ void regmap_field_bulk_free(struct regmap_field *field) { kfree(field); } EXPORT_SYMBOL_GPL(regmap_field_bulk_free); /** * devm_regmap_field_bulk_free() - Free a bulk register field allocated using * devm_regmap_field_bulk_alloc. * * @dev: Device that will be interacted with * @field: regmap field which should be freed. * * Free register field allocated using devm_regmap_field_bulk_alloc(). Usually * drivers need not call this function, as the memory allocated via devm * will be freed as per device-driver life-cycle. 
*/ void devm_regmap_field_bulk_free(struct device *dev, struct regmap_field *field) { devm_kfree(dev, field); } EXPORT_SYMBOL_GPL(devm_regmap_field_bulk_free); /** * devm_regmap_field_free() - Free a register field allocated using * devm_regmap_field_alloc. * * @dev: Device that will be interacted with * @field: regmap field which should be freed. * * Free register field allocated using devm_regmap_field_alloc(). Usually * drivers need not call this function, as the memory allocated via devm * will be freed as per device-driver life-cyle. */ void devm_regmap_field_free(struct device *dev, struct regmap_field *field) { devm_kfree(dev, field); } EXPORT_SYMBOL_GPL(devm_regmap_field_free); /** * regmap_field_alloc() - Allocate and initialise a register field. * * @regmap: regmap bank in which this register field is located. * @reg_field: Register field with in the bank. * * The return value will be an ERR_PTR() on error or a valid pointer * to a struct regmap_field. The regmap_field should be freed by the * user once its finished working with it using regmap_field_free(). */ struct regmap_field *regmap_field_alloc(struct regmap *regmap, struct reg_field reg_field) { struct regmap_field *rm_field = kzalloc(sizeof(*rm_field), GFP_KERNEL); if (!rm_field) return ERR_PTR(-ENOMEM); regmap_field_init(rm_field, regmap, reg_field); return rm_field; } EXPORT_SYMBOL_GPL(regmap_field_alloc); /** * regmap_field_free() - Free register field allocated using * regmap_field_alloc. * * @field: regmap field which should be freed. */ void regmap_field_free(struct regmap_field *field) { kfree(field); } EXPORT_SYMBOL_GPL(regmap_field_free); /** * regmap_reinit_cache() - Reinitialise the current register cache * * @map: Register map to operate on. * @config: New configuration. Only the cache data will be used. * * Discard any existing register cache for the map and initialize a * new cache. This can be used to restore the cache to defaults or to * update the cache configuration to reflect runtime discovery of the * hardware. * * No explicit locking is done here, the user needs to ensure that * this function will not race with other calls to regmap. */ int regmap_reinit_cache(struct regmap *map, const struct regmap_config *config) { int ret; regcache_exit(map); regmap_debugfs_exit(map); map->max_register = config->max_register; map->max_register_is_set = map->max_register ?: config->max_register_is_0; map->writeable_reg = config->writeable_reg; map->readable_reg = config->readable_reg; map->volatile_reg = config->volatile_reg; map->precious_reg = config->precious_reg; map->writeable_noinc_reg = config->writeable_noinc_reg; map->readable_noinc_reg = config->readable_noinc_reg; map->cache_type = config->cache_type; ret = regmap_set_name(map, config); if (ret) return ret; regmap_debugfs_init(map); map->cache_bypass = false; map->cache_only = false; return regcache_init(map, config); } EXPORT_SYMBOL_GPL(regmap_reinit_cache); /** * regmap_exit() - Free a previously allocated register map * * @map: Register map to operate on. 
*/ void regmap_exit(struct regmap *map) { struct regmap_async *async; regmap_detach_dev(map->dev, map); regcache_exit(map); regmap_debugfs_exit(map); regmap_range_exit(map); if (map->bus && map->bus->free_context) map->bus->free_context(map->bus_context); kfree(map->work_buf); while (!list_empty(&map->async_free)) { async = list_first_entry_or_null(&map->async_free, struct regmap_async, list); list_del(&async->list); kfree(async->work_buf); kfree(async); } if (map->hwlock) hwspin_lock_free(map->hwlock); if (map->lock == regmap_lock_mutex) mutex_destroy(&map->mutex); kfree_const(map->name); kfree(map->patch); if (map->bus && map->bus->free_on_exit) kfree(map->bus); kfree(map); } EXPORT_SYMBOL_GPL(regmap_exit); static int dev_get_regmap_match(struct device *dev, void *res, void *data) { struct regmap **r = res; if (!r || !*r) { WARN_ON(!r || !*r); return 0; } /* If the user didn't specify a name match any */ if (data) return (*r)->name && !strcmp((*r)->name, data); else return 1; } /** * dev_get_regmap() - Obtain the regmap (if any) for a device * * @dev: Device to retrieve the map for * @name: Optional name for the register map, usually NULL. * * Returns the regmap for the device if one is present, or NULL. If * name is specified then it must match the name specified when * registering the device, if it is NULL then the first regmap found * will be used. Devices with multiple register maps are very rare, * generic code should normally not need to specify a name. */ struct regmap *dev_get_regmap(struct device *dev, const char *name) { struct regmap **r = devres_find(dev, dev_get_regmap_release, dev_get_regmap_match, (void *)name); if (!r) return NULL; return *r; } EXPORT_SYMBOL_GPL(dev_get_regmap); /** * regmap_get_device() - Obtain the device from a regmap * * @map: Register map to operate on. * * Returns the underlying device that the regmap has been created for. */ struct device *regmap_get_device(struct regmap *map) { return map->dev; } EXPORT_SYMBOL_GPL(regmap_get_device); static int _regmap_select_page(struct regmap *map, unsigned int *reg, struct regmap_range_node *range, unsigned int val_num) { void *orig_work_buf; unsigned int win_offset; unsigned int win_page; bool page_chg; int ret; win_offset = (*reg - range->range_min) % range->window_len; win_page = (*reg - range->range_min) / range->window_len; if (val_num > 1) { /* Bulk write shouldn't cross range boundary */ if (*reg + val_num - 1 > range->range_max) return -EINVAL; /* ... or single page boundary */ if (val_num > range->window_len - win_offset) return -EINVAL; } /* It is possible to have selector register inside data window. In that case, selector register is located on every page and it needs no page switching, when accessed alone. 
 */
	if (val_num > 1 ||
	    range->window_start + win_offset != range->selector_reg) {
		/* Use separate work_buf during page switching */
		orig_work_buf = map->work_buf;
		map->work_buf = map->selector_work_buf;

		ret = _regmap_update_bits(map, range->selector_reg,
					  range->selector_mask,
					  win_page << range->selector_shift,
					  &page_chg, false);

		map->work_buf = orig_work_buf;

		if (ret != 0)
			return ret;
	}

	*reg = range->window_start + win_offset;

	return 0;
}

static void regmap_set_work_buf_flag_mask(struct regmap *map, int max_bytes,
					  unsigned long mask)
{
	u8 *buf;
	int i;

	if (!mask || !map->work_buf)
		return;

	buf = map->work_buf;

	for (i = 0; i < max_bytes; i++)
		buf[i] |= (mask >> (8 * i)) & 0xff;
}

static unsigned int regmap_reg_addr(struct regmap *map, unsigned int reg)
{
	reg += map->reg_base;

	if (map->format.reg_shift > 0)
		reg >>= map->format.reg_shift;
	else if (map->format.reg_shift < 0)
		reg <<= -(map->format.reg_shift);

	return reg;
}

static int _regmap_raw_write_impl(struct regmap *map, unsigned int reg,
				  const void *val, size_t val_len, bool noinc)
{
	struct regmap_range_node *range;
	unsigned long flags;
	void *work_val = map->work_buf + map->format.reg_bytes +
		map->format.pad_bytes;
	void *buf;
	int ret = -ENOTSUPP;
	size_t len;
	int i;

	/* Check for unwritable or noinc registers in range
	 * before we start
	 */
	if (!regmap_writeable_noinc(map, reg)) {
		for (i = 0; i < val_len / map->format.val_bytes; i++) {
			unsigned int element =
				reg + regmap_get_offset(map, i);
			if (!regmap_writeable(map, element) ||
			    regmap_writeable_noinc(map, element))
				return -EINVAL;
		}
	}

	if (!map->cache_bypass && map->format.parse_val) {
		unsigned int ival, offset;
		int val_bytes = map->format.val_bytes;

		/* Cache the last written value for noinc writes */
		i = noinc ? val_len - val_bytes : 0;
		for (; i < val_len; i += val_bytes) {
			ival = map->format.parse_val(val + i);
			offset = noinc ? 0 : regmap_get_offset(map, i / val_bytes);
			ret = regcache_write(map, reg + offset, ival);
			if (ret) {
				dev_err(map->dev,
					"Error in caching of register: %x ret: %d\n",
					reg + offset, ret);
				return ret;
			}
		}
		if (map->cache_only) {
			map->cache_dirty = true;
			return 0;
		}
	}

	range = _regmap_range_lookup(map, reg);
	if (range) {
		int val_num = val_len / map->format.val_bytes;
		int win_offset = (reg - range->range_min) % range->window_len;
		int win_residue = range->window_len - win_offset;

		/* If the write goes beyond the end of the window split it */
		while (val_num > win_residue) {
			dev_dbg(map->dev, "Writing window %d/%zu\n",
				win_residue, val_len / map->format.val_bytes);
			ret = _regmap_raw_write_impl(map, reg, val,
						     win_residue *
						     map->format.val_bytes, noinc);
			if (ret != 0)
				return ret;

			reg += win_residue;
			val_num -= win_residue;
			val += win_residue * map->format.val_bytes;
			val_len -= win_residue * map->format.val_bytes;

			win_offset = (reg - range->range_min) %
				range->window_len;
			win_residue = range->window_len - win_offset;
		}

		ret = _regmap_select_page(map, &reg, range, noinc ? 1 : val_num);
		if (ret != 0)
			return ret;
	}

	reg = regmap_reg_addr(map, reg);
	map->format.format_reg(map->work_buf, reg, map->reg_shift);
	regmap_set_work_buf_flag_mask(map, map->format.reg_bytes,
				      map->write_flag_mask);

	/*
	 * Essentially all I/O mechanisms will be faster with a single
	 * buffer to write. Since register syncs often generate raw
	 * writes of single registers optimise that case.
*/ if (val != work_val && val_len == map->format.val_bytes) { memcpy(work_val, val, map->format.val_bytes); val = work_val; } if (map->async && map->bus && map->bus->async_write) { struct regmap_async *async; trace_regmap_async_write_start(map, reg, val_len); spin_lock_irqsave(&map->async_lock, flags); async = list_first_entry_or_null(&map->async_free, struct regmap_async, list); if (async) list_del(&async->list); spin_unlock_irqrestore(&map->async_lock, flags); if (!async) { async = map->bus->async_alloc(); if (!async) return -ENOMEM; async->work_buf = kzalloc(map->format.buf_size, GFP_KERNEL | GFP_DMA); if (!async->work_buf) { kfree(async); return -ENOMEM; } } async->map = map; /* If the caller supplied the value we can use it safely. */ memcpy(async->work_buf, map->work_buf, map->format.pad_bytes + map->format.reg_bytes + map->format.val_bytes); spin_lock_irqsave(&map->async_lock, flags); list_add_tail(&async->list, &map->async_list); spin_unlock_irqrestore(&map->async_lock, flags); if (val != work_val) ret = map->bus->async_write(map->bus_context, async->work_buf, map->format.reg_bytes + map->format.pad_bytes, val, val_len, async); else ret = map->bus->async_write(map->bus_context, async->work_buf, map->format.reg_bytes + map->format.pad_bytes + val_len, NULL, 0, async); if (ret != 0) { dev_err(map->dev, "Failed to schedule write: %d\n", ret); spin_lock_irqsave(&map->async_lock, flags); list_move(&async->list, &map->async_free); spin_unlock_irqrestore(&map->async_lock, flags); } return ret; } trace_regmap_hw_write_start(map, reg, val_len / map->format.val_bytes); /* If we're doing a single register write we can probably just * send the work_buf directly, otherwise try to do a gather * write. */ if (val == work_val) ret = map->write(map->bus_context, map->work_buf, map->format.reg_bytes + map->format.pad_bytes + val_len); else if (map->bus && map->bus->gather_write) ret = map->bus->gather_write(map->bus_context, map->work_buf, map->format.reg_bytes + map->format.pad_bytes, val, val_len); else ret = -ENOTSUPP; /* If that didn't work fall back on linearising by hand. */ if (ret == -ENOTSUPP) { len = map->format.reg_bytes + map->format.pad_bytes + val_len; buf = kzalloc(len, GFP_KERNEL); if (!buf) return -ENOMEM; memcpy(buf, map->work_buf, map->format.reg_bytes); memcpy(buf + map->format.reg_bytes + map->format.pad_bytes, val, val_len); ret = map->write(map->bus_context, buf, len); kfree(buf); } else if (ret != 0 && !map->cache_bypass && map->format.parse_val) { /* regcache_drop_region() takes lock that we already have, * thus call map->cache_ops->drop() directly */ if (map->cache_ops && map->cache_ops->drop) map->cache_ops->drop(map, reg, reg + 1); } trace_regmap_hw_write_done(map, reg, val_len / map->format.val_bytes); return ret; } /** * regmap_can_raw_write - Test if regmap_raw_write() is supported * * @map: Map to check. */ bool regmap_can_raw_write(struct regmap *map) { return map->write && map->format.format_val && map->format.format_reg; } EXPORT_SYMBOL_GPL(regmap_can_raw_write); /** * regmap_get_raw_read_max - Get the maximum size we can read * * @map: Map to check. */ size_t regmap_get_raw_read_max(struct regmap *map) { return map->max_raw_read; } EXPORT_SYMBOL_GPL(regmap_get_raw_read_max); /** * regmap_get_raw_write_max - Get the maximum size we can read * * @map: Map to check. 
 */
size_t regmap_get_raw_write_max(struct regmap *map)
{
	return map->max_raw_write;
}
EXPORT_SYMBOL_GPL(regmap_get_raw_write_max);

static int _regmap_bus_formatted_write(void *context, unsigned int reg,
				       unsigned int val)
{
	int ret;
	struct regmap_range_node *range;
	struct regmap *map = context;

	WARN_ON(!map->format.format_write);

	range = _regmap_range_lookup(map, reg);
	if (range) {
		ret = _regmap_select_page(map, &reg, range, 1);
		if (ret != 0)
			return ret;
	}

	reg = regmap_reg_addr(map, reg);
	map->format.format_write(map, reg, val);

	trace_regmap_hw_write_start(map, reg, 1);

	ret = map->write(map->bus_context, map->work_buf, map->format.buf_size);

	trace_regmap_hw_write_done(map, reg, 1);

	return ret;
}

static int _regmap_bus_reg_write(void *context, unsigned int reg,
				 unsigned int val)
{
	struct regmap *map = context;
	struct regmap_range_node *range;
	int ret;

	range = _regmap_range_lookup(map, reg);
	if (range) {
		ret = _regmap_select_page(map, &reg, range, 1);
		if (ret != 0)
			return ret;
	}

	reg = regmap_reg_addr(map, reg);
	return map->bus->reg_write(map->bus_context, reg, val);
}

static int _regmap_bus_raw_write(void *context, unsigned int reg,
				 unsigned int val)
{
	struct regmap *map = context;

	WARN_ON(!map->format.format_val);

	map->format.format_val(map->work_buf + map->format.reg_bytes
			       + map->format.pad_bytes, val, 0);
	return _regmap_raw_write_impl(map, reg,
				      map->work_buf +
				      map->format.reg_bytes +
				      map->format.pad_bytes,
				      map->format.val_bytes,
				      false);
}

static inline void *_regmap_map_get_context(struct regmap *map)
{
	return (map->bus || (!map->bus && map->read)) ? map : map->bus_context;
}

int _regmap_write(struct regmap *map, unsigned int reg,
		  unsigned int val)
{
	int ret;
	void *context = _regmap_map_get_context(map);

	if (!regmap_writeable(map, reg))
		return -EIO;

	if (!map->cache_bypass && !map->defer_caching) {
		ret = regcache_write(map, reg, val);
		if (ret != 0)
			return ret;
		if (map->cache_only) {
			map->cache_dirty = true;
			return 0;
		}
	}

	ret = map->reg_write(context, reg, val);
	if (ret == 0) {
		if (regmap_should_log(map))
			dev_info(map->dev, "%x <= %x\n", reg, val);

		trace_regmap_reg_write(map, reg, val);
	}

	return ret;
}

/**
 * regmap_write() - Write a value to a single register
 *
 * @map: Register map to write to
 * @reg: Register to write to
 * @val: Value to be written
 *
 * A value of zero will be returned on success, a negative errno will
 * be returned in error cases.
 */
int regmap_write(struct regmap *map, unsigned int reg, unsigned int val)
{
	int ret;

	if (!IS_ALIGNED(reg, map->reg_stride))
		return -EINVAL;

	map->lock(map->lock_arg);

	ret = _regmap_write(map, reg, val);

	map->unlock(map->lock_arg);

	return ret;
}
EXPORT_SYMBOL_GPL(regmap_write);

/**
 * regmap_write_async() - Write a value to a single register asynchronously
 *
 * @map: Register map to write to
 * @reg: Register to write to
 * @val: Value to be written
 *
 * A value of zero will be returned on success, a negative errno will
 * be returned in error cases.
*/ int regmap_write_async(struct regmap *map, unsigned int reg, unsigned int val) { int ret; if (!IS_ALIGNED(reg, map->reg_stride)) return -EINVAL; map->lock(map->lock_arg); map->async = true; ret = _regmap_write(map, reg, val); map->async = false; map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_write_async); int _regmap_raw_write(struct regmap *map, unsigned int reg, const void *val, size_t val_len, bool noinc) { size_t val_bytes = map->format.val_bytes; size_t val_count = val_len / val_bytes; size_t chunk_count, chunk_bytes; size_t chunk_regs = val_count; int ret, i; if (!val_count) return -EINVAL; if (map->use_single_write) chunk_regs = 1; else if (map->max_raw_write && val_len > map->max_raw_write) chunk_regs = map->max_raw_write / val_bytes; chunk_count = val_count / chunk_regs; chunk_bytes = chunk_regs * val_bytes; /* Write as many bytes as possible with chunk_size */ for (i = 0; i < chunk_count; i++) { ret = _regmap_raw_write_impl(map, reg, val, chunk_bytes, noinc); if (ret) return ret; reg += regmap_get_offset(map, chunk_regs); val += chunk_bytes; val_len -= chunk_bytes; } /* Write remaining bytes */ if (val_len) ret = _regmap_raw_write_impl(map, reg, val, val_len, noinc); return ret; } /** * regmap_raw_write() - Write raw values to one or more registers * * @map: Register map to write to * @reg: Initial register to write to * @val: Block of data to be written, laid out for direct transmission to the * device * @val_len: Length of data pointed to by val. * * This function is intended to be used for things like firmware * download where a large block of data needs to be transferred to the * device. No formatting will be done on the data provided. * * A value of zero will be returned on success, a negative errno will * be returned in error cases. */ int regmap_raw_write(struct regmap *map, unsigned int reg, const void *val, size_t val_len) { int ret; if (!regmap_can_raw_write(map)) return -EINVAL; if (val_len % map->format.val_bytes) return -EINVAL; map->lock(map->lock_arg); ret = _regmap_raw_write(map, reg, val, val_len, false); map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_raw_write); static int regmap_noinc_readwrite(struct regmap *map, unsigned int reg, void *val, unsigned int val_len, bool write) { size_t val_bytes = map->format.val_bytes; size_t val_count = val_len / val_bytes; unsigned int lastval; u8 *u8p; u16 *u16p; u32 *u32p; int ret; int i; switch (val_bytes) { case 1: u8p = val; if (write) lastval = (unsigned int)u8p[val_count - 1]; break; case 2: u16p = val; if (write) lastval = (unsigned int)u16p[val_count - 1]; break; case 4: u32p = val; if (write) lastval = (unsigned int)u32p[val_count - 1]; break; default: return -EINVAL; } /* * Update the cache with the last value we write, the rest is just * gone down in the hardware FIFO. We can't cache FIFOs. This makes * sure a single read from the cache will work. */ if (write) { if (!map->cache_bypass && !map->defer_caching) { ret = regcache_write(map, reg, lastval); if (ret != 0) return ret; if (map->cache_only) { map->cache_dirty = true; return 0; } } ret = map->bus->reg_noinc_write(map->bus_context, reg, val, val_count); } else { ret = map->bus->reg_noinc_read(map->bus_context, reg, val, val_count); } if (!ret && regmap_should_log(map)) { dev_info(map->dev, "%x %s [", reg, write ? 
"<=" : "=>"); for (i = 0; i < val_count; i++) { switch (val_bytes) { case 1: pr_cont("%x", u8p[i]); break; case 2: pr_cont("%x", u16p[i]); break; case 4: pr_cont("%x", u32p[i]); break; default: break; } if (i == (val_count - 1)) pr_cont("]\n"); else pr_cont(","); } } return 0; } /** * regmap_noinc_write(): Write data to a register without incrementing the * register number * * @map: Register map to write to * @reg: Register to write to * @val: Pointer to data buffer * @val_len: Length of output buffer in bytes. * * The regmap API usually assumes that bulk bus write operations will write a * range of registers. Some devices have certain registers for which a write * operation can write to an internal FIFO. * * The target register must be volatile but registers after it can be * completely unrelated cacheable registers. * * This will attempt multiple writes as required to write val_len bytes. * * A value of zero will be returned on success, a negative errno will be * returned in error cases. */ int regmap_noinc_write(struct regmap *map, unsigned int reg, const void *val, size_t val_len) { size_t write_len; int ret; if (!map->write && !(map->bus && map->bus->reg_noinc_write)) return -EINVAL; if (val_len % map->format.val_bytes) return -EINVAL; if (!IS_ALIGNED(reg, map->reg_stride)) return -EINVAL; if (val_len == 0) return -EINVAL; map->lock(map->lock_arg); if (!regmap_volatile(map, reg) || !regmap_writeable_noinc(map, reg)) { ret = -EINVAL; goto out_unlock; } /* * Use the accelerated operation if we can. The val drops the const * typing in order to facilitate code reuse in regmap_noinc_readwrite(). */ if (map->bus->reg_noinc_write) { ret = regmap_noinc_readwrite(map, reg, (void *)val, val_len, true); goto out_unlock; } while (val_len) { if (map->max_raw_write && map->max_raw_write < val_len) write_len = map->max_raw_write; else write_len = val_len; ret = _regmap_raw_write(map, reg, val, write_len, true); if (ret) goto out_unlock; val = ((u8 *)val) + write_len; val_len -= write_len; } out_unlock: map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_noinc_write); /** * regmap_field_update_bits_base() - Perform a read/modify/write cycle a * register field. * * @field: Register field to write to * @mask: Bitmask to change * @val: Value to be written * @change: Boolean indicating if a write was done * @async: Boolean indicating asynchronously * @force: Boolean indicating use force update * * Perform a read/modify/write cycle on the register field with change, * async, force option. * * A value of zero will be returned on success, a negative errno will * be returned in error cases. */ int regmap_field_update_bits_base(struct regmap_field *field, unsigned int mask, unsigned int val, bool *change, bool async, bool force) { mask = (mask << field->shift) & field->mask; return regmap_update_bits_base(field->regmap, field->reg, mask, val << field->shift, change, async, force); } EXPORT_SYMBOL_GPL(regmap_field_update_bits_base); /** * regmap_field_test_bits() - Check if all specified bits are set in a * register field. * * @field: Register field to operate on * @bits: Bits to test * * Returns -1 if the underlying regmap_field_read() fails, 0 if at least one of the * tested bits is not set and 1 if all tested bits are set. 
*/ int regmap_field_test_bits(struct regmap_field *field, unsigned int bits) { unsigned int val, ret; ret = regmap_field_read(field, &val); if (ret) return ret; return (val & bits) == bits; } EXPORT_SYMBOL_GPL(regmap_field_test_bits); /** * regmap_fields_update_bits_base() - Perform a read/modify/write cycle a * register field with port ID * * @field: Register field to write to * @id: port ID * @mask: Bitmask to change * @val: Value to be written * @change: Boolean indicating if a write was done * @async: Boolean indicating asynchronously * @force: Boolean indicating use force update * * A value of zero will be returned on success, a negative errno will * be returned in error cases. */ int regmap_fields_update_bits_base(struct regmap_field *field, unsigned int id, unsigned int mask, unsigned int val, bool *change, bool async, bool force) { if (id >= field->id_size) return -EINVAL; mask = (mask << field->shift) & field->mask; return regmap_update_bits_base(field->regmap, field->reg + (field->id_offset * id), mask, val << field->shift, change, async, force); } EXPORT_SYMBOL_GPL(regmap_fields_update_bits_base); /** * regmap_bulk_write() - Write multiple registers to the device * * @map: Register map to write to * @reg: First register to be write from * @val: Block of data to be written, in native register size for device * @val_count: Number of registers to write * * This function is intended to be used for writing a large block of * data to the device either in single transfer or multiple transfer. * * A value of zero will be returned on success, a negative errno will * be returned in error cases. */ int regmap_bulk_write(struct regmap *map, unsigned int reg, const void *val, size_t val_count) { int ret = 0, i; size_t val_bytes = map->format.val_bytes; if (!IS_ALIGNED(reg, map->reg_stride)) return -EINVAL; /* * Some devices don't support bulk write, for them we have a series of * single write operations. */ if (!map->write || !map->format.parse_inplace) { map->lock(map->lock_arg); for (i = 0; i < val_count; i++) { unsigned int ival; switch (val_bytes) { case 1: ival = *(u8 *)(val + (i * val_bytes)); break; case 2: ival = *(u16 *)(val + (i * val_bytes)); break; case 4: ival = *(u32 *)(val + (i * val_bytes)); break; default: ret = -EINVAL; goto out; } ret = _regmap_write(map, reg + regmap_get_offset(map, i), ival); if (ret != 0) goto out; } out: map->unlock(map->lock_arg); } else { void *wval; wval = kmemdup_array(val, val_count, val_bytes, map->alloc_flags); if (!wval) return -ENOMEM; for (i = 0; i < val_count * val_bytes; i += val_bytes) map->format.parse_inplace(wval + i); ret = regmap_raw_write(map, reg, wval, val_bytes * val_count); kfree(wval); } if (!ret) trace_regmap_bulk_write(map, reg, val, val_bytes * val_count); return ret; } EXPORT_SYMBOL_GPL(regmap_bulk_write); /* * _regmap_raw_multi_reg_write() * * the (register,newvalue) pairs in regs have not been formatted, but * they are all in the same page and have been changed to being page * relative. The page register has been written if that was necessary. */ static int _regmap_raw_multi_reg_write(struct regmap *map, const struct reg_sequence *regs, size_t num_regs) { int ret; void *buf; int i; u8 *u8; size_t val_bytes = map->format.val_bytes; size_t reg_bytes = map->format.reg_bytes; size_t pad_bytes = map->format.pad_bytes; size_t pair_size = reg_bytes + pad_bytes + val_bytes; size_t len = pair_size * num_regs; if (!len) return -EINVAL; buf = kzalloc(len, GFP_KERNEL); if (!buf) return -ENOMEM; /* We have to linearise by hand. 
*/ u8 = buf; for (i = 0; i < num_regs; i++) { unsigned int reg = regs[i].reg; unsigned int val = regs[i].def; trace_regmap_hw_write_start(map, reg, 1); reg = regmap_reg_addr(map, reg); map->format.format_reg(u8, reg, map->reg_shift); u8 += reg_bytes + pad_bytes; map->format.format_val(u8, val, 0); u8 += val_bytes; } u8 = buf; *u8 |= map->write_flag_mask; ret = map->write(map->bus_context, buf, len); kfree(buf); for (i = 0; i < num_regs; i++) { int reg = regs[i].reg; trace_regmap_hw_write_done(map, reg, 1); } return ret; } static unsigned int _regmap_register_page(struct regmap *map, unsigned int reg, struct regmap_range_node *range) { unsigned int win_page = (reg - range->range_min) / range->window_len; return win_page; } static int _regmap_range_multi_paged_reg_write(struct regmap *map, struct reg_sequence *regs, size_t num_regs) { int ret; int i, n; struct reg_sequence *base; unsigned int this_page = 0; unsigned int page_change = 0; /* * the set of registers are not neccessarily in order, but * since the order of write must be preserved this algorithm * chops the set each time the page changes. This also applies * if there is a delay required at any point in the sequence. */ base = regs; for (i = 0, n = 0; i < num_regs; i++, n++) { unsigned int reg = regs[i].reg; struct regmap_range_node *range; range = _regmap_range_lookup(map, reg); if (range) { unsigned int win_page = _regmap_register_page(map, reg, range); if (i == 0) this_page = win_page; if (win_page != this_page) { this_page = win_page; page_change = 1; } } /* If we have both a page change and a delay make sure to * write the regs and apply the delay before we change the * page. */ if (page_change || regs[i].delay_us) { /* For situations where the first write requires * a delay we need to make sure we don't call * raw_multi_reg_write with n=0 * This can't occur with page breaks as we * never write on the first iteration */ if (regs[i].delay_us && i == 0) n = 1; ret = _regmap_raw_multi_reg_write(map, base, n); if (ret != 0) return ret; if (regs[i].delay_us) { if (map->can_sleep) fsleep(regs[i].delay_us); else udelay(regs[i].delay_us); } base += n; n = 0; if (page_change) { ret = _regmap_select_page(map, &base[n].reg, range, 1); if (ret != 0) return ret; page_change = 0; } } } if (n > 0) return _regmap_raw_multi_reg_write(map, base, n); return 0; } static int _regmap_multi_reg_write(struct regmap *map, const struct reg_sequence *regs, size_t num_regs) { int i; int ret; if (!map->can_multi_write) { for (i = 0; i < num_regs; i++) { ret = _regmap_write(map, regs[i].reg, regs[i].def); if (ret != 0) return ret; if (regs[i].delay_us) { if (map->can_sleep) fsleep(regs[i].delay_us); else udelay(regs[i].delay_us); } } return 0; } if (!map->format.parse_inplace) return -EINVAL; if (map->writeable_reg) for (i = 0; i < num_regs; i++) { int reg = regs[i].reg; if (!map->writeable_reg(map->dev, reg)) return -EINVAL; if (!IS_ALIGNED(reg, map->reg_stride)) return -EINVAL; } if (!map->cache_bypass) { for (i = 0; i < num_regs; i++) { unsigned int val = regs[i].def; unsigned int reg = regs[i].reg; ret = regcache_write(map, reg, val); if (ret) { dev_err(map->dev, "Error in caching of register: %x ret: %d\n", reg, ret); return ret; } } if (map->cache_only) { map->cache_dirty = true; return 0; } } WARN_ON(!map->bus); for (i = 0; i < num_regs; i++) { unsigned int reg = regs[i].reg; struct regmap_range_node *range; /* Coalesce all the writes between a page break or a delay * in a sequence */ range = _regmap_range_lookup(map, reg); if (range || 
regs[i].delay_us) { size_t len = sizeof(struct reg_sequence)*num_regs; struct reg_sequence *base = kmemdup(regs, len, GFP_KERNEL); if (!base) return -ENOMEM; ret = _regmap_range_multi_paged_reg_write(map, base, num_regs); kfree(base); return ret; } } return _regmap_raw_multi_reg_write(map, regs, num_regs); } /** * regmap_multi_reg_write() - Write multiple registers to the device * * @map: Register map to write to * @regs: Array of structures containing register,value to be written * @num_regs: Number of registers to write * * Write multiple registers to the device where the set of register, value * pairs are supplied in any order, possibly not all in a single range. * * The 'normal' block write mode will send ultimately send data on the * target bus as R,V1,V2,V3,..,Vn where successively higher registers are * addressed. However, this alternative block multi write mode will send * the data as R1,V1,R2,V2,..,Rn,Vn on the target bus. The target device * must of course support the mode. * * A value of zero will be returned on success, a negative errno will be * returned in error cases. */ int regmap_multi_reg_write(struct regmap *map, const struct reg_sequence *regs, int num_regs) { int ret; map->lock(map->lock_arg); ret = _regmap_multi_reg_write(map, regs, num_regs); map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_multi_reg_write); /** * regmap_multi_reg_write_bypassed() - Write multiple registers to the * device but not the cache * * @map: Register map to write to * @regs: Array of structures containing register,value to be written * @num_regs: Number of registers to write * * Write multiple registers to the device but not the cache where the set * of register are supplied in any order. * * This function is intended to be used for writing a large block of data * atomically to the device in single transfer for those I2C client devices * that implement this alternative block write mode. * * A value of zero will be returned on success, a negative errno will * be returned in error cases. */ int regmap_multi_reg_write_bypassed(struct regmap *map, const struct reg_sequence *regs, int num_regs) { int ret; bool bypass; map->lock(map->lock_arg); bypass = map->cache_bypass; map->cache_bypass = true; ret = _regmap_multi_reg_write(map, regs, num_regs); map->cache_bypass = bypass; map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_multi_reg_write_bypassed); /** * regmap_raw_write_async() - Write raw values to one or more registers * asynchronously * * @map: Register map to write to * @reg: Initial register to write to * @val: Block of data to be written, laid out for direct transmission to the * device. Must be valid until regmap_async_complete() is called. * @val_len: Length of data pointed to by val. * * This function is intended to be used for things like firmware * download where a large block of data needs to be transferred to the * device. No formatting will be done on the data provided. * * If supported by the underlying bus the write will be scheduled * asynchronously, helping maximise I/O speed on higher speed buses * like SPI. regmap_async_complete() can be called to ensure that all * asynchrnous writes have been completed. * * A value of zero will be returned on success, a negative errno will * be returned in error cases. 
 */
int regmap_raw_write_async(struct regmap *map, unsigned int reg,
			   const void *val, size_t val_len)
{
	int ret;

	if (val_len % map->format.val_bytes)
		return -EINVAL;
	if (!IS_ALIGNED(reg, map->reg_stride))
		return -EINVAL;

	map->lock(map->lock_arg);

	map->async = true;

	ret = _regmap_raw_write(map, reg, val, val_len, false);

	map->async = false;

	map->unlock(map->lock_arg);

	return ret;
}
EXPORT_SYMBOL_GPL(regmap_raw_write_async);

static int _regmap_raw_read(struct regmap *map, unsigned int reg, void *val,
			    unsigned int val_len, bool noinc)
{
	struct regmap_range_node *range;
	int ret;

	if (!map->read)
		return -EINVAL;

	range = _regmap_range_lookup(map, reg);
	if (range) {
		ret = _regmap_select_page(map, &reg, range,
					  noinc ? 1 : val_len / map->format.val_bytes);
		if (ret != 0)
			return ret;
	}

	reg = regmap_reg_addr(map, reg);
	map->format.format_reg(map->work_buf, reg, map->reg_shift);
	regmap_set_work_buf_flag_mask(map, map->format.reg_bytes,
				      map->read_flag_mask);

	trace_regmap_hw_read_start(map, reg, val_len / map->format.val_bytes);

	ret = map->read(map->bus_context, map->work_buf,
			map->format.reg_bytes + map->format.pad_bytes,
			val, val_len);

	trace_regmap_hw_read_done(map, reg, val_len / map->format.val_bytes);

	return ret;
}

static int _regmap_bus_reg_read(void *context, unsigned int reg,
				unsigned int *val)
{
	struct regmap *map = context;
	struct regmap_range_node *range;
	int ret;

	range = _regmap_range_lookup(map, reg);
	if (range) {
		ret = _regmap_select_page(map, &reg, range, 1);
		if (ret != 0)
			return ret;
	}

	reg = regmap_reg_addr(map, reg);
	return map->bus->reg_read(map->bus_context, reg, val);
}

static int _regmap_bus_read(void *context, unsigned int reg,
			    unsigned int *val)
{
	int ret;
	struct regmap *map = context;
	void *work_val = map->work_buf + map->format.reg_bytes +
		map->format.pad_bytes;

	if (!map->format.parse_val)
		return -EINVAL;

	ret = _regmap_raw_read(map, reg, work_val, map->format.val_bytes, false);
	if (ret == 0)
		*val = map->format.parse_val(work_val);

	return ret;
}

static int _regmap_read(struct regmap *map, unsigned int reg,
			unsigned int *val)
{
	int ret;
	void *context = _regmap_map_get_context(map);

	if (!map->cache_bypass) {
		ret = regcache_read(map, reg, val);
		if (ret == 0)
			return 0;
	}

	if (map->cache_only)
		return -EBUSY;

	if (!regmap_readable(map, reg))
		return -EIO;

	ret = map->reg_read(context, reg, val);
	if (ret == 0) {
		if (regmap_should_log(map))
			dev_info(map->dev, "%x => %x\n", reg, *val);

		trace_regmap_reg_read(map, reg, *val);

		if (!map->cache_bypass)
			regcache_write(map, reg, *val);
	}

	return ret;
}

/**
 * regmap_read() - Read a value from a single register
 *
 * @map: Register map to read from
 * @reg: Register to be read from
 * @val: Pointer to store read value
 *
 * A value of zero will be returned on success, a negative errno will
 * be returned in error cases.
 */
int regmap_read(struct regmap *map, unsigned int reg, unsigned int *val)
{
	int ret;

	if (!IS_ALIGNED(reg, map->reg_stride))
		return -EINVAL;

	map->lock(map->lock_arg);

	ret = _regmap_read(map, reg, val);

	map->unlock(map->lock_arg);

	return ret;
}
EXPORT_SYMBOL_GPL(regmap_read);

/**
 * regmap_read_bypassed() - Read a value from a single register direct
 *			    from the device, bypassing the cache
 *
 * @map: Register map to read from
 * @reg: Register to be read from
 * @val: Pointer to store read value
 *
 * A value of zero will be returned on success, a negative errno will
 * be returned in error cases.
*/ int regmap_read_bypassed(struct regmap *map, unsigned int reg, unsigned int *val) { int ret; bool bypass, cache_only; if (!IS_ALIGNED(reg, map->reg_stride)) return -EINVAL; map->lock(map->lock_arg); bypass = map->cache_bypass; cache_only = map->cache_only; map->cache_bypass = true; map->cache_only = false; ret = _regmap_read(map, reg, val); map->cache_bypass = bypass; map->cache_only = cache_only; map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_read_bypassed); /** * regmap_raw_read() - Read raw data from the device * * @map: Register map to read from * @reg: First register to be read from * @val: Pointer to store read value * @val_len: Size of data to read * * A value of zero will be returned on success, a negative errno will * be returned in error cases. */ int regmap_raw_read(struct regmap *map, unsigned int reg, void *val, size_t val_len) { size_t val_bytes = map->format.val_bytes; size_t val_count = val_len / val_bytes; unsigned int v; int ret, i; if (val_len % map->format.val_bytes) return -EINVAL; if (!IS_ALIGNED(reg, map->reg_stride)) return -EINVAL; if (val_count == 0) return -EINVAL; map->lock(map->lock_arg); if (regmap_volatile_range(map, reg, val_count) || map->cache_bypass || map->cache_type == REGCACHE_NONE) { size_t chunk_count, chunk_bytes; size_t chunk_regs = val_count; if (!map->cache_bypass && map->cache_only) { ret = -EBUSY; goto out; } if (!map->read) { ret = -ENOTSUPP; goto out; } if (map->use_single_read) chunk_regs = 1; else if (map->max_raw_read && val_len > map->max_raw_read) chunk_regs = map->max_raw_read / val_bytes; chunk_count = val_count / chunk_regs; chunk_bytes = chunk_regs * val_bytes; /* Read bytes that fit into whole chunks */ for (i = 0; i < chunk_count; i++) { ret = _regmap_raw_read(map, reg, val, chunk_bytes, false); if (ret != 0) goto out; reg += regmap_get_offset(map, chunk_regs); val += chunk_bytes; val_len -= chunk_bytes; } /* Read remaining bytes */ if (val_len) { ret = _regmap_raw_read(map, reg, val, val_len, false); if (ret != 0) goto out; } } else { /* Otherwise go word by word for the cache; should be low * cost as we expect to hit the cache. */ for (i = 0; i < val_count; i++) { ret = _regmap_read(map, reg + regmap_get_offset(map, i), &v); if (ret != 0) goto out; map->format.format_val(val + (i * val_bytes), v, 0); } } out: map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_raw_read); /** * regmap_noinc_read(): Read data from a register without incrementing the * register number * * @map: Register map to read from * @reg: Register to read from * @val: Pointer to data buffer * @val_len: Length of output buffer in bytes. * * The regmap API usually assumes that bulk read operations will read a * range of registers. Some devices have certain registers for which a read * operation read will read from an internal FIFO. * * The target register must be volatile but registers after it can be * completely unrelated cacheable registers. * * This will attempt multiple reads as required to read val_len bytes. * * A value of zero will be returned on success, a negative errno will be * returned in error cases. 
 */
int regmap_noinc_read(struct regmap *map, unsigned int reg,
		      void *val, size_t val_len)
{
	size_t read_len;
	int ret;

	if (!map->read)
		return -ENOTSUPP;

	if (val_len % map->format.val_bytes)
		return -EINVAL;
	if (!IS_ALIGNED(reg, map->reg_stride))
		return -EINVAL;
	if (val_len == 0)
		return -EINVAL;

	map->lock(map->lock_arg);

	if (!regmap_volatile(map, reg) || !regmap_readable_noinc(map, reg)) {
		ret = -EINVAL;
		goto out_unlock;
	}

	/*
	 * We have not defined the FIFO semantics for cache, as the
	 * cache is just one value deep. Should we return the last
	 * written value? Just avoid this by always reading the FIFO
	 * even when using cache. Cache only will not work.
	 */
	if (!map->cache_bypass && map->cache_only) {
		ret = -EBUSY;
		goto out_unlock;
	}

	/* Use the accelerated operation if we can */
	if (map->bus->reg_noinc_read) {
		ret = regmap_noinc_readwrite(map, reg, val, val_len, false);
		goto out_unlock;
	}

	while (val_len) {
		if (map->max_raw_read && map->max_raw_read < val_len)
			read_len = map->max_raw_read;
		else
			read_len = val_len;
		ret = _regmap_raw_read(map, reg, val, read_len, true);
		if (ret)
			goto out_unlock;
		val = ((u8 *)val) + read_len;
		val_len -= read_len;
	}

out_unlock:
	map->unlock(map->lock_arg);
	return ret;
}
EXPORT_SYMBOL_GPL(regmap_noinc_read);

/**
 * regmap_field_read(): Read a value from a single register field
 *
 * @field: Register field to read from
 * @val: Pointer to store read value
 *
 * A value of zero will be returned on success, a negative errno will
 * be returned in error cases.
 */
int regmap_field_read(struct regmap_field *field, unsigned int *val)
{
	int ret;
	unsigned int reg_val;

	ret = regmap_read(field->regmap, field->reg, &reg_val);
	if (ret != 0)
		return ret;

	reg_val &= field->mask;
	reg_val >>= field->shift;
	*val = reg_val;

	return ret;
}
EXPORT_SYMBOL_GPL(regmap_field_read);

/**
 * regmap_fields_read() - Read a value from a single register field with port ID
 *
 * @field: Register field to read from
 * @id: port ID
 * @val: Pointer to store read value
 *
 * A value of zero will be returned on success, a negative errno will
 * be returned in error cases.
 */
int regmap_fields_read(struct regmap_field *field, unsigned int id,
		       unsigned int *val)
{
	int ret;
	unsigned int reg_val;

	if (id >= field->id_size)
		return -EINVAL;

	ret = regmap_read(field->regmap,
			  field->reg + (field->id_offset * id),
			  &reg_val);
	if (ret != 0)
		return ret;

	reg_val &= field->mask;
	reg_val >>= field->shift;
	*val = reg_val;

	return ret;
}
EXPORT_SYMBOL_GPL(regmap_fields_read);

static int _regmap_bulk_read(struct regmap *map, unsigned int reg,
			     const unsigned int *regs, void *val,
			     size_t val_count)
{
	u32 *u32 = val;
	u16 *u16 = val;
	u8 *u8 = val;
	int ret, i;

	map->lock(map->lock_arg);

	for (i = 0; i < val_count; i++) {
		unsigned int ival;

		if (regs) {
			if (!IS_ALIGNED(regs[i], map->reg_stride)) {
				ret = -EINVAL;
				goto out;
			}
			ret = _regmap_read(map, regs[i], &ival);
		} else {
			ret = _regmap_read(map, reg + regmap_get_offset(map, i), &ival);
		}
		if (ret != 0)
			goto out;

		switch (map->format.val_bytes) {
		case 4:
			u32[i] = ival;
			break;
		case 2:
			u16[i] = ival;
			break;
		case 1:
			u8[i] = ival;
			break;
		default:
			ret = -EINVAL;
			goto out;
		}
	}
out:
	map->unlock(map->lock_arg);
	return ret;
}

/**
 * regmap_bulk_read() - Read multiple sequential registers from the device
 *
 * @map: Register map to read from
 * @reg: First register to be read from
 * @val: Pointer to store read value, in native register size for device
 * @val_count: Number of registers to read
 *
 * A value of zero will be returned on success, a negative errno will
 * be returned in error cases.
*/ int regmap_bulk_read(struct regmap *map, unsigned int reg, void *val, size_t val_count) { int ret, i; size_t val_bytes = map->format.val_bytes; bool vol = regmap_volatile_range(map, reg, val_count); if (!IS_ALIGNED(reg, map->reg_stride)) return -EINVAL; if (val_count == 0) return -EINVAL; if (map->read && map->format.parse_inplace && (vol || map->cache_type == REGCACHE_NONE)) { ret = regmap_raw_read(map, reg, val, val_bytes * val_count); if (ret != 0) return ret; for (i = 0; i < val_count * val_bytes; i += val_bytes) map->format.parse_inplace(val + i); } else { ret = _regmap_bulk_read(map, reg, NULL, val, val_count); } if (!ret) trace_regmap_bulk_read(map, reg, val, val_bytes * val_count); return ret; } EXPORT_SYMBOL_GPL(regmap_bulk_read); /** * regmap_multi_reg_read() - Read multiple non-sequential registers from the device * * @map: Register map to read from * @regs: Array of registers to read from * @val: Pointer to store read value, in native register size for device * @val_count: Number of registers to read * * A value of zero will be returned on success, a negative errno will * be returned in error cases. */ int regmap_multi_reg_read(struct regmap *map, const unsigned int *regs, void *val, size_t val_count) { if (val_count == 0) return -EINVAL; return _regmap_bulk_read(map, 0, regs, val, val_count); } EXPORT_SYMBOL_GPL(regmap_multi_reg_read); static int _regmap_update_bits(struct regmap *map, unsigned int reg, unsigned int mask, unsigned int val, bool *change, bool force_write) { int ret; unsigned int tmp, orig; if (change) *change = false; if (regmap_volatile(map, reg) && map->reg_update_bits) { reg = regmap_reg_addr(map, reg); ret = map->reg_update_bits(map->bus_context, reg, mask, val); if (ret == 0 && change) *change = true; } else { ret = _regmap_read(map, reg, &orig); if (ret != 0) return ret; tmp = orig & ~mask; tmp |= val & mask; if (force_write || (tmp != orig) || map->force_write_field) { ret = _regmap_write(map, reg, tmp); if (ret == 0 && change) *change = true; } } return ret; } /** * regmap_update_bits_base() - Perform a read/modify/write cycle on a register * * @map: Register map to update * @reg: Register to update * @mask: Bitmask to change * @val: New value for bitmask * @change: Boolean indicating if a write was done * @async: Boolean indicating asynchronously * @force: Boolean indicating use force update * * Perform a read/modify/write cycle on a register map with change, async, force * options. * * If async is true: * * With most buses the read must be done synchronously so this is most useful * for devices with a cache which do not need to interact with the hardware to * determine the current register value. * * Returns zero for success, a negative number on error. */ int regmap_update_bits_base(struct regmap *map, unsigned int reg, unsigned int mask, unsigned int val, bool *change, bool async, bool force) { int ret; map->lock(map->lock_arg); map->async = async; ret = _regmap_update_bits(map, reg, mask, val, change, force); map->async = false; map->unlock(map->lock_arg); return ret; } EXPORT_SYMBOL_GPL(regmap_update_bits_base); /** * regmap_test_bits() - Check if all specified bits are set in a register. * * @map: Register map to operate on * @reg: Register to read from * @bits: Bits to test * * Returns 0 if at least one of the tested bits is not set, 1 if all tested * bits are set and a negative error number if the underlying regmap_read() * fails. 
*/ int regmap_test_bits(struct regmap *map, unsigned int reg, unsigned int bits) { unsigned int val, ret; ret = regmap_read(map, reg, &val); if (ret) return ret; return (val & bits) == bits; } EXPORT_SYMBOL_GPL(regmap_test_bits); void regmap_async_complete_cb(struct regmap_async *async, int ret) { struct regmap *map = async->map; bool wake; trace_regmap_async_io_complete(map); spin_lock(&map->async_lock); list_move(&async->list, &map->async_free); wake = list_empty(&map->async_list); if (ret != 0) map->async_ret = ret; spin_unlock(&map->async_lock); if (wake) wake_up(&map->async_waitq); } EXPORT_SYMBOL_GPL(regmap_async_complete_cb); static int regmap_async_is_done(struct regmap *map) { unsigned long flags; int ret; spin_lock_irqsave(&map->async_lock, flags); ret = list_empty(&map->async_list); spin_unlock_irqrestore(&map->async_lock, flags); return ret; } /** * regmap_async_complete - Ensure all asynchronous I/O has completed. * * @map: Map to operate on. * * Blocks until any pending asynchronous I/O has completed. Returns * an error code for any failed I/O operations. */ int regmap_async_complete(struct regmap *map) { unsigned long flags; int ret; /* Nothing to do with no async support */ if (!map->bus || !map->bus->async_write) return 0; trace_regmap_async_complete_start(map); wait_event(map->async_waitq, regmap_async_is_done(map)); spin_lock_irqsave(&map->async_lock, flags); ret = map->async_ret; map->async_ret = 0; spin_unlock_irqrestore(&map->async_lock, flags); trace_regmap_async_complete_done(map); return ret; } EXPORT_SYMBOL_GPL(regmap_async_complete); /** * regmap_register_patch - Register and apply register updates to be applied * on device initialistion * * @map: Register map to apply updates to. * @regs: Values to update. * @num_regs: Number of entries in regs. * * Register a set of register updates to be applied to the device * whenever the device registers are synchronised with the cache and * apply them immediately. Typically this is used to apply * corrections to be applied to the device defaults on startup, such * as the updates some vendors provide to undocumented registers. * * The caller must ensure that this function cannot be called * concurrently with either itself or regcache_sync(). */ int regmap_register_patch(struct regmap *map, const struct reg_sequence *regs, int num_regs) { struct reg_sequence *p; int ret; bool bypass; if (WARN_ONCE(num_regs <= 0, "invalid registers number (%d)\n", num_regs)) return 0; p = krealloc(map->patch, sizeof(struct reg_sequence) * (map->patch_regs + num_regs), GFP_KERNEL); if (p) { memcpy(p + map->patch_regs, regs, num_regs * sizeof(*regs)); map->patch = p; map->patch_regs += num_regs; } else { return -ENOMEM; } map->lock(map->lock_arg); bypass = map->cache_bypass; map->cache_bypass = true; map->async = true; ret = _regmap_multi_reg_write(map, regs, num_regs); map->async = false; map->cache_bypass = bypass; map->unlock(map->lock_arg); regmap_async_complete(map); return ret; } EXPORT_SYMBOL_GPL(regmap_register_patch); /** * regmap_get_val_bytes() - Report the size of a register value * * @map: Register map to operate on. * * Report the size of a register value, mainly intended to for use by * generic infrastructure built on top of regmap. */ int regmap_get_val_bytes(struct regmap *map) { if (map->format.format_write) return -EINVAL; return map->format.val_bytes; } EXPORT_SYMBOL_GPL(regmap_get_val_bytes); /** * regmap_get_max_register() - Report the max register value * * @map: Register map to operate on. 
* * Report the max register value, mainly intended to for use by * generic infrastructure built on top of regmap. */ int regmap_get_max_register(struct regmap *map) { return map->max_register_is_set ? map->max_register : -EINVAL; } EXPORT_SYMBOL_GPL(regmap_get_max_register); /** * regmap_get_reg_stride() - Report the register address stride * * @map: Register map to operate on. * * Report the register address stride, mainly intended to for use by * generic infrastructure built on top of regmap. */ int regmap_get_reg_stride(struct regmap *map) { return map->reg_stride; } EXPORT_SYMBOL_GPL(regmap_get_reg_stride); /** * regmap_might_sleep() - Returns whether a regmap access might sleep. * * @map: Register map to operate on. * * Returns true if an access to the register might sleep, else false. */ bool regmap_might_sleep(struct regmap *map) { return map->can_sleep; } EXPORT_SYMBOL_GPL(regmap_might_sleep); int regmap_parse_val(struct regmap *map, const void *buf, unsigned int *val) { if (!map->format.parse_val) return -EINVAL; *val = map->format.parse_val(buf); return 0; } EXPORT_SYMBOL_GPL(regmap_parse_val); static int __init regmap_initcall(void) { regmap_debugfs_initcall(); return 0; } postcore_initcall(regmap_initcall); |
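/*
 * Usage sketch (not part of regmap itself): how a driver would typically
 * consume the read/modify/write and bit-test helpers above through the thin
 * wrappers in <linux/regmap.h>. MYDEV_CTRL, MYDEV_CTRL_EN and mydev_enable()
 * are hypothetical names used only for illustration.
 */
#include <linux/bits.h>
#include <linux/regmap.h>

#define MYDEV_CTRL	0x04
#define MYDEV_CTRL_EN	BIT(0)

static int mydev_enable(struct regmap *map)
{
	int ret;

	/* Read/modify/write; regmap_update_bits() ends up in regmap_update_bits_base() */
	ret = regmap_update_bits(map, MYDEV_CTRL, MYDEV_CTRL_EN, MYDEV_CTRL_EN);
	if (ret)
		return ret;

	/* 1 if all tested bits are set, 0 otherwise, negative errno on read failure */
	return regmap_test_bits(map, MYDEV_CTRL, MYDEV_CTRL_EN);
}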
// SPDX-License-Identifier: GPL-2.0 /* * Copyright (C) 2023 Western Digital Corporation or its affiliates.
*/ #include <linux/btrfs_tree.h> #include "ctree.h" #include "fs.h" #include "accessors.h" #include "transaction.h" #include "disk-io.h" #include "raid-stripe-tree.h" #include "volumes.h" #include "print-tree.h" static int btrfs_partially_delete_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_path *path, const struct btrfs_key *oldkey, u64 newlen, u64 frontpad) { struct btrfs_root *stripe_root = trans->fs_info->stripe_root; struct btrfs_stripe_extent *extent, *newitem; struct extent_buffer *leaf; int slot; size_t item_size; struct btrfs_key newkey = { .objectid = oldkey->objectid + frontpad, .type = BTRFS_RAID_STRIPE_KEY, .offset = newlen, }; int ret; ASSERT(newlen > 0); ASSERT(oldkey->type == BTRFS_RAID_STRIPE_KEY); leaf = path->nodes[0]; slot = path->slots[0]; item_size = btrfs_item_size(leaf, slot); newitem = kzalloc(item_size, GFP_NOFS); if (!newitem) return -ENOMEM; extent = btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); for (int i = 0; i < btrfs_num_raid_stripes(item_size); i++) { struct btrfs_raid_stride *stride = &extent->strides[i]; u64 phys; phys = btrfs_raid_stride_physical(leaf, stride) + frontpad; btrfs_set_stack_raid_stride_physical(&newitem->strides[i], phys); } ret = btrfs_del_item(trans, stripe_root, path); if (ret) goto out; btrfs_release_path(path); ret = btrfs_insert_item(trans, stripe_root, &newkey, newitem, item_size); out: kfree(newitem); return ret; } int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length) { struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_root *stripe_root = fs_info->stripe_root; struct btrfs_path *path; struct btrfs_key key; struct extent_buffer *leaf; u64 found_start; u64 found_end; u64 end = start + length; int slot; int ret; if (!btrfs_fs_incompat(fs_info, RAID_STRIPE_TREE) || !stripe_root) return 0; if (!btrfs_is_testing(fs_info)) { struct btrfs_chunk_map *map; bool use_rst; map = btrfs_find_chunk_map(fs_info, start, length); if (!map) return -EINVAL; use_rst = btrfs_need_stripe_tree_update(fs_info, map->type); btrfs_free_chunk_map(map); if (!use_rst) return 0; } path = btrfs_alloc_path(); if (!path) return -ENOMEM; while (1) { key.objectid = start; key.type = BTRFS_RAID_STRIPE_KEY; key.offset = 0; ret = btrfs_search_slot(trans, stripe_root, &key, path, -1, 1); if (ret < 0) break; if (path->slots[0] == btrfs_header_nritems(path->nodes[0])) path->slots[0]--; leaf = path->nodes[0]; slot = path->slots[0]; btrfs_item_key_to_cpu(leaf, &key, slot); found_start = key.objectid; found_end = found_start + key.offset; ret = 0; /* * The stripe extent starts before the range we want to delete, * but the range spans more than one stripe extent: * * |--- RAID Stripe Extent ---||--- RAID Stripe Extent ---| * |--- keep ---|--- drop ---| * * This means we have to get the previous item, truncate its * length and then restart the search. */ if (found_start > start) { if (slot == 0) { ret = btrfs_previous_item(stripe_root, path, start, BTRFS_RAID_STRIPE_KEY); if (ret) { if (ret > 0) ret = -ENOENT; break; } } else { path->slots[0]--; } leaf = path->nodes[0]; slot = path->slots[0]; btrfs_item_key_to_cpu(leaf, &key, slot); found_start = key.objectid; found_end = found_start + key.offset; ASSERT(found_start <= start); } if (key.type != BTRFS_RAID_STRIPE_KEY) break; /* That stripe ends before we start, we're done. 
*/ if (found_end <= start) break; trace_btrfs_raid_extent_delete(fs_info, start, end, found_start, found_end); /* * The stripe extent starts before the range we want to delete * and ends after the range we want to delete, i.e. we're * punching a hole in the stripe extent: * * |--- RAID Stripe Extent ---| * | keep |--- drop ---| keep | * * This means we need to a) truncate the existing item and b) * create a second item for the remaining range. */ if (found_start < start && found_end > end) { size_t item_size; u64 diff_start = start - found_start; u64 diff_end = found_end - end; struct btrfs_stripe_extent *extent; struct btrfs_key newkey = { .objectid = end, .type = BTRFS_RAID_STRIPE_KEY, .offset = diff_end, }; /* The "right" item. */ ret = btrfs_duplicate_item(trans, stripe_root, path, &newkey); if (ret) break; item_size = btrfs_item_size(leaf, path->slots[0]); extent = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_stripe_extent); for (int i = 0; i < btrfs_num_raid_stripes(item_size); i++) { struct btrfs_raid_stride *stride = &extent->strides[i]; u64 phys; phys = btrfs_raid_stride_physical(leaf, stride); phys += diff_start + length; btrfs_set_raid_stride_physical(leaf, stride, phys); } /* The "left" item. */ path->slots[0]--; btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); btrfs_partially_delete_raid_extent(trans, path, &key, diff_start, 0); break; } /* * The stripe extent starts before the range we want to delete: * * |--- RAID Stripe Extent ---| * |--- keep ---|--- drop ---| * * This means we have to duplicate the tree item, truncate the * length to the new size and then re-insert the item. */ if (found_start < start) { u64 diff_start = start - found_start; btrfs_partially_delete_raid_extent(trans, path, &key, diff_start, 0); start += (key.offset - diff_start); length -= (key.offset - diff_start); if (length == 0) break; btrfs_release_path(path); continue; } /* * The stripe extent ends after the range we want to delete: * * |--- RAID Stripe Extent ---| * |--- drop ---|--- keep ---| * * This means we have to duplicate the tree item, truncate the * length to the new size and then re-insert the item. */ if (found_end > end) { u64 diff_end = found_end - end; btrfs_partially_delete_raid_extent(trans, path, &key, key.offset - length, length); ASSERT(key.offset - diff_end == length); break; } /* Finally we can delete the whole item, no more special cases. */ ret = btrfs_del_item(trans, stripe_root, path); if (ret) break; start += key.offset; length -= key.offset; if (length == 0) break; btrfs_release_path(path); } btrfs_free_path(path); return ret; } static int update_raid_extent_item(struct btrfs_trans_handle *trans, struct btrfs_key *key, struct btrfs_stripe_extent *stripe_extent, const size_t item_size) { struct btrfs_path *path; struct extent_buffer *leaf; int ret; int slot; path = btrfs_alloc_path(); if (!path) return -ENOMEM; ret = btrfs_search_slot(trans, trans->fs_info->stripe_root, key, path, 0, 1); if (ret) return (ret == 1 ? 
ret : -EINVAL); leaf = path->nodes[0]; slot = path->slots[0]; write_extent_buffer(leaf, stripe_extent, btrfs_item_ptr_offset(leaf, slot), item_size); btrfs_free_path(path); return ret; } EXPORT_FOR_TESTS int btrfs_insert_one_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_io_context *bioc) { struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_key stripe_key; struct btrfs_root *stripe_root = fs_info->stripe_root; const int num_stripes = btrfs_bg_type_to_factor(bioc->map_type); struct btrfs_stripe_extent *stripe_extent; const size_t item_size = struct_size(stripe_extent, strides, num_stripes); int ret; stripe_extent = kzalloc(item_size, GFP_NOFS); if (!stripe_extent) { btrfs_abort_transaction(trans, -ENOMEM); btrfs_end_transaction(trans); return -ENOMEM; } trace_btrfs_insert_one_raid_extent(fs_info, bioc->logical, bioc->size, num_stripes); for (int i = 0; i < num_stripes; i++) { u64 devid = bioc->stripes[i].dev->devid; u64 physical = bioc->stripes[i].physical; struct btrfs_raid_stride *raid_stride = &stripe_extent->strides[i]; btrfs_set_stack_raid_stride_devid(raid_stride, devid); btrfs_set_stack_raid_stride_physical(raid_stride, physical); } stripe_key.objectid = bioc->logical; stripe_key.type = BTRFS_RAID_STRIPE_KEY; stripe_key.offset = bioc->size; ret = btrfs_insert_item(trans, stripe_root, &stripe_key, stripe_extent, item_size); if (ret == -EEXIST) ret = update_raid_extent_item(trans, &stripe_key, stripe_extent, item_size); if (ret) btrfs_abort_transaction(trans, ret); kfree(stripe_extent); return ret; } int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_ordered_extent *ordered_extent) { struct btrfs_io_context *bioc; int ret; if (!btrfs_fs_incompat(trans->fs_info, RAID_STRIPE_TREE)) return 0; list_for_each_entry(bioc, &ordered_extent->bioc_list, rst_ordered_entry) { ret = btrfs_insert_one_raid_extent(trans, bioc); if (ret) return ret; } while (!list_empty(&ordered_extent->bioc_list)) { bioc = list_first_entry(&ordered_extent->bioc_list, typeof(*bioc), rst_ordered_entry); list_del(&bioc->rst_ordered_entry); btrfs_put_bioc(bioc); } return 0; } int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, u64 logical, u64 *length, u64 map_type, u32 stripe_index, struct btrfs_io_stripe *stripe) { struct btrfs_root *stripe_root = fs_info->stripe_root; struct btrfs_stripe_extent *stripe_extent; struct btrfs_key stripe_key; struct btrfs_key found_key; struct btrfs_path *path; struct extent_buffer *leaf; const u64 end = logical + *length; int num_stripes; u64 offset; u64 found_logical; u64 found_length; u64 found_end; int slot; int ret; stripe_key.objectid = logical; stripe_key.type = BTRFS_RAID_STRIPE_KEY; stripe_key.offset = 0; path = btrfs_alloc_path(); if (!path) return -ENOMEM; if (stripe->rst_search_commit_root) { path->skip_locking = 1; path->search_commit_root = 1; } ret = btrfs_search_slot(NULL, stripe_root, &stripe_key, path, 0, 0); if (ret < 0) goto free_path; if (ret) { if (path->slots[0] != 0) path->slots[0]--; } while (1) { leaf = path->nodes[0]; slot = path->slots[0]; btrfs_item_key_to_cpu(leaf, &found_key, slot); found_logical = found_key.objectid; found_length = found_key.offset; found_end = found_logical + found_length; if (found_logical > end) { ret = -ENODATA; goto out; } if (in_range(logical, found_logical, found_length)) break; ret = btrfs_next_item(stripe_root, path); if (ret) goto out; } offset = logical - found_logical; /* * If we have a logically contiguous, but physically non-continuous * range, we need to split the bio. 
Record the length after which we * must split the bio. */ if (end > found_end) *length -= end - found_end; num_stripes = btrfs_num_raid_stripes(btrfs_item_size(leaf, slot)); stripe_extent = btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); for (int i = 0; i < num_stripes; i++) { struct btrfs_raid_stride *stride = &stripe_extent->strides[i]; u64 devid = btrfs_raid_stride_devid(leaf, stride); u64 physical = btrfs_raid_stride_physical(leaf, stride); if (devid != stripe->dev->devid) continue; if ((map_type & BTRFS_BLOCK_GROUP_DUP) && stripe_index != i) continue; stripe->physical = physical + offset; trace_btrfs_get_raid_extent_offset(fs_info, logical, *length, stripe->physical, devid); ret = 0; goto free_path; } /* If we're here, we haven't found the requested devid in the stripe. */ ret = -ENODATA; out: if (ret > 0) ret = -ENODATA; if (ret && ret != -EIO && !stripe->rst_search_commit_root) { btrfs_debug(fs_info, "cannot find raid-stripe for logical [%llu, %llu] devid %llu, profile %s", logical, logical + *length, stripe->dev->devid, btrfs_bg_type_to_raid_name(map_type)); } free_path: btrfs_free_path(path); return ret; } |
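/*
 * Worked example (illustrative numbers only) for the hole-punching case in
 * btrfs_delete_raid_extent() above: assume an existing stripe extent item
 * with key (objectid = 0, BTRFS_RAID_STRIPE_KEY, offset = 64K) and a delete
 * request with start = 16K, length = 32K (so end = 48K).
 *
 *   found_start = 0 < start and found_end = 64K > end, so the item is split:
 *   the duplicated "right" item gets key (48K, RAID_STRIPE, 16K) and each
 *   stride's physical address is advanced by diff_start + length = 48K;
 *   the original "left" item is then truncated via
 *   btrfs_partially_delete_raid_extent() to key (0, RAID_STRIPE, 16K).
 */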
// SPDX-License-Identifier: GPL-2.0-only /* Copyright (C) 2000-2002 Joakim Axelsson <gozem@linux.nu> * Patrick Schaaf <bof@bof.de> * Martin Josefsson <gandalf@wlug.westbo.se> * Copyright (C) 2003-2013 Jozsef Kadlecsik <kadlec@netfilter.org> */ /* Kernel module which implements the set match and SET target * for netfilter/iptables.
*/ #include <linux/module.h> #include <linux/skbuff.h> #include <linux/netfilter/x_tables.h> #include <linux/netfilter/ipset/ip_set.h> #include <uapi/linux/netfilter/xt_set.h> MODULE_LICENSE("GPL"); MODULE_AUTHOR("Jozsef Kadlecsik <kadlec@netfilter.org>"); MODULE_DESCRIPTION("Xtables: IP set match and target module"); MODULE_ALIAS("xt_SET"); MODULE_ALIAS("ipt_set"); MODULE_ALIAS("ip6t_set"); MODULE_ALIAS("ipt_SET"); MODULE_ALIAS("ip6t_SET"); static inline int match_set(ip_set_id_t index, const struct sk_buff *skb, const struct xt_action_param *par, struct ip_set_adt_opt *opt, int inv) { if (ip_set_test(index, skb, par, opt)) inv = !inv; return inv; } #define ADT_OPT(n, f, d, fs, cfs, t, p, b, po, bo) \ struct ip_set_adt_opt n = { \ .family = f, \ .dim = d, \ .flags = fs, \ .cmdflags = cfs, \ .ext.timeout = t, \ .ext.packets = p, \ .ext.bytes = b, \ .ext.packets_op = po, \ .ext.bytes_op = bo, \ } /* Revision 0 interface: backward compatible with netfilter/iptables */ static bool set_match_v0(const struct sk_buff *skb, struct xt_action_param *par) { const struct xt_set_info_match_v0 *info = par->matchinfo; ADT_OPT(opt, xt_family(par), info->match_set.u.compat.dim, info->match_set.u.compat.flags, 0, UINT_MAX, 0, 0, 0, 0); return match_set(info->match_set.index, skb, par, &opt, info->match_set.u.compat.flags & IPSET_INV_MATCH); } static void compat_flags(struct xt_set_info_v0 *info) { u_int8_t i; /* Fill out compatibility data according to enum ip_set_kopt */ info->u.compat.dim = IPSET_DIM_ZERO; if (info->u.flags[0] & IPSET_MATCH_INV) info->u.compat.flags |= IPSET_INV_MATCH; for (i = 0; i < IPSET_DIM_MAX - 1 && info->u.flags[i]; i++) { info->u.compat.dim++; if (info->u.flags[i] & IPSET_SRC) info->u.compat.flags |= (1 << info->u.compat.dim); } } static int set_match_v0_checkentry(const struct xt_mtchk_param *par) { struct xt_set_info_match_v0 *info = par->matchinfo; ip_set_id_t index; index = ip_set_nfnl_get_byindex(par->net, info->match_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find set identified by id %u to match\n", info->match_set.index); return -ENOENT; } if (info->match_set.u.flags[IPSET_DIM_MAX - 1] != 0) { pr_info_ratelimited("set match dimension is over the limit!\n"); ip_set_nfnl_put(par->net, info->match_set.index); return -ERANGE; } /* Fill out compatibility data */ compat_flags(&info->match_set); return 0; } static void set_match_v0_destroy(const struct xt_mtdtor_param *par) { struct xt_set_info_match_v0 *info = par->matchinfo; ip_set_nfnl_put(par->net, info->match_set.index); } /* Revision 1 match */ static bool set_match_v1(const struct sk_buff *skb, struct xt_action_param *par) { const struct xt_set_info_match_v1 *info = par->matchinfo; ADT_OPT(opt, xt_family(par), info->match_set.dim, info->match_set.flags, 0, UINT_MAX, 0, 0, 0, 0); if (opt.flags & IPSET_RETURN_NOMATCH) opt.cmdflags |= IPSET_FLAG_RETURN_NOMATCH; return match_set(info->match_set.index, skb, par, &opt, info->match_set.flags & IPSET_INV_MATCH); } static int set_match_v1_checkentry(const struct xt_mtchk_param *par) { struct xt_set_info_match_v1 *info = par->matchinfo; ip_set_id_t index; index = ip_set_nfnl_get_byindex(par->net, info->match_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find set identified by id %u to match\n", info->match_set.index); return -ENOENT; } if (info->match_set.dim > IPSET_DIM_MAX) { pr_info_ratelimited("set match dimension is over the limit!\n"); ip_set_nfnl_put(par->net, info->match_set.index); return -ERANGE; } return 0; } 
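/*
 * Illustration (not upstream code): for the revision 1 match above, the
 * ADT_OPT() invocation expands to a plain designated initializer, roughly:
 *
 *	struct ip_set_adt_opt opt = {
 *		.family		= xt_family(par),
 *		.dim		= info->match_set.dim,
 *		.flags		= info->match_set.flags,
 *		.cmdflags	= 0,
 *		.ext.timeout	= UINT_MAX,
 *		.ext.packets	= 0,
 *		.ext.bytes	= 0,
 *		.ext.packets_op	= 0,
 *		.ext.bytes_op	= 0,
 *	};
 */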
static void set_match_v1_destroy(const struct xt_mtdtor_param *par) { struct xt_set_info_match_v1 *info = par->matchinfo; ip_set_nfnl_put(par->net, info->match_set.index); } /* Revision 3 match */ static bool set_match_v3(const struct sk_buff *skb, struct xt_action_param *par) { const struct xt_set_info_match_v3 *info = par->matchinfo; ADT_OPT(opt, xt_family(par), info->match_set.dim, info->match_set.flags, info->flags, UINT_MAX, info->packets.value, info->bytes.value, info->packets.op, info->bytes.op); if (info->packets.op != IPSET_COUNTER_NONE || info->bytes.op != IPSET_COUNTER_NONE) opt.cmdflags |= IPSET_FLAG_MATCH_COUNTERS; return match_set(info->match_set.index, skb, par, &opt, info->match_set.flags & IPSET_INV_MATCH); } #define set_match_v3_checkentry set_match_v1_checkentry #define set_match_v3_destroy set_match_v1_destroy /* Revision 4 match */ static bool set_match_v4(const struct sk_buff *skb, struct xt_action_param *par) { const struct xt_set_info_match_v4 *info = par->matchinfo; ADT_OPT(opt, xt_family(par), info->match_set.dim, info->match_set.flags, info->flags, UINT_MAX, info->packets.value, info->bytes.value, info->packets.op, info->bytes.op); if (info->packets.op != IPSET_COUNTER_NONE || info->bytes.op != IPSET_COUNTER_NONE) opt.cmdflags |= IPSET_FLAG_MATCH_COUNTERS; return match_set(info->match_set.index, skb, par, &opt, info->match_set.flags & IPSET_INV_MATCH); } #define set_match_v4_checkentry set_match_v1_checkentry #define set_match_v4_destroy set_match_v1_destroy /* Revision 0 interface: backward compatible with netfilter/iptables */ static unsigned int set_target_v0(struct sk_buff *skb, const struct xt_action_param *par) { const struct xt_set_info_target_v0 *info = par->targinfo; ADT_OPT(add_opt, xt_family(par), info->add_set.u.compat.dim, info->add_set.u.compat.flags, 0, UINT_MAX, 0, 0, 0, 0); ADT_OPT(del_opt, xt_family(par), info->del_set.u.compat.dim, info->del_set.u.compat.flags, 0, UINT_MAX, 0, 0, 0, 0); if (info->add_set.index != IPSET_INVALID_ID) ip_set_add(info->add_set.index, skb, par, &add_opt); if (info->del_set.index != IPSET_INVALID_ID) ip_set_del(info->del_set.index, skb, par, &del_opt); return XT_CONTINUE; } static int set_target_v0_checkentry(const struct xt_tgchk_param *par) { struct xt_set_info_target_v0 *info = par->targinfo; ip_set_id_t index; if (info->add_set.index != IPSET_INVALID_ID) { index = ip_set_nfnl_get_byindex(par->net, info->add_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find add_set index %u as target\n", info->add_set.index); return -ENOENT; } } if (info->del_set.index != IPSET_INVALID_ID) { index = ip_set_nfnl_get_byindex(par->net, info->del_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find del_set index %u as target\n", info->del_set.index); if (info->add_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); return -ENOENT; } } if (info->add_set.u.flags[IPSET_DIM_MAX - 1] != 0 || info->del_set.u.flags[IPSET_DIM_MAX - 1] != 0) { pr_info_ratelimited("SET target dimension over the limit!\n"); if (info->add_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); if (info->del_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->del_set.index); return -ERANGE; } /* Fill out compatibility data */ compat_flags(&info->add_set); compat_flags(&info->del_set); return 0; } static void set_target_v0_destroy(const struct xt_tgdtor_param *par) { const struct xt_set_info_target_v0 *info = par->targinfo; if (info->add_set.index != 
IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); if (info->del_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->del_set.index); } /* Revision 1 target */ static unsigned int set_target_v1(struct sk_buff *skb, const struct xt_action_param *par) { const struct xt_set_info_target_v1 *info = par->targinfo; ADT_OPT(add_opt, xt_family(par), info->add_set.dim, info->add_set.flags, 0, UINT_MAX, 0, 0, 0, 0); ADT_OPT(del_opt, xt_family(par), info->del_set.dim, info->del_set.flags, 0, UINT_MAX, 0, 0, 0, 0); if (info->add_set.index != IPSET_INVALID_ID) ip_set_add(info->add_set.index, skb, par, &add_opt); if (info->del_set.index != IPSET_INVALID_ID) ip_set_del(info->del_set.index, skb, par, &del_opt); return XT_CONTINUE; } static int set_target_v1_checkentry(const struct xt_tgchk_param *par) { const struct xt_set_info_target_v1 *info = par->targinfo; ip_set_id_t index; if (info->add_set.index != IPSET_INVALID_ID) { index = ip_set_nfnl_get_byindex(par->net, info->add_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find add_set index %u as target\n", info->add_set.index); return -ENOENT; } } if (info->del_set.index != IPSET_INVALID_ID) { index = ip_set_nfnl_get_byindex(par->net, info->del_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find del_set index %u as target\n", info->del_set.index); if (info->add_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); return -ENOENT; } } if (info->add_set.dim > IPSET_DIM_MAX || info->del_set.dim > IPSET_DIM_MAX) { pr_info_ratelimited("SET target dimension over the limit!\n"); if (info->add_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); if (info->del_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->del_set.index); return -ERANGE; } return 0; } static void set_target_v1_destroy(const struct xt_tgdtor_param *par) { const struct xt_set_info_target_v1 *info = par->targinfo; if (info->add_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); if (info->del_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->del_set.index); } /* Revision 2 target */ static unsigned int set_target_v2(struct sk_buff *skb, const struct xt_action_param *par) { const struct xt_set_info_target_v2 *info = par->targinfo; ADT_OPT(add_opt, xt_family(par), info->add_set.dim, info->add_set.flags, info->flags, info->timeout, 0, 0, 0, 0); ADT_OPT(del_opt, xt_family(par), info->del_set.dim, info->del_set.flags, 0, UINT_MAX, 0, 0, 0, 0); /* Normalize to fit into jiffies */ if (add_opt.ext.timeout != IPSET_NO_TIMEOUT && add_opt.ext.timeout > IPSET_MAX_TIMEOUT) add_opt.ext.timeout = IPSET_MAX_TIMEOUT; if (info->add_set.index != IPSET_INVALID_ID) ip_set_add(info->add_set.index, skb, par, &add_opt); if (info->del_set.index != IPSET_INVALID_ID) ip_set_del(info->del_set.index, skb, par, &del_opt); return XT_CONTINUE; } #define set_target_v2_checkentry set_target_v1_checkentry #define set_target_v2_destroy set_target_v1_destroy /* Revision 3 target */ #define MOPT(opt, member) ((opt).ext.skbinfo.member) static unsigned int set_target_v3(struct sk_buff *skb, const struct xt_action_param *par) { const struct xt_set_info_target_v3 *info = par->targinfo; int ret; ADT_OPT(add_opt, xt_family(par), info->add_set.dim, info->add_set.flags, info->flags, info->timeout, 0, 0, 0, 0); ADT_OPT(del_opt, xt_family(par), info->del_set.dim, info->del_set.flags, 0, UINT_MAX, 0, 0, 0, 0); ADT_OPT(map_opt, xt_family(par), info->map_set.dim, 
info->map_set.flags, 0, UINT_MAX, 0, 0, 0, 0); /* Normalize to fit into jiffies */ if (add_opt.ext.timeout != IPSET_NO_TIMEOUT && add_opt.ext.timeout > IPSET_MAX_TIMEOUT) add_opt.ext.timeout = IPSET_MAX_TIMEOUT; if (info->add_set.index != IPSET_INVALID_ID) ip_set_add(info->add_set.index, skb, par, &add_opt); if (info->del_set.index != IPSET_INVALID_ID) ip_set_del(info->del_set.index, skb, par, &del_opt); if (info->map_set.index != IPSET_INVALID_ID) { map_opt.cmdflags |= info->flags & (IPSET_FLAG_MAP_SKBMARK | IPSET_FLAG_MAP_SKBPRIO | IPSET_FLAG_MAP_SKBQUEUE); ret = match_set(info->map_set.index, skb, par, &map_opt, info->map_set.flags & IPSET_INV_MATCH); if (!ret) return XT_CONTINUE; if (map_opt.cmdflags & IPSET_FLAG_MAP_SKBMARK) skb->mark = (skb->mark & ~MOPT(map_opt,skbmarkmask)) ^ MOPT(map_opt, skbmark); if (map_opt.cmdflags & IPSET_FLAG_MAP_SKBPRIO) skb->priority = MOPT(map_opt, skbprio); if ((map_opt.cmdflags & IPSET_FLAG_MAP_SKBQUEUE) && skb->dev && skb->dev->real_num_tx_queues > MOPT(map_opt, skbqueue)) skb_set_queue_mapping(skb, MOPT(map_opt, skbqueue)); } return XT_CONTINUE; } static int set_target_v3_checkentry(const struct xt_tgchk_param *par) { const struct xt_set_info_target_v3 *info = par->targinfo; ip_set_id_t index; int ret = 0; if (info->add_set.index != IPSET_INVALID_ID) { index = ip_set_nfnl_get_byindex(par->net, info->add_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find add_set index %u as target\n", info->add_set.index); return -ENOENT; } } if (info->del_set.index != IPSET_INVALID_ID) { index = ip_set_nfnl_get_byindex(par->net, info->del_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find del_set index %u as target\n", info->del_set.index); ret = -ENOENT; goto cleanup_add; } } if (info->map_set.index != IPSET_INVALID_ID) { if (strncmp(par->table, "mangle", 7)) { pr_info_ratelimited("--map-set only usable from mangle table\n"); ret = -EINVAL; goto cleanup_del; } if (((info->flags & IPSET_FLAG_MAP_SKBPRIO) | (info->flags & IPSET_FLAG_MAP_SKBQUEUE)) && (par->hook_mask & ~(1 << NF_INET_FORWARD | 1 << NF_INET_LOCAL_OUT | 1 << NF_INET_POST_ROUTING))) { pr_info_ratelimited("mapping of prio or/and queue is allowed only from OUTPUT/FORWARD/POSTROUTING chains\n"); ret = -EINVAL; goto cleanup_del; } index = ip_set_nfnl_get_byindex(par->net, info->map_set.index); if (index == IPSET_INVALID_ID) { pr_info_ratelimited("Cannot find map_set index %u as target\n", info->map_set.index); ret = -ENOENT; goto cleanup_del; } } if (info->add_set.dim > IPSET_DIM_MAX || info->del_set.dim > IPSET_DIM_MAX || info->map_set.dim > IPSET_DIM_MAX) { pr_info_ratelimited("SET target dimension over the limit!\n"); ret = -ERANGE; goto cleanup_mark; } return 0; cleanup_mark: if (info->map_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->map_set.index); cleanup_del: if (info->del_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->del_set.index); cleanup_add: if (info->add_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); return ret; } static void set_target_v3_destroy(const struct xt_tgdtor_param *par) { const struct xt_set_info_target_v3 *info = par->targinfo; if (info->add_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->add_set.index); if (info->del_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->del_set.index); if (info->map_set.index != IPSET_INVALID_ID) ip_set_nfnl_put(par->net, info->map_set.index); } static struct xt_match set_matches[] __read_mostly = { { .name 
= "set", .family = NFPROTO_IPV4, .revision = 0, .match = set_match_v0, .matchsize = sizeof(struct xt_set_info_match_v0), .checkentry = set_match_v0_checkentry, .destroy = set_match_v0_destroy, .me = THIS_MODULE }, { .name = "set", .family = NFPROTO_IPV4, .revision = 1, .match = set_match_v1, .matchsize = sizeof(struct xt_set_info_match_v1), .checkentry = set_match_v1_checkentry, .destroy = set_match_v1_destroy, .me = THIS_MODULE }, { .name = "set", .family = NFPROTO_IPV6, .revision = 1, .match = set_match_v1, .matchsize = sizeof(struct xt_set_info_match_v1), .checkentry = set_match_v1_checkentry, .destroy = set_match_v1_destroy, .me = THIS_MODULE }, /* --return-nomatch flag support */ { .name = "set", .family = NFPROTO_IPV4, .revision = 2, .match = set_match_v1, .matchsize = sizeof(struct xt_set_info_match_v1), .checkentry = set_match_v1_checkentry, .destroy = set_match_v1_destroy, .me = THIS_MODULE }, { .name = "set", .family = NFPROTO_IPV6, .revision = 2, .match = set_match_v1, .matchsize = sizeof(struct xt_set_info_match_v1), .checkentry = set_match_v1_checkentry, .destroy = set_match_v1_destroy, .me = THIS_MODULE }, /* counters support: update, match */ { .name = "set", .family = NFPROTO_IPV4, .revision = 3, .match = set_match_v3, .matchsize = sizeof(struct xt_set_info_match_v3), .checkentry = set_match_v3_checkentry, .destroy = set_match_v3_destroy, .me = THIS_MODULE }, { .name = "set", .family = NFPROTO_IPV6, .revision = 3, .match = set_match_v3, .matchsize = sizeof(struct xt_set_info_match_v3), .checkentry = set_match_v3_checkentry, .destroy = set_match_v3_destroy, .me = THIS_MODULE }, /* new revision for counters support: update, match */ { .name = "set", .family = NFPROTO_IPV4, .revision = 4, .match = set_match_v4, .matchsize = sizeof(struct xt_set_info_match_v4), .checkentry = set_match_v4_checkentry, .destroy = set_match_v4_destroy, .me = THIS_MODULE }, { .name = "set", .family = NFPROTO_IPV6, .revision = 4, .match = set_match_v4, .matchsize = sizeof(struct xt_set_info_match_v4), .checkentry = set_match_v4_checkentry, .destroy = set_match_v4_destroy, .me = THIS_MODULE }, }; static struct xt_target set_targets[] __read_mostly = { { .name = "SET", .revision = 0, .family = NFPROTO_IPV4, .target = set_target_v0, .targetsize = sizeof(struct xt_set_info_target_v0), .checkentry = set_target_v0_checkentry, .destroy = set_target_v0_destroy, .me = THIS_MODULE }, { .name = "SET", .revision = 1, .family = NFPROTO_IPV4, .target = set_target_v1, .targetsize = sizeof(struct xt_set_info_target_v1), .checkentry = set_target_v1_checkentry, .destroy = set_target_v1_destroy, .me = THIS_MODULE }, { .name = "SET", .revision = 1, .family = NFPROTO_IPV6, .target = set_target_v1, .targetsize = sizeof(struct xt_set_info_target_v1), .checkentry = set_target_v1_checkentry, .destroy = set_target_v1_destroy, .me = THIS_MODULE }, /* --timeout and --exist flags support */ { .name = "SET", .revision = 2, .family = NFPROTO_IPV4, .target = set_target_v2, .targetsize = sizeof(struct xt_set_info_target_v2), .checkentry = set_target_v2_checkentry, .destroy = set_target_v2_destroy, .me = THIS_MODULE }, { .name = "SET", .revision = 2, .family = NFPROTO_IPV6, .target = set_target_v2, .targetsize = sizeof(struct xt_set_info_target_v2), .checkentry = set_target_v2_checkentry, .destroy = set_target_v2_destroy, .me = THIS_MODULE }, /* --map-set support */ { .name = "SET", .revision = 3, .family = NFPROTO_IPV4, .target = set_target_v3, .targetsize = sizeof(struct xt_set_info_target_v3), .checkentry = 
set_target_v3_checkentry, .destroy = set_target_v3_destroy, .me = THIS_MODULE }, { .name = "SET", .revision = 3, .family = NFPROTO_IPV6, .target = set_target_v3, .targetsize = sizeof(struct xt_set_info_target_v3), .checkentry = set_target_v3_checkentry, .destroy = set_target_v3_destroy, .me = THIS_MODULE }, }; static int __init xt_set_init(void) { int ret = xt_register_matches(set_matches, ARRAY_SIZE(set_matches)); if (!ret) { ret = xt_register_targets(set_targets, ARRAY_SIZE(set_targets)); if (ret) xt_unregister_matches(set_matches, ARRAY_SIZE(set_matches)); } return ret; } static void __exit xt_set_fini(void) { xt_unregister_matches(set_matches, ARRAY_SIZE(set_matches)); xt_unregister_targets(set_targets, ARRAY_SIZE(set_targets)); } module_init(xt_set_init); module_exit(xt_set_fini); |
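/*
 * Usage sketch (userspace side, not part of this module): rules that end up
 * exercising the matches and targets registered above, assuming an ipset
 * named "blacklist" of type hash:ip already exists:
 *
 *	ipset create blacklist hash:ip
 *	iptables -A INPUT -m set --match-set blacklist src -j DROP
 *	iptables -A OUTPUT -p tcp --dport 25 -j SET --add-set blacklist dst --timeout 600
 *
 * iptables negotiates the highest revision of the "set" match / "SET" target
 * supported by both userspace and the kernel, which is how the v0..v4
 * variants above are selected.
 */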
// SPDX-License-Identifier: GPL-2.0 /* * rtc and date/time utility functions * * Copyright (C) 2005-06 Tower Technologies * Author: Alessandro Zummo <a.zummo@towertech.it> * * based on arch/arm/common/rtctime.c and other bits * * Author: Cassio Neri <cassio.neri@gmail.com> (rtc_time64_to_tm) */ #include <linux/export.h> #include <linux/rtc.h> static const unsigned char rtc_days_in_month[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 }; static const unsigned short rtc_ydays[2][13] = { /* Normal years */ { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365 }, /* Leap years */ { 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366 } }; /* * The number of days in the month. */ int rtc_month_days(unsigned int month, unsigned int year) { return rtc_days_in_month[month] + (is_leap_year(year) && month == 1); } EXPORT_SYMBOL(rtc_month_days); /* * The number of days since January 1. (0 to 365) */ int rtc_year_days(unsigned int day, unsigned int month, unsigned int year) { return rtc_ydays[is_leap_year(year)][month] + day - 1; } EXPORT_SYMBOL(rtc_year_days); /** * rtc_time64_to_tm - converts time64_t to rtc_time. * * @time: The number of seconds since 01-01-1970 00:00:00. * Works for values since at least 1900 * @tm: Pointer to the struct rtc_time. */ void rtc_time64_to_tm(time64_t time, struct rtc_time *tm) { int days, secs; u64 u64tmp; u32 u32tmp, udays, century, day_of_century, year_of_century, year, day_of_year, month, day; bool is_Jan_or_Feb, is_leap_year; /* * Get days and seconds while preserving the sign to * handle negative time values (dates before 1970-01-01) */ days = div_s64_rem(time, 86400, &secs); /* * We need 0 <= secs < 86400 which isn't given for negative * values of time. Fixup accordingly. */ if (secs < 0) { days -= 1; secs += 86400; } /* day of the week, 1970-01-01 was a Thursday */ tm->tm_wday = (days + 4) % 7; /* Ensure tm_wday is always positive */ if (tm->tm_wday < 0) tm->tm_wday += 7; /* * The following algorithm is, basically, Proposition 6.3 of Neri * and Schneider [1]. In a few words: it works on the computational * (fictitious) calendar where the year starts in March, month = 2 * (*), and finishes in February, month = 13. This calendar is * mathematically convenient because the day of the year does not * depend on whether the year is leap or not. For instance: * * March 1st 0-th day of the year; * ... * April 1st 31-st day of the year; * ... * January 1st 306-th day of the year; (Important!) * ... * February 28th 364-th day of the year; * February 29th 365-th day of the year (if it exists).
* * After having worked out the date in the computational calendar * (using just arithmetics) it's easy to convert it to the * corresponding date in the Gregorian calendar. * * [1] "Euclidean Affine Functions and Applications to Calendar * Algorithms". https://arxiv.org/abs/2102.06959 * * (*) The numbering of months follows rtc_time more closely and * thus, is slightly different from [1]. */ udays = days + 719468; u32tmp = 4 * udays + 3; century = u32tmp / 146097; day_of_century = u32tmp % 146097 / 4; u32tmp = 4 * day_of_century + 3; u64tmp = 2939745ULL * u32tmp; year_of_century = upper_32_bits(u64tmp); day_of_year = lower_32_bits(u64tmp) / 2939745 / 4; year = 100 * century + year_of_century; is_leap_year = year_of_century != 0 ? year_of_century % 4 == 0 : century % 4 == 0; u32tmp = 2141 * day_of_year + 132377; month = u32tmp >> 16; day = ((u16) u32tmp) / 2141; /* * Recall that January 01 is the 306-th day of the year in the * computational (not Gregorian) calendar. */ is_Jan_or_Feb = day_of_year >= 306; /* Converts to the Gregorian calendar. */ year = year + is_Jan_or_Feb; month = is_Jan_or_Feb ? month - 12 : month; day = day + 1; day_of_year = is_Jan_or_Feb ? day_of_year - 306 : day_of_year + 31 + 28 + is_leap_year; /* Converts to rtc_time's format. */ tm->tm_year = (int) (year - 1900); tm->tm_mon = (int) month; tm->tm_mday = (int) day; tm->tm_yday = (int) day_of_year + 1; tm->tm_hour = secs / 3600; secs -= tm->tm_hour * 3600; tm->tm_min = secs / 60; tm->tm_sec = secs - tm->tm_min * 60; tm->tm_isdst = 0; } EXPORT_SYMBOL(rtc_time64_to_tm); /* * Does the rtc_time represent a valid date/time? */ int rtc_valid_tm(struct rtc_time *tm) { if (tm->tm_year < 70 || tm->tm_year > (INT_MAX - 1900) || ((unsigned int)tm->tm_mon) >= 12 || tm->tm_mday < 1 || tm->tm_mday > rtc_month_days(tm->tm_mon, ((unsigned int)tm->tm_year + 1900)) || ((unsigned int)tm->tm_hour) >= 24 || ((unsigned int)tm->tm_min) >= 60 || ((unsigned int)tm->tm_sec) >= 60) return -EINVAL; return 0; } EXPORT_SYMBOL(rtc_valid_tm); /* * rtc_tm_to_time64 - Converts rtc_time to time64_t. * Convert Gregorian date to seconds since 01-01-1970 00:00:00. */ time64_t rtc_tm_to_time64(struct rtc_time *tm) { return mktime64(((unsigned int)tm->tm_year + 1900), tm->tm_mon + 1, tm->tm_mday, tm->tm_hour, tm->tm_min, tm->tm_sec); } EXPORT_SYMBOL(rtc_tm_to_time64); /* * Convert rtc_time to ktime */ ktime_t rtc_tm_to_ktime(struct rtc_time tm) { return ktime_set(rtc_tm_to_time64(&tm), 0); } EXPORT_SYMBOL_GPL(rtc_tm_to_ktime); /* * Convert ktime to rtc_time */ struct rtc_time rtc_ktime_to_tm(ktime_t kt) { struct timespec64 ts; struct rtc_time ret; ts = ktime_to_timespec64(kt); /* Round up any ns */ if (ts.tv_nsec) ts.tv_sec++; rtc_time64_to_tm(ts.tv_sec, &ret); return ret; } EXPORT_SYMBOL_GPL(rtc_ktime_to_tm); |
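/*
 * Worked example for rtc_time64_to_tm() above, with time = 0
 * (1970-01-01 00:00:00 UTC); intermediate values follow the computational
 * (March-based) calendar described in the comments:
 *
 *	days = 0, secs = 0, tm_wday = (0 + 4) % 7 = 4 (Thursday)
 *	udays = 719468, century = 19, year_of_century = 69
 *	day_of_year = 306 (i.e. January 1st in the computational calendar)
 *	month = 12, day = 0 before conversion; is_Jan_or_Feb = true
 *
 * so year = 1969 + 1 = 1970, month = 12 - 12 = 0, day = 0 + 1 = 1, giving
 * tm_year = 70, tm_mon = 0, tm_mday = 1, tm_yday = 1.
 */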
// SPDX-License-Identifier: GPL-2.0-only /* * Copyright (C) 2013 Politecnico di Torino, Italy * TORSEC group -- https://security.polito.it * * Author: Roberto Sassu <roberto.sassu@polito.it> * * File: ima_template.c * Helpers to manage template descriptors.
*/ #include <linux/rculist.h> #include "ima.h" #include "ima_template_lib.h" enum header_fields { HDR_PCR, HDR_DIGEST, HDR_TEMPLATE_NAME, HDR_TEMPLATE_DATA, HDR__LAST }; static struct ima_template_desc builtin_templates[] = { {.name = IMA_TEMPLATE_IMA_NAME, .fmt = IMA_TEMPLATE_IMA_FMT}, {.name = "ima-ng", .fmt = "d-ng|n-ng"}, {.name = "ima-sig", .fmt = "d-ng|n-ng|sig"}, {.name = "ima-ngv2", .fmt = "d-ngv2|n-ng"}, {.name = "ima-sigv2", .fmt = "d-ngv2|n-ng|sig"}, {.name = "ima-buf", .fmt = "d-ng|n-ng|buf"}, {.name = "ima-modsig", .fmt = "d-ng|n-ng|sig|d-modsig|modsig"}, {.name = "evm-sig", .fmt = "d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode"}, {.name = "", .fmt = ""}, /* placeholder for a custom format */ }; static LIST_HEAD(defined_templates); static DEFINE_SPINLOCK(template_list); static int template_setup_done; static const struct ima_template_field supported_fields[] = { {.field_id = "d", .field_init = ima_eventdigest_init, .field_show = ima_show_template_digest}, {.field_id = "n", .field_init = ima_eventname_init, .field_show = ima_show_template_string}, {.field_id = "d-ng", .field_init = ima_eventdigest_ng_init, .field_show = ima_show_template_digest_ng}, {.field_id = "d-ngv2", .field_init = ima_eventdigest_ngv2_init, .field_show = ima_show_template_digest_ngv2}, {.field_id = "n-ng", .field_init = ima_eventname_ng_init, .field_show = ima_show_template_string}, {.field_id = "sig", .field_init = ima_eventsig_init, .field_show = ima_show_template_sig}, {.field_id = "buf", .field_init = ima_eventbuf_init, .field_show = ima_show_template_buf}, {.field_id = "d-modsig", .field_init = ima_eventdigest_modsig_init, .field_show = ima_show_template_digest_ng}, {.field_id = "modsig", .field_init = ima_eventmodsig_init, .field_show = ima_show_template_sig}, {.field_id = "evmsig", .field_init = ima_eventevmsig_init, .field_show = ima_show_template_sig}, {.field_id = "iuid", .field_init = ima_eventinodeuid_init, .field_show = ima_show_template_uint}, {.field_id = "igid", .field_init = ima_eventinodegid_init, .field_show = ima_show_template_uint}, {.field_id = "imode", .field_init = ima_eventinodemode_init, .field_show = ima_show_template_uint}, {.field_id = "xattrnames", .field_init = ima_eventinodexattrnames_init, .field_show = ima_show_template_string}, {.field_id = "xattrlengths", .field_init = ima_eventinodexattrlengths_init, .field_show = ima_show_template_sig}, {.field_id = "xattrvalues", .field_init = ima_eventinodexattrvalues_init, .field_show = ima_show_template_sig}, }; /* * Used when restoring measurements carried over from a kexec. 'd' and 'n' don't * need to be accounted for since they shouldn't be defined in the same template * description as 'd-ng' and 'n-ng' respectively. */ #define MAX_TEMPLATE_NAME_LEN \ sizeof("d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode") static struct ima_template_desc *ima_template; static struct ima_template_desc *ima_buf_template; /** * ima_template_has_modsig - Check whether template has modsig-related fields. * @ima_template: IMA template to check. * * Tells whether the given template has fields referencing a file's appended * signature. 
*/ bool ima_template_has_modsig(const struct ima_template_desc *ima_template) { int i; for (i = 0; i < ima_template->num_fields; i++) if (!strcmp(ima_template->fields[i]->field_id, "modsig") || !strcmp(ima_template->fields[i]->field_id, "d-modsig")) return true; return false; } static int __init ima_template_setup(char *str) { struct ima_template_desc *template_desc; int template_len = strlen(str); if (template_setup_done) return 1; if (!ima_template) ima_init_template_list(); /* * Verify that a template with the supplied name exists. * If not, use CONFIG_IMA_DEFAULT_TEMPLATE. */ template_desc = lookup_template_desc(str); if (!template_desc) { pr_err("template %s not found, using %s\n", str, CONFIG_IMA_DEFAULT_TEMPLATE); return 1; } /* * Verify whether the current hash algorithm is supported * by the 'ima' template. */ if (template_len == 3 && strcmp(str, IMA_TEMPLATE_IMA_NAME) == 0 && ima_hash_algo != HASH_ALGO_SHA1 && ima_hash_algo != HASH_ALGO_MD5) { pr_err("template does not support hash alg\n"); return 1; } ima_template = template_desc; template_setup_done = 1; return 1; } __setup("ima_template=", ima_template_setup); static int __init ima_template_fmt_setup(char *str) { int num_templates = ARRAY_SIZE(builtin_templates); if (template_setup_done) return 1; if (template_desc_init_fields(str, NULL, NULL) < 0) { pr_err("format string '%s' not valid, using template %s\n", str, CONFIG_IMA_DEFAULT_TEMPLATE); return 1; } builtin_templates[num_templates - 1].fmt = str; ima_template = builtin_templates + num_templates - 1; template_setup_done = 1; return 1; } __setup("ima_template_fmt=", ima_template_fmt_setup); struct ima_template_desc *lookup_template_desc(const char *name) { struct ima_template_desc *template_desc; int found = 0; rcu_read_lock(); list_for_each_entry_rcu(template_desc, &defined_templates, list) { if ((strcmp(template_desc->name, name) == 0) || (strcmp(template_desc->fmt, name) == 0)) { found = 1; break; } } rcu_read_unlock(); return found ? template_desc : NULL; } static const struct ima_template_field * lookup_template_field(const char *field_id) { int i; for (i = 0; i < ARRAY_SIZE(supported_fields); i++) if (strncmp(supported_fields[i].field_id, field_id, IMA_TEMPLATE_FIELD_ID_MAX_LEN) == 0) return &supported_fields[i]; return NULL; } static int template_fmt_size(const char *template_fmt) { char c; int template_fmt_len = strlen(template_fmt); int i = 0, j = 0; while (i < template_fmt_len) { c = template_fmt[i]; if (c == '|') j++; i++; } return j + 1; } int template_desc_init_fields(const char *template_fmt, const struct ima_template_field ***fields, int *num_fields) { const char *template_fmt_ptr; const struct ima_template_field *found_fields[IMA_TEMPLATE_NUM_FIELDS_MAX]; int template_num_fields; int i, len; if (num_fields && *num_fields > 0) /* already initialized? 
*/ return 0; template_num_fields = template_fmt_size(template_fmt); if (template_num_fields > IMA_TEMPLATE_NUM_FIELDS_MAX) { pr_err("format string '%s' contains too many fields\n", template_fmt); return -EINVAL; } for (i = 0, template_fmt_ptr = template_fmt; i < template_num_fields; i++, template_fmt_ptr += len + 1) { char tmp_field_id[IMA_TEMPLATE_FIELD_ID_MAX_LEN + 1]; len = strchrnul(template_fmt_ptr, '|') - template_fmt_ptr; if (len == 0 || len > IMA_TEMPLATE_FIELD_ID_MAX_LEN) { pr_err("Invalid field with length %d\n", len); return -EINVAL; } memcpy(tmp_field_id, template_fmt_ptr, len); tmp_field_id[len] = '\0'; found_fields[i] = lookup_template_field(tmp_field_id); if (!found_fields[i]) { pr_err("field '%s' not found\n", tmp_field_id); return -ENOENT; } } if (fields && num_fields) { *fields = kmalloc_array(i, sizeof(**fields), GFP_KERNEL); if (*fields == NULL) return -ENOMEM; memcpy(*fields, found_fields, i * sizeof(**fields)); *num_fields = i; } return 0; } void ima_init_template_list(void) { int i; if (!list_empty(&defined_templates)) return; spin_lock(&template_list); for (i = 0; i < ARRAY_SIZE(builtin_templates); i++) { list_add_tail_rcu(&builtin_templates[i].list, &defined_templates); } spin_unlock(&template_list); } struct ima_template_desc *ima_template_desc_current(void) { if (!ima_template) { ima_init_template_list(); ima_template = lookup_template_desc(CONFIG_IMA_DEFAULT_TEMPLATE); } return ima_template; } struct ima_template_desc *ima_template_desc_buf(void) { if (!ima_buf_template) { ima_init_template_list(); ima_buf_template = lookup_template_desc("ima-buf"); } return ima_buf_template; } int __init ima_init_template(void) { struct ima_template_desc *template = ima_template_desc_current(); int result; result = template_desc_init_fields(template->fmt, &(template->fields), &(template->num_fields)); if (result < 0) { pr_err("template %s init failed, result: %d\n", (strlen(template->name) ? template->name : template->fmt), result); return result; } template = ima_template_desc_buf(); if (!template) { pr_err("Failed to get ima-buf template\n"); return -EINVAL; } result = template_desc_init_fields(template->fmt, &(template->fields), &(template->num_fields)); if (result < 0) pr_err("template %s init failed, result: %d\n", (strlen(template->name) ? 
template->name : template->fmt), result); return result; } static struct ima_template_desc *restore_template_fmt(char *template_name) { struct ima_template_desc *template_desc = NULL; int ret; ret = template_desc_init_fields(template_name, NULL, NULL); if (ret < 0) { pr_err("attempting to initialize the template \"%s\" failed\n", template_name); goto out; } template_desc = kzalloc(sizeof(*template_desc), GFP_KERNEL); if (!template_desc) goto out; template_desc->name = ""; template_desc->fmt = kstrdup(template_name, GFP_KERNEL); if (!template_desc->fmt) { kfree(template_desc); template_desc = NULL; goto out; } spin_lock(&template_list); list_add_tail_rcu(&template_desc->list, &defined_templates); spin_unlock(&template_list); out: return template_desc; } static int ima_restore_template_data(struct ima_template_desc *template_desc, void *template_data, int template_data_size, struct ima_template_entry **entry) { struct tpm_digest *digests; int ret = 0; int i; *entry = kzalloc(struct_size(*entry, template_data, template_desc->num_fields), GFP_NOFS); if (!*entry) return -ENOMEM; digests = kcalloc(NR_BANKS(ima_tpm_chip) + ima_extra_slots, sizeof(*digests), GFP_NOFS); if (!digests) { kfree(*entry); return -ENOMEM; } (*entry)->digests = digests; ret = ima_parse_buf(template_data, template_data + template_data_size, NULL, template_desc->num_fields, (*entry)->template_data, NULL, NULL, ENFORCE_FIELDS | ENFORCE_BUFEND, "template data"); if (ret < 0) { kfree((*entry)->digests); kfree(*entry); return ret; } (*entry)->template_desc = template_desc; for (i = 0; i < template_desc->num_fields; i++) { struct ima_field_data *field_data = &(*entry)->template_data[i]; u8 *data = field_data->data; (*entry)->template_data[i].data = kzalloc(field_data->len + 1, GFP_KERNEL); if (!(*entry)->template_data[i].data) { ret = -ENOMEM; break; } memcpy((*entry)->template_data[i].data, data, field_data->len); (*entry)->template_data_len += sizeof(field_data->len); (*entry)->template_data_len += field_data->len; } if (ret < 0) { ima_free_template_entry(*entry); *entry = NULL; } return ret; } /* Restore the serialized binary measurement list without extending PCRs. 
*/ int ima_restore_measurement_list(loff_t size, void *buf) { char template_name[MAX_TEMPLATE_NAME_LEN]; unsigned char zero[TPM_DIGEST_SIZE] = { 0 }; struct ima_kexec_hdr *khdr = buf; struct ima_field_data hdr[HDR__LAST] = { [HDR_PCR] = {.len = sizeof(u32)}, [HDR_DIGEST] = {.len = TPM_DIGEST_SIZE}, }; void *bufp = buf + sizeof(*khdr); void *bufendp; struct ima_template_entry *entry; struct ima_template_desc *template_desc; DECLARE_BITMAP(hdr_mask, HDR__LAST); unsigned long count = 0; int ret = 0; if (!buf || size < sizeof(*khdr)) return 0; if (ima_canonical_fmt) { khdr->version = le16_to_cpu((__force __le16)khdr->version); khdr->count = le64_to_cpu((__force __le64)khdr->count); khdr->buffer_size = le64_to_cpu((__force __le64)khdr->buffer_size); } if (khdr->version != 1) { pr_err("attempting to restore a incompatible measurement list"); return -EINVAL; } if (khdr->count > ULONG_MAX - 1) { pr_err("attempting to restore too many measurements"); return -EINVAL; } bitmap_zero(hdr_mask, HDR__LAST); bitmap_set(hdr_mask, HDR_PCR, 1); bitmap_set(hdr_mask, HDR_DIGEST, 1); /* * ima kexec buffer prefix: version, buffer size, count * v1 format: pcr, digest, template-name-len, template-name, * template-data-size, template-data */ bufendp = buf + khdr->buffer_size; while ((bufp < bufendp) && (count++ < khdr->count)) { int enforce_mask = ENFORCE_FIELDS; enforce_mask |= (count == khdr->count) ? ENFORCE_BUFEND : 0; ret = ima_parse_buf(bufp, bufendp, &bufp, HDR__LAST, hdr, NULL, hdr_mask, enforce_mask, "entry header"); if (ret < 0) break; if (hdr[HDR_TEMPLATE_NAME].len >= MAX_TEMPLATE_NAME_LEN) { pr_err("attempting to restore a template name that is too long\n"); ret = -EINVAL; break; } /* template name is not null terminated */ memcpy(template_name, hdr[HDR_TEMPLATE_NAME].data, hdr[HDR_TEMPLATE_NAME].len); template_name[hdr[HDR_TEMPLATE_NAME].len] = 0; if (strcmp(template_name, "ima") == 0) { pr_err("attempting to restore an unsupported template \"%s\" failed\n", template_name); ret = -EINVAL; break; } template_desc = lookup_template_desc(template_name); if (!template_desc) { template_desc = restore_template_fmt(template_name); if (!template_desc) break; } /* * Only the running system's template format is initialized * on boot. As needed, initialize the other template formats. */ ret = template_desc_init_fields(template_desc->fmt, &(template_desc->fields), &(template_desc->num_fields)); if (ret < 0) { pr_err("attempting to restore the template fmt \"%s\" failed\n", template_desc->fmt); ret = -EINVAL; break; } ret = ima_restore_template_data(template_desc, hdr[HDR_TEMPLATE_DATA].data, hdr[HDR_TEMPLATE_DATA].len, &entry); if (ret < 0) break; if (memcmp(hdr[HDR_DIGEST].data, zero, sizeof(zero))) { ret = ima_calc_field_array_hash( &entry->template_data[0], entry); if (ret < 0) { pr_err("cannot calculate template digest\n"); ret = -EINVAL; break; } } entry->pcr = !ima_canonical_fmt ? *(u32 *)(hdr[HDR_PCR].data) : le32_to_cpu(*(__le32 *)(hdr[HDR_PCR].data)); ret = ima_restore_measurement_entry(entry); if (ret < 0) break; } return ret; } |
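/*
 * Illustration (layout only): the v1 kexec buffer parsed by
 * ima_restore_measurement_list() above consists of
 *
 *	struct ima_kexec_hdr { version, buffer_size, count }
 *	then, per measurement entry:
 *	  pcr (u32) | digest (TPM_DIGEST_SIZE bytes) |
 *	  template-name-len | template-name (not NUL terminated) |
 *	  template-data-size | template-data
 *
 * Each entry is re-parsed with ima_parse_buf(), its template format is looked
 * up (or registered via restore_template_fmt()) and initialized with
 * template_desc_init_fields(); for entries with a non-zero digest the
 * template digest is recalculated with ima_calc_field_array_hash() before the
 * entry is re-added through ima_restore_measurement_entry().
 */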
/* * Copyright 2017 Red Hat * Parts ported from amdgpu (fence wait code). * Copyright 2016 Advanced Micro Devices, Inc. * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice (including the next * paragraph) shall be included in all copies or substantial portions of the * Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * Authors: * */ /** * DOC: Overview * * DRM synchronisation objects (syncobj, see struct &drm_syncobj) provide a * container for a synchronization primitive which can be used by userspace * to explicitly synchronize GPU commands, can be shared between userspace * processes, and can be shared between different DRM drivers. * Their primary use-case is to implement Vulkan fences and semaphores. * The syncobj userspace API provides ioctls for several operations: * * - Creation and destruction of syncobjs * - Import and export of syncobjs to/from a syncobj file descriptor * - Import and export a syncobj's underlying fence to/from a sync file * - Reset a syncobj (set its fence to NULL) * - Signal a syncobj (set a trivially signaled fence) * - Wait for a syncobj's fence to appear and be signaled * * The syncobj userspace API also provides operations to manipulate a syncobj * in terms of a timeline of struct &dma_fence_chain rather than a single * struct &dma_fence, through the following operations: * * - Signal a given point on the timeline * - Wait for a given point to appear and/or be signaled * - Import and export from/to a given point of a timeline * * At its core, a syncobj is simply a wrapper around a pointer to a struct * &dma_fence which may be NULL. * When a syncobj is first created, its pointer is either NULL or a pointer * to an already signaled fence depending on whether the * &DRM_SYNCOBJ_CREATE_SIGNALED flag is passed to * &DRM_IOCTL_SYNCOBJ_CREATE. * * If the syncobj is considered as a binary (its state is either signaled or * unsignaled) primitive, when GPU work is enqueued in a DRM driver to signal * the syncobj, the syncobj's fence is replaced with a fence which will be * signaled by the completion of that work.
* If the syncobj is considered as a timeline primitive, when GPU work is * enqueued in a DRM driver to signal a given point of the syncobj, a new * struct &dma_fence_chain is created, pointing to the DRM driver's fence and * also to the previous fence that was in the syncobj. The new struct * &dma_fence_chain fence replaces the syncobj's fence and will be signaled by * the completion of the DRM driver's work and also by any work associated with * the fence previously in the syncobj. * * When GPU work which waits on a syncobj is enqueued in a DRM driver, at the * time the work is enqueued, it waits on the syncobj's fence before * submitting the work to hardware. That fence is either: * * - The syncobj's current fence if the syncobj is considered as a binary * primitive. * - The struct &dma_fence associated with a given point if the syncobj is * considered as a timeline primitive. * * If the syncobj's fence is NULL or not present in the syncobj's timeline, * the enqueue operation is expected to fail. * * With a binary syncobj, all manipulation of the syncobj's fence happens in * terms of the current fence at the time the ioctl is called by userspace, * regardless of whether that operation is an immediate host-side operation * (signal or reset) or an operation which is enqueued in some driver * queue. &DRM_IOCTL_SYNCOBJ_RESET and &DRM_IOCTL_SYNCOBJ_SIGNAL can be used * to manipulate a syncobj from the host by resetting its pointer to NULL or * setting its pointer to a fence which is already signaled. * * With a timeline syncobj, all manipulation of the syncobj's fence happens in * terms of a u64 value referring to a point in the timeline. See * dma_fence_chain_find_seqno() to see how a given point is found in the * timeline. * * Note that applications should be careful to always use the timeline set of * ioctl()s when dealing with a syncobj considered as a timeline. Using the * binary set of ioctl()s with a syncobj considered as a timeline could result * in incorrect synchronization. The use of binary syncobjs is supported * through the timeline set of ioctl()s by using a point value of 0; this * reproduces the behavior of the binary set of ioctl()s (for example, * replacing the syncobj's fence when signaling). * * * Host-side wait on syncobjs * -------------------------- * * &DRM_IOCTL_SYNCOBJ_WAIT takes an array of syncobj handles and does a * host-side wait on all of the syncobj fences simultaneously. * If &DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL is set, the wait ioctl will wait on * all of the syncobj fences to be signaled before it returns. * Otherwise, it returns once at least one syncobj fence has been signaled * and the index of a signaled fence is written back to the client. * * Unlike the enqueued GPU work dependencies which fail if they see a NULL * fence in a syncobj, if &DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is set, * the host-side wait will first wait for the syncobj to receive a non-NULL * fence and then wait on that fence. * If &DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is not set and any one of the * syncobjs in the array has a NULL fence, -EINVAL will be returned. * Assuming the syncobj starts off with a NULL fence, this allows a client * to do a host wait in one thread (or process) which waits on GPU work * submitted in another thread (or process) without having to manually * synchronize between the two. * This requirement is inherited from the Vulkan fence API.
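 *
 * As a purely illustrative userspace sketch (not part of this file; the
 * drm_fd, handle and absolute_timeout_ns variables are assumed to exist),
 * a host-side wait that also waits for a fence to be submitted could look
 * like::
 *
 *     struct drm_syncobj_wait wait = {
 *             .handles = (__u64)(uintptr_t)&handle,
 *             .count_handles = 1,
 *             .timeout_nsec = absolute_timeout_ns,
 *             .flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
 *     };
 *
 *     ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_WAIT, &wait);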
* * If &DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE is set, the ioctl will also set * a fence deadline hint on the backing fences before waiting, to provide the * fence signaler with an appropriate sense of urgency. The deadline is * specified as an absolute &CLOCK_MONOTONIC value in units of ns. * * Similarly, &DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT takes an array of syncobj * handles as well as an array of u64 points and does a host-side wait on all * of the syncobj fences at the given points simultaneously. * * &DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT also adds the ability to wait for a given * fence to materialize on the timeline without waiting for the fence to be * signaled by using the &DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE flag. This * requirement is inherited from the wait-before-signal behavior required by * the Vulkan timeline semaphore API. * * Alternatively, &DRM_IOCTL_SYNCOBJ_EVENTFD can be used to wait without * blocking: an eventfd will be signaled when the syncobj is. This is useful to * integrate the wait in an event loop. * * * Import/export of syncobjs * ------------------------- * * &DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE and &DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD * provide two mechanisms for import/export of syncobjs. * * The first lets the client import or export an entire syncobj to a file * descriptor. * These fds are opaque and have no other use except passing the * syncobj between processes. * All exported file descriptors and any syncobj handles created as a * result of importing those file descriptors own a reference to the * same underlying struct &drm_syncobj and the syncobj can be used * persistently across all the processes with which it is shared. * The syncobj is freed only once the last reference is dropped. * Unlike dma-buf, importing a syncobj creates a new handle (with its own * reference) for every import instead of de-duplicating. * The primary use-case of this persistent import/export is for shared * Vulkan fences and semaphores. * * The second import/export mechanism, which is indicated by * &DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE or * &DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE, lets the client * import/export the syncobj's current fence from/to a &sync_file. * When a syncobj is exported to a sync file, that sync file wraps the * syncobj's fence at the time of export and any later signal or reset * operations on the syncobj will not affect the exported sync file. * When a sync file is imported into a syncobj, the syncobj's fence is set * to the fence wrapped by that sync file. * Because sync files are immutable, resetting or signaling the syncobj * will not affect any sync files whose fences have been imported into the * syncobj. * * * Import/export of timeline points in timeline syncobjs * ----------------------------------------------------- * * &DRM_IOCTL_SYNCOBJ_TRANSFER provides a mechanism to transfer a struct * &dma_fence_chain of a syncobj at a given u64 point to another u64 point * into another syncobj. * * Note that if you want to transfer a struct &dma_fence_chain from a given * point on a timeline syncobj from/into a binary syncobj, you can use the * point 0 to mean take/replace the fence in the syncobj.
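 *
 * As an illustration (userspace sketch only; drm_fd, timeline_handle and
 * binary_handle are assumed to exist), materializing timeline point 3 as the
 * fence of a binary syncobj could be done with::
 *
 *     struct drm_syncobj_transfer xfer = {
 *             .src_handle = timeline_handle,
 *             .dst_handle = binary_handle,
 *             .src_point  = 3,
 *             .dst_point  = 0,
 *     };
 *
 *     ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_TRANSFER, &xfer);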
*/ #include <linux/anon_inodes.h> #include <linux/dma-fence-unwrap.h> #include <linux/eventfd.h> #include <linux/file.h> #include <linux/fs.h> #include <linux/sched/signal.h> #include <linux/sync_file.h> #include <linux/uaccess.h> #include <drm/drm.h> #include <drm/drm_drv.h> #include <drm/drm_file.h> #include <drm/drm_gem.h> #include <drm/drm_print.h> #include <drm/drm_syncobj.h> #include <drm/drm_utils.h> #include "drm_internal.h" struct syncobj_wait_entry { struct list_head node; struct task_struct *task; struct dma_fence *fence; struct dma_fence_cb fence_cb; u64 point; }; static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj, struct syncobj_wait_entry *wait); struct syncobj_eventfd_entry { struct list_head node; struct dma_fence *fence; struct dma_fence_cb fence_cb; struct drm_syncobj *syncobj; struct eventfd_ctx *ev_fd_ctx; u64 point; u32 flags; }; static void syncobj_eventfd_entry_func(struct drm_syncobj *syncobj, struct syncobj_eventfd_entry *entry); /** * drm_syncobj_find - lookup and reference a sync object. * @file_private: drm file private pointer * @handle: sync object handle to lookup. * * Returns a reference to the syncobj pointed to by handle or NULL. The * reference must be released by calling drm_syncobj_put(). */ struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private, u32 handle) { struct drm_syncobj *syncobj; spin_lock(&file_private->syncobj_table_lock); /* Check if we currently have a reference on the object */ syncobj = idr_find(&file_private->syncobj_idr, handle); if (syncobj) drm_syncobj_get(syncobj); spin_unlock(&file_private->syncobj_table_lock); return syncobj; } EXPORT_SYMBOL(drm_syncobj_find); static void drm_syncobj_fence_add_wait(struct drm_syncobj *syncobj, struct syncobj_wait_entry *wait) { struct dma_fence *fence; if (wait->fence) return; spin_lock(&syncobj->lock); /* We've already tried once to get a fence and failed. Now that we * have the lock, try one more time just to be sure we don't add a * callback when a fence has already been set. */ fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, 1)); if (!fence || dma_fence_chain_find_seqno(&fence, wait->point)) { dma_fence_put(fence); list_add_tail(&wait->node, &syncobj->cb_list); } else if (!fence) { wait->fence = dma_fence_get_stub(); } else { wait->fence = fence; } spin_unlock(&syncobj->lock); } static void drm_syncobj_remove_wait(struct drm_syncobj *syncobj, struct syncobj_wait_entry *wait) { if (!wait->node.next) return; spin_lock(&syncobj->lock); list_del_init(&wait->node); spin_unlock(&syncobj->lock); } static void syncobj_eventfd_entry_free(struct syncobj_eventfd_entry *entry) { eventfd_ctx_put(entry->ev_fd_ctx); dma_fence_put(entry->fence); /* This happens either inside the syncobj lock, or after the node has * already been removed from the list. */ list_del(&entry->node); kfree(entry); } static void drm_syncobj_add_eventfd(struct drm_syncobj *syncobj, struct syncobj_eventfd_entry *entry) { spin_lock(&syncobj->lock); list_add_tail(&entry->node, &syncobj->ev_fd_list); syncobj_eventfd_entry_func(syncobj, entry); spin_unlock(&syncobj->lock); } /** * drm_syncobj_add_point - add new timeline point to the syncobj * @syncobj: sync object to add timeline point do * @chain: chain node to use to add the point * @fence: fence to encapsulate in the chain node * @point: sequence number to use for the point * * Add the chain node as new timeline point to the syncobj. 
*/ void drm_syncobj_add_point(struct drm_syncobj *syncobj, struct dma_fence_chain *chain, struct dma_fence *fence, uint64_t point) { struct syncobj_wait_entry *wait_cur, *wait_tmp; struct syncobj_eventfd_entry *ev_fd_cur, *ev_fd_tmp; struct dma_fence *prev; dma_fence_get(fence); spin_lock(&syncobj->lock); prev = drm_syncobj_fence_get(syncobj); /* Adding an out-of-order point to the timeline could cause the payload returned from query_ioctl to be 0! */ if (prev && prev->seqno >= point) DRM_DEBUG("You are adding an out-of-order point to the timeline!\n"); dma_fence_chain_init(chain, prev, fence, point); rcu_assign_pointer(syncobj->fence, &chain->base); list_for_each_entry_safe(wait_cur, wait_tmp, &syncobj->cb_list, node) syncobj_wait_syncobj_func(syncobj, wait_cur); list_for_each_entry_safe(ev_fd_cur, ev_fd_tmp, &syncobj->ev_fd_list, node) syncobj_eventfd_entry_func(syncobj, ev_fd_cur); spin_unlock(&syncobj->lock); /* Walk the chain once to trigger garbage collection */ dma_fence_chain_for_each(fence, prev); dma_fence_put(prev); } EXPORT_SYMBOL(drm_syncobj_add_point); /** * drm_syncobj_replace_fence - replace fence in a sync object. * @syncobj: Sync object to replace fence in * @fence: fence to install in the sync object. * * This replaces the fence on a sync object. */ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *fence) { struct dma_fence *old_fence; struct syncobj_wait_entry *wait_cur, *wait_tmp; struct syncobj_eventfd_entry *ev_fd_cur, *ev_fd_tmp; if (fence) dma_fence_get(fence); spin_lock(&syncobj->lock); old_fence = rcu_dereference_protected(syncobj->fence, lockdep_is_held(&syncobj->lock)); rcu_assign_pointer(syncobj->fence, fence); if (fence != old_fence) { list_for_each_entry_safe(wait_cur, wait_tmp, &syncobj->cb_list, node) syncobj_wait_syncobj_func(syncobj, wait_cur); list_for_each_entry_safe(ev_fd_cur, ev_fd_tmp, &syncobj->ev_fd_list, node) syncobj_eventfd_entry_func(syncobj, ev_fd_cur); } spin_unlock(&syncobj->lock); dma_fence_put(old_fence); } EXPORT_SYMBOL(drm_syncobj_replace_fence); /** * drm_syncobj_assign_null_handle - assign a stub fence to the sync object * @syncobj: sync object to assign the fence on * * Assign an already signaled stub fence to the sync object. */ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj) { struct dma_fence *fence = dma_fence_allocate_private_stub(ktime_get()); if (!fence) return -ENOMEM; drm_syncobj_replace_fence(syncobj, fence); dma_fence_put(fence); return 0; } /* 5s default for wait submission */ #define DRM_SYNCOBJ_WAIT_FOR_SUBMIT_TIMEOUT 5000000000ULL /** * drm_syncobj_find_fence - lookup and reference the fence in a sync object * @file_private: drm file private pointer * @handle: sync object handle to lookup. * @point: timeline point * @flags: DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT or not * @fence: out parameter for the fence * * This is just a convenience function that combines drm_syncobj_find() and * drm_syncobj_fence_get(). * * Returns 0 on success or a negative error value on failure. On success @fence * contains a reference to the fence, which must be released by calling * dma_fence_put().
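 *
 * A minimal in-kernel usage sketch (hypothetical caller; the binary-syncobj
 * case with point 0 and no flags is shown)::
 *
 *     struct dma_fence *fence;
 *     int ret = drm_syncobj_find_fence(file_private, handle, 0, 0, &fence);
 *
 *     if (!ret) {
 *             dma_fence_wait(fence, true);
 *             dma_fence_put(fence);
 *     }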
*/ int drm_syncobj_find_fence(struct drm_file *file_private, u32 handle, u64 point, u64 flags, struct dma_fence **fence) { struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle); struct syncobj_wait_entry wait; u64 timeout = nsecs_to_jiffies64(DRM_SYNCOBJ_WAIT_FOR_SUBMIT_TIMEOUT); int ret; if (flags & ~DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) return -EINVAL; if (!syncobj) return -ENOENT; /* Waiting for userspace while holding locks is illegal because it can * trivially deadlock with page faults, for example. Make lockdep complain * about it early on. */ if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) { might_sleep(); lockdep_assert_none_held_once(); } *fence = drm_syncobj_fence_get(syncobj); if (*fence) { ret = dma_fence_chain_find_seqno(fence, point); if (!ret) { /* If the requested seqno is already signaled, * drm_syncobj_find_fence may return a NULL * fence. To make sure the recipient gets * signalled, use a new fence instead. */ if (!*fence) *fence = dma_fence_get_stub(); goto out; } dma_fence_put(*fence); } else { ret = -EINVAL; } if (!(flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)) goto out; memset(&wait, 0, sizeof(wait)); wait.task = current; wait.point = point; drm_syncobj_fence_add_wait(syncobj, &wait); do { set_current_state(TASK_INTERRUPTIBLE); if (wait.fence) { ret = 0; break; } if (timeout == 0) { ret = -ETIME; break; } if (signal_pending(current)) { ret = -ERESTARTSYS; break; } timeout = schedule_timeout(timeout); } while (1); __set_current_state(TASK_RUNNING); *fence = wait.fence; if (wait.node.next) drm_syncobj_remove_wait(syncobj, &wait); out: drm_syncobj_put(syncobj); return ret; } EXPORT_SYMBOL(drm_syncobj_find_fence); /** * drm_syncobj_free - free a sync object. * @kref: kref to free. * * Only to be called from kref_put in drm_syncobj_put. */ void drm_syncobj_free(struct kref *kref) { struct drm_syncobj *syncobj = container_of(kref, struct drm_syncobj, refcount); struct syncobj_eventfd_entry *ev_fd_cur, *ev_fd_tmp; drm_syncobj_replace_fence(syncobj, NULL); list_for_each_entry_safe(ev_fd_cur, ev_fd_tmp, &syncobj->ev_fd_list, node) syncobj_eventfd_entry_free(ev_fd_cur); kfree(syncobj); } EXPORT_SYMBOL(drm_syncobj_free); /** * drm_syncobj_create - create a new syncobj * @out_syncobj: returned syncobj * @flags: DRM_SYNCOBJ_* flags * @fence: if non-NULL, the syncobj will represent this fence * * This is the first function to create a sync object. After creating, drivers * probably want to make it available to userspace, either through * drm_syncobj_get_handle() or drm_syncobj_get_fd(). * * Returns 0 on success or a negative error value on failure. */ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags, struct dma_fence *fence) { int ret; struct drm_syncobj *syncobj; syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL); if (!syncobj) return -ENOMEM; kref_init(&syncobj->refcount); INIT_LIST_HEAD(&syncobj->cb_list); INIT_LIST_HEAD(&syncobj->ev_fd_list); spin_lock_init(&syncobj->lock); if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) { ret = drm_syncobj_assign_null_handle(syncobj); if (ret < 0) { drm_syncobj_put(syncobj); return ret; } } if (fence) drm_syncobj_replace_fence(syncobj, fence); *out_syncobj = syncobj; return 0; } EXPORT_SYMBOL(drm_syncobj_create); /** * drm_syncobj_get_handle - get a handle from a syncobj * @file_private: drm file private pointer * @syncobj: Sync object to export * @handle: out parameter with the new handle * * Exports a sync object created with drm_syncobj_create() as a handle on * @file_private to userspace.
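 *
 * A typical pattern (a sketch of what drm_syncobj_create_as_handle() below
 * does) is to create the object, publish the handle and drop the local
 * reference::
 *
 *     ret = drm_syncobj_create(&syncobj, 0, NULL);
 *     if (!ret) {
 *             ret = drm_syncobj_get_handle(file_private, syncobj, &handle);
 *             drm_syncobj_put(syncobj);
 *     }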
* * Returns 0 on success or a negative error value on failure. */ int drm_syncobj_get_handle(struct drm_file *file_private, struct drm_syncobj *syncobj, u32 *handle) { int ret; /* take a reference to put in the idr */ drm_syncobj_get(syncobj); idr_preload(GFP_KERNEL); spin_lock(&file_private->syncobj_table_lock); ret = idr_alloc(&file_private->syncobj_idr, syncobj, 1, 0, GFP_NOWAIT); spin_unlock(&file_private->syncobj_table_lock); idr_preload_end(); if (ret < 0) { drm_syncobj_put(syncobj); return ret; } *handle = ret; return 0; } EXPORT_SYMBOL(drm_syncobj_get_handle); static int drm_syncobj_create_as_handle(struct drm_file *file_private, u32 *handle, uint32_t flags) { int ret; struct drm_syncobj *syncobj; ret = drm_syncobj_create(&syncobj, flags, NULL); if (ret) return ret; ret = drm_syncobj_get_handle(file_private, syncobj, handle); drm_syncobj_put(syncobj); return ret; } static int drm_syncobj_destroy(struct drm_file *file_private, u32 handle) { struct drm_syncobj *syncobj; spin_lock(&file_private->syncobj_table_lock); syncobj = idr_remove(&file_private->syncobj_idr, handle); spin_unlock(&file_private->syncobj_table_lock); if (!syncobj) return -EINVAL; drm_syncobj_put(syncobj); return 0; } static int drm_syncobj_file_release(struct inode *inode, struct file *file) { struct drm_syncobj *syncobj = file->private_data; drm_syncobj_put(syncobj); return 0; } static const struct file_operations drm_syncobj_file_fops = { .release = drm_syncobj_file_release, }; /** * drm_syncobj_get_fd - get a file descriptor from a syncobj * @syncobj: Sync object to export * @p_fd: out parameter with the new file descriptor * * Exports a sync object created with drm_syncobj_create() as a file descriptor. * * Returns 0 on success or a negative error value on failure. */ int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd) { struct file *file; int fd; fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) return fd; file = anon_inode_getfile("syncobj_file", &drm_syncobj_file_fops, syncobj, 0); if (IS_ERR(file)) { put_unused_fd(fd); return PTR_ERR(file); } drm_syncobj_get(syncobj); fd_install(fd, file); *p_fd = fd; return 0; } EXPORT_SYMBOL(drm_syncobj_get_fd); static int drm_syncobj_handle_to_fd(struct drm_file *file_private, u32 handle, int *p_fd) { struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle); int ret; if (!syncobj) return -EINVAL; ret = drm_syncobj_get_fd(syncobj, p_fd); drm_syncobj_put(syncobj); return ret; } static int drm_syncobj_fd_to_handle(struct drm_file *file_private, int fd, u32 *handle) { struct drm_syncobj *syncobj; CLASS(fd, f)(fd); int ret; if (fd_empty(f)) return -EINVAL; if (fd_file(f)->f_op != &drm_syncobj_file_fops) return -EINVAL; /* take a reference to put in the idr */ syncobj = fd_file(f)->private_data; drm_syncobj_get(syncobj); idr_preload(GFP_KERNEL); spin_lock(&file_private->syncobj_table_lock); ret = idr_alloc(&file_private->syncobj_idr, syncobj, 1, 0, GFP_NOWAIT); spin_unlock(&file_private->syncobj_table_lock); idr_preload_end(); if (ret > 0) { *handle = ret; ret = 0; } else drm_syncobj_put(syncobj); return ret; } static int drm_syncobj_import_sync_file_fence(struct drm_file *file_private, int fd, int handle, u64 point) { struct dma_fence *fence = sync_file_get_fence(fd); struct drm_syncobj *syncobj; if (!fence) return -EINVAL; syncobj = drm_syncobj_find(file_private, handle); if (!syncobj) { dma_fence_put(fence); return -ENOENT; } if (point) { struct dma_fence_chain *chain = dma_fence_chain_alloc(); if (!chain) return -ENOMEM; drm_syncobj_add_point(syncobj, 
chain, fence, point); } else { drm_syncobj_replace_fence(syncobj, fence); } dma_fence_put(fence); drm_syncobj_put(syncobj); return 0; } static int drm_syncobj_export_sync_file(struct drm_file *file_private, int handle, u64 point, int *p_fd) { int ret; struct dma_fence *fence; struct sync_file *sync_file; int fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) return fd; ret = drm_syncobj_find_fence(file_private, handle, point, 0, &fence); if (ret) goto err_put_fd; sync_file = sync_file_create(fence); dma_fence_put(fence); if (!sync_file) { ret = -EINVAL; goto err_put_fd; } fd_install(fd, sync_file->file); *p_fd = fd; return 0; err_put_fd: put_unused_fd(fd); return ret; } /** * drm_syncobj_open - initializes syncobj file-private structures at devnode open time * @file_private: drm file-private structure to set up * * Called at device open time, sets up the structure for handling refcounting * of sync objects. */ void drm_syncobj_open(struct drm_file *file_private) { idr_init_base(&file_private->syncobj_idr, 1); spin_lock_init(&file_private->syncobj_table_lock); } static int drm_syncobj_release_handle(int id, void *ptr, void *data) { struct drm_syncobj *syncobj = ptr; drm_syncobj_put(syncobj); return 0; } /** * drm_syncobj_release - release file-private sync object resources * @file_private: drm file-private structure to clean up * * Called at close time when the filp is going away. * * Releases any remaining references on objects by this filp. */ void drm_syncobj_release(struct drm_file *file_private) { idr_for_each(&file_private->syncobj_idr, &drm_syncobj_release_handle, file_private); idr_destroy(&file_private->syncobj_idr); } int drm_syncobj_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_create *args = data; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -EOPNOTSUPP; /* no valid flags yet */ if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED) return -EINVAL; return drm_syncobj_create_as_handle(file_private, &args->handle, args->flags); } int drm_syncobj_destroy_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_destroy *args = data; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -EOPNOTSUPP; /* make sure padding is empty */ if (args->pad) return -EINVAL; return drm_syncobj_destroy(file_private, args->handle); } int drm_syncobj_handle_to_fd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_handle *args = data; unsigned int valid_flags = DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_TIMELINE | DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE; u64 point = 0; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -EOPNOTSUPP; if (args->pad) return -EINVAL; if (args->flags & ~valid_flags) return -EINVAL; if (args->flags & DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_TIMELINE) point = args->point; if (args->flags & DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE) return drm_syncobj_export_sync_file(file_private, args->handle, point, &args->fd); if (args->point) return -EINVAL; return drm_syncobj_handle_to_fd(file_private, args->handle, &args->fd); } int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_handle *args = data; unsigned int valid_flags = DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_TIMELINE | DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE; u64 point = 0; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -EOPNOTSUPP; if (args->pad) return -EINVAL; if (args->flags & ~valid_flags) return -EINVAL; if 
(args->flags & DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_TIMELINE) point = args->point; if (args->flags & DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE) return drm_syncobj_import_sync_file_fence(file_private, args->fd, args->handle, point); if (args->point) return -EINVAL; return drm_syncobj_fd_to_handle(file_private, args->fd, &args->handle); } static int drm_syncobj_transfer_to_timeline(struct drm_file *file_private, struct drm_syncobj_transfer *args) { struct drm_syncobj *timeline_syncobj = NULL; struct dma_fence *fence, *tmp; struct dma_fence_chain *chain; int ret; timeline_syncobj = drm_syncobj_find(file_private, args->dst_handle); if (!timeline_syncobj) { return -ENOENT; } ret = drm_syncobj_find_fence(file_private, args->src_handle, args->src_point, args->flags, &tmp); if (ret) goto err_put_timeline; fence = dma_fence_unwrap_merge(tmp); dma_fence_put(tmp); if (!fence) { ret = -ENOMEM; goto err_put_timeline; } chain = dma_fence_chain_alloc(); if (!chain) { ret = -ENOMEM; goto err_free_fence; } drm_syncobj_add_point(timeline_syncobj, chain, fence, args->dst_point); err_free_fence: dma_fence_put(fence); err_put_timeline: drm_syncobj_put(timeline_syncobj); return ret; } static int drm_syncobj_transfer_to_binary(struct drm_file *file_private, struct drm_syncobj_transfer *args) { struct drm_syncobj *binary_syncobj = NULL; struct dma_fence *fence; int ret; binary_syncobj = drm_syncobj_find(file_private, args->dst_handle); if (!binary_syncobj) return -ENOENT; ret = drm_syncobj_find_fence(file_private, args->src_handle, args->src_point, args->flags, &fence); if (ret) goto err; drm_syncobj_replace_fence(binary_syncobj, fence); dma_fence_put(fence); err: drm_syncobj_put(binary_syncobj); return ret; } int drm_syncobj_transfer_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_transfer *args = data; int ret; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) return -EOPNOTSUPP; if (args->pad) return -EINVAL; if (args->dst_point) ret = drm_syncobj_transfer_to_timeline(file_private, args); else ret = drm_syncobj_transfer_to_binary(file_private, args); return ret; } static void syncobj_wait_fence_func(struct dma_fence *fence, struct dma_fence_cb *cb) { struct syncobj_wait_entry *wait = container_of(cb, struct syncobj_wait_entry, fence_cb); wake_up_process(wait->task); } static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj, struct syncobj_wait_entry *wait) { struct dma_fence *fence; /* This happens inside the syncobj lock */ fence = rcu_dereference_protected(syncobj->fence, lockdep_is_held(&syncobj->lock)); dma_fence_get(fence); if (!fence || dma_fence_chain_find_seqno(&fence, wait->point)) { dma_fence_put(fence); return; } else if (!fence) { wait->fence = dma_fence_get_stub(); } else { wait->fence = fence; } wake_up_process(wait->task); list_del_init(&wait->node); } static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs, void __user *user_points, uint32_t count, uint32_t flags, signed long timeout, uint32_t *idx, ktime_t *deadline) { struct syncobj_wait_entry *entries; struct dma_fence *fence; uint64_t *points; uint32_t signaled_count, i; if (flags & (DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE)) { might_sleep(); lockdep_assert_none_held_once(); } points = kmalloc_array(count, sizeof(*points), GFP_KERNEL); if (points == NULL) return -ENOMEM; if (!user_points) { memset(points, 0, count * sizeof(uint64_t)); } else if (copy_from_user(points, user_points, sizeof(uint64_t) * count)) { 
timeout = -EFAULT; goto err_free_points; } entries = kcalloc(count, sizeof(*entries), GFP_KERNEL); if (!entries) { timeout = -ENOMEM; goto err_free_points; } /* Walk the list of sync objects and initialize entries. We do * this up-front so that we can properly return -EINVAL if there is * a syncobj with a missing fence and then never have the chance of * returning -EINVAL again. */ signaled_count = 0; for (i = 0; i < count; ++i) { struct dma_fence *fence; entries[i].task = current; entries[i].point = points[i]; fence = drm_syncobj_fence_get(syncobjs[i]); if (!fence || dma_fence_chain_find_seqno(&fence, points[i])) { dma_fence_put(fence); if (flags & (DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE)) { continue; } else { timeout = -EINVAL; goto cleanup_entries; } } if (fence) entries[i].fence = fence; else entries[i].fence = dma_fence_get_stub(); if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE) || dma_fence_is_signaled(entries[i].fence)) { if (signaled_count == 0 && idx) *idx = i; signaled_count++; } } if (signaled_count == count || (signaled_count > 0 && !(flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL))) goto cleanup_entries; /* There's a very annoying laxness in the dma_fence API here, in * that backends are not required to automatically report when a * fence is signaled prior to fence->ops->enable_signaling() being * called. So here if we fail to match signaled_count, we need to * fallthough and try a 0 timeout wait! */ if (flags & (DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE)) { for (i = 0; i < count; ++i) drm_syncobj_fence_add_wait(syncobjs[i], &entries[i]); } if (deadline) { for (i = 0; i < count; ++i) { fence = entries[i].fence; if (!fence) continue; dma_fence_set_deadline(fence, *deadline); } } do { set_current_state(TASK_INTERRUPTIBLE); signaled_count = 0; for (i = 0; i < count; ++i) { fence = entries[i].fence; if (!fence) continue; if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE) || dma_fence_is_signaled(fence) || (!entries[i].fence_cb.func && dma_fence_add_callback(fence, &entries[i].fence_cb, syncobj_wait_fence_func))) { /* The fence has been signaled */ if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) { signaled_count++; } else { if (idx) *idx = i; goto done_waiting; } } } if (signaled_count == count) goto done_waiting; if (timeout == 0) { timeout = -ETIME; goto done_waiting; } if (signal_pending(current)) { timeout = -ERESTARTSYS; goto done_waiting; } timeout = schedule_timeout(timeout); } while (1); done_waiting: __set_current_state(TASK_RUNNING); cleanup_entries: for (i = 0; i < count; ++i) { drm_syncobj_remove_wait(syncobjs[i], &entries[i]); if (entries[i].fence_cb.func) dma_fence_remove_callback(entries[i].fence, &entries[i].fence_cb); dma_fence_put(entries[i].fence); } kfree(entries); err_free_points: kfree(points); return timeout; } /** * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value * * @timeout_nsec: timeout nsec component in ns, 0 for poll * * Calculate the timeout in jiffies from an absolute time in sec/nsec. 
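 *
 * For instance (illustrative only), a caller that wants to wait roughly one
 * second from now would pass an absolute value such as::
 *
 *     int64_t timeout_nsec = ktime_to_ns(ktime_add_ns(ktime_get(), NSEC_PER_SEC));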
*/ signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec) { ktime_t abs_timeout, now; u64 timeout_ns, timeout_jiffies64; /* make 0 timeout means poll - absolute 0 doesn't seem valid */ if (timeout_nsec == 0) return 0; abs_timeout = ns_to_ktime(timeout_nsec); now = ktime_get(); if (!ktime_after(abs_timeout, now)) return 0; timeout_ns = ktime_to_ns(ktime_sub(abs_timeout, now)); timeout_jiffies64 = nsecs_to_jiffies64(timeout_ns); /* clamp timeout to avoid infinite timeout */ if (timeout_jiffies64 >= MAX_SCHEDULE_TIMEOUT - 1) return MAX_SCHEDULE_TIMEOUT - 1; return timeout_jiffies64 + 1; } EXPORT_SYMBOL(drm_timeout_abs_to_jiffies); static int drm_syncobj_array_wait(struct drm_device *dev, struct drm_file *file_private, struct drm_syncobj_wait *wait, struct drm_syncobj_timeline_wait *timeline_wait, struct drm_syncobj **syncobjs, bool timeline, ktime_t *deadline) { signed long timeout = 0; uint32_t first = ~0; if (!timeline) { timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec); timeout = drm_syncobj_array_wait_timeout(syncobjs, NULL, wait->count_handles, wait->flags, timeout, &first, deadline); if (timeout < 0) return timeout; wait->first_signaled = first; } else { timeout = drm_timeout_abs_to_jiffies(timeline_wait->timeout_nsec); timeout = drm_syncobj_array_wait_timeout(syncobjs, u64_to_user_ptr(timeline_wait->points), timeline_wait->count_handles, timeline_wait->flags, timeout, &first, deadline); if (timeout < 0) return timeout; timeline_wait->first_signaled = first; } return 0; } static int drm_syncobj_array_find(struct drm_file *file_private, void __user *user_handles, uint32_t count_handles, struct drm_syncobj ***syncobjs_out) { uint32_t i, *handles; struct drm_syncobj **syncobjs; int ret; handles = kmalloc_array(count_handles, sizeof(*handles), GFP_KERNEL); if (handles == NULL) return -ENOMEM; if (copy_from_user(handles, user_handles, sizeof(uint32_t) * count_handles)) { ret = -EFAULT; goto err_free_handles; } syncobjs = kmalloc_array(count_handles, sizeof(*syncobjs), GFP_KERNEL); if (syncobjs == NULL) { ret = -ENOMEM; goto err_free_handles; } for (i = 0; i < count_handles; i++) { syncobjs[i] = drm_syncobj_find(file_private, handles[i]); if (!syncobjs[i]) { ret = -ENOENT; goto err_put_syncobjs; } } kfree(handles); *syncobjs_out = syncobjs; return 0; err_put_syncobjs: while (i-- > 0) drm_syncobj_put(syncobjs[i]); kfree(syncobjs); err_free_handles: kfree(handles); return ret; } static void drm_syncobj_array_free(struct drm_syncobj **syncobjs, uint32_t count) { uint32_t i; for (i = 0; i < count; i++) drm_syncobj_put(syncobjs[i]); kfree(syncobjs); } int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_wait *args = data; struct drm_syncobj **syncobjs; unsigned int possible_flags; ktime_t t, *tp = NULL; int ret = 0; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -EOPNOTSUPP; possible_flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE; if (args->flags & ~possible_flags) return -EINVAL; if (args->count_handles == 0) return 0; ret = drm_syncobj_array_find(file_private, u64_to_user_ptr(args->handles), args->count_handles, &syncobjs); if (ret < 0) return ret; if (args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE) { t = ns_to_ktime(args->deadline_nsec); tp = &t; } ret = drm_syncobj_array_wait(dev, file_private, args, NULL, syncobjs, false, tp); drm_syncobj_array_free(syncobjs, args->count_handles); return ret; } int 
drm_syncobj_timeline_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_timeline_wait *args = data; struct drm_syncobj **syncobjs; unsigned int possible_flags; ktime_t t, *tp = NULL; int ret = 0; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) return -EOPNOTSUPP; possible_flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE | DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE; if (args->flags & ~possible_flags) return -EINVAL; if (args->count_handles == 0) return 0; ret = drm_syncobj_array_find(file_private, u64_to_user_ptr(args->handles), args->count_handles, &syncobjs); if (ret < 0) return ret; if (args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE) { t = ns_to_ktime(args->deadline_nsec); tp = &t; } ret = drm_syncobj_array_wait(dev, file_private, NULL, args, syncobjs, true, tp); drm_syncobj_array_free(syncobjs, args->count_handles); return ret; } static void syncobj_eventfd_entry_fence_func(struct dma_fence *fence, struct dma_fence_cb *cb) { struct syncobj_eventfd_entry *entry = container_of(cb, struct syncobj_eventfd_entry, fence_cb); eventfd_signal(entry->ev_fd_ctx); syncobj_eventfd_entry_free(entry); } static void syncobj_eventfd_entry_func(struct drm_syncobj *syncobj, struct syncobj_eventfd_entry *entry) { int ret; struct dma_fence *fence; /* This happens inside the syncobj lock */ fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, 1)); if (!fence) return; ret = dma_fence_chain_find_seqno(&fence, entry->point); if (ret != 0) { /* The given seqno has not been submitted yet. */ dma_fence_put(fence); return; } else if (!fence) { /* If dma_fence_chain_find_seqno returns 0 but sets the fence * to NULL, it implies that the given seqno is signaled and a * later seqno has already been submitted. Assign a stub fence * so that the eventfd still gets signaled below. 
*/ fence = dma_fence_get_stub(); } list_del_init(&entry->node); entry->fence = fence; if (entry->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE) { eventfd_signal(entry->ev_fd_ctx); syncobj_eventfd_entry_free(entry); } else { ret = dma_fence_add_callback(fence, &entry->fence_cb, syncobj_eventfd_entry_fence_func); if (ret == -ENOENT) { eventfd_signal(entry->ev_fd_ctx); syncobj_eventfd_entry_free(entry); } } } int drm_syncobj_eventfd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_eventfd *args = data; struct drm_syncobj *syncobj; struct eventfd_ctx *ev_fd_ctx; struct syncobj_eventfd_entry *entry; int ret; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) return -EOPNOTSUPP; if (args->flags & ~DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE) return -EINVAL; if (args->pad) return -EINVAL; syncobj = drm_syncobj_find(file_private, args->handle); if (!syncobj) return -ENOENT; ev_fd_ctx = eventfd_ctx_fdget(args->fd); if (IS_ERR(ev_fd_ctx)) { ret = PTR_ERR(ev_fd_ctx); goto err_fdget; } entry = kzalloc(sizeof(*entry), GFP_KERNEL); if (!entry) { ret = -ENOMEM; goto err_kzalloc; } entry->syncobj = syncobj; entry->ev_fd_ctx = ev_fd_ctx; entry->point = args->point; entry->flags = args->flags; drm_syncobj_add_eventfd(syncobj, entry); drm_syncobj_put(syncobj); return 0; err_kzalloc: eventfd_ctx_put(ev_fd_ctx); err_fdget: drm_syncobj_put(syncobj); return ret; } int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_array *args = data; struct drm_syncobj **syncobjs; uint32_t i; int ret; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -EOPNOTSUPP; if (args->pad != 0) return -EINVAL; if (args->count_handles == 0) return -EINVAL; ret = drm_syncobj_array_find(file_private, u64_to_user_ptr(args->handles), args->count_handles, &syncobjs); if (ret < 0) return ret; for (i = 0; i < args->count_handles; i++) drm_syncobj_replace_fence(syncobjs[i], NULL); drm_syncobj_array_free(syncobjs, args->count_handles); return 0; } int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_array *args = data; struct drm_syncobj **syncobjs; uint32_t i; int ret; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -EOPNOTSUPP; if (args->pad != 0) return -EINVAL; if (args->count_handles == 0) return -EINVAL; ret = drm_syncobj_array_find(file_private, u64_to_user_ptr(args->handles), args->count_handles, &syncobjs); if (ret < 0) return ret; for (i = 0; i < args->count_handles; i++) { ret = drm_syncobj_assign_null_handle(syncobjs[i]); if (ret < 0) break; } drm_syncobj_array_free(syncobjs, args->count_handles); return ret; } int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_timeline_array *args = data; struct drm_syncobj **syncobjs; struct dma_fence_chain **chains; uint64_t *points; uint32_t i, j; int ret; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) return -EOPNOTSUPP; if (args->flags != 0) return -EINVAL; if (args->count_handles == 0) return -EINVAL; ret = drm_syncobj_array_find(file_private, u64_to_user_ptr(args->handles), args->count_handles, &syncobjs); if (ret < 0) return ret; points = kmalloc_array(args->count_handles, sizeof(*points), GFP_KERNEL); if (!points) { ret = -ENOMEM; goto out; } if (!u64_to_user_ptr(args->points)) { memset(points, 0, args->count_handles * sizeof(uint64_t)); } else if (copy_from_user(points, u64_to_user_ptr(args->points), sizeof(uint64_t) * 
args->count_handles)) { ret = -EFAULT; goto err_points; } chains = kmalloc_array(args->count_handles, sizeof(void *), GFP_KERNEL); if (!chains) { ret = -ENOMEM; goto err_points; } for (i = 0; i < args->count_handles; i++) { chains[i] = dma_fence_chain_alloc(); if (!chains[i]) { for (j = 0; j < i; j++) dma_fence_chain_free(chains[j]); ret = -ENOMEM; goto err_chains; } } for (i = 0; i < args->count_handles; i++) { struct dma_fence *fence = dma_fence_get_stub(); drm_syncobj_add_point(syncobjs[i], chains[i], fence, points[i]); dma_fence_put(fence); } err_chains: kfree(chains); err_points: kfree(points); out: drm_syncobj_array_free(syncobjs, args->count_handles); return ret; } int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private) { struct drm_syncobj_timeline_array *args = data; struct drm_syncobj **syncobjs; uint64_t __user *points = u64_to_user_ptr(args->points); uint32_t i; int ret; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) return -EOPNOTSUPP; if (args->flags & ~DRM_SYNCOBJ_QUERY_FLAGS_LAST_SUBMITTED) return -EINVAL; if (args->count_handles == 0) return -EINVAL; ret = drm_syncobj_array_find(file_private, u64_to_user_ptr(args->handles), args->count_handles, &syncobjs); if (ret < 0) return ret; for (i = 0; i < args->count_handles; i++) { struct dma_fence_chain *chain; struct dma_fence *fence; uint64_t point; fence = drm_syncobj_fence_get(syncobjs[i]); chain = to_dma_fence_chain(fence); if (chain) { struct dma_fence *iter, *last_signaled = dma_fence_get(fence); if (args->flags & DRM_SYNCOBJ_QUERY_FLAGS_LAST_SUBMITTED) { point = fence->seqno; } else { dma_fence_chain_for_each(iter, fence) { if (iter->context != fence->context) { dma_fence_put(iter); /* It is most likely that the timeline has * unordered points. */ break; } dma_fence_put(last_signaled); last_signaled = dma_fence_get(iter); } point = dma_fence_is_signaled(last_signaled) ? last_signaled->seqno : to_dma_fence_chain(last_signaled)->prev_seqno; } dma_fence_put(last_signaled); } else { point = 0; } dma_fence_put(fence); ret = copy_to_user(&points[i], &point, sizeof(uint64_t)); ret = ret ? -EFAULT : 0; if (ret) break; } drm_syncobj_array_free(syncobjs, args->count_handles); return ret; }
// SPDX-License-Identifier: GPL-2.0-or-later /* * CDC Ethernet based networking peripherals * Copyright (C) 2003-2005 by David Brownell * Copyright (C) 2006 by Ole Andre Vadla Ravnas (ActiveSync) */ // #define DEBUG // error path messages, extra info // #define VERBOSE // more; success messages #include <linux/module.h> #include <linux/netdevice.h> #include <linux/etherdevice.h> #include <linux/ethtool.h> #include <linux/workqueue.h> #include <linux/mii.h> #include <linux/usb.h> #include <linux/usb/cdc.h> #include <linux/usb/usbnet.h> #if IS_ENABLED(CONFIG_USB_NET_RNDIS_HOST) static int is_rndis(struct usb_interface_descriptor *desc) { return (desc->bInterfaceClass == USB_CLASS_COMM && desc->bInterfaceSubClass == 2 && desc->bInterfaceProtocol == 0xff); } static int is_activesync(struct usb_interface_descriptor *desc) { return (desc->bInterfaceClass == USB_CLASS_MISC && desc->bInterfaceSubClass == 1 && desc->bInterfaceProtocol == 1); } static int is_wireless_rndis(struct usb_interface_descriptor *desc) { return (desc->bInterfaceClass == USB_CLASS_WIRELESS_CONTROLLER && desc->bInterfaceSubClass == 1 && desc->bInterfaceProtocol == 3); } static int is_novatel_rndis(struct usb_interface_descriptor *desc) { return (desc->bInterfaceClass == USB_CLASS_MISC && desc->bInterfaceSubClass == 4 && desc->bInterfaceProtocol == 1); } #else #define is_rndis(desc) 0 #define is_activesync(desc) 0 #define is_wireless_rndis(desc) 0 #define is_novatel_rndis(desc) 0 #endif static const u8 mbm_guid[16] = { 0xa3, 0x17, 0xa8, 0x8b, 0x04, 0x5e, 0x4f, 0x01, 0xa6, 0x07, 0xc0, 0xff, 0xcb, 0x7e, 0x39, 0x2a, }; void usbnet_cdc_update_filter(struct usbnet *dev) { struct net_device *net = dev->net; u16 cdc_filter = USB_CDC_PACKET_TYPE_DIRECTED | USB_CDC_PACKET_TYPE_BROADCAST; /* filtering on the device is an optional feature and not worth * the hassle so we just roughly care about snooping and if any * multicast is requested, we take every multicast */ if (net->flags & IFF_PROMISC) cdc_filter |= USB_CDC_PACKET_TYPE_PROMISCUOUS; if (!netdev_mc_empty(net) || (net->flags & IFF_ALLMULTI)) cdc_filter |= USB_CDC_PACKET_TYPE_ALL_MULTICAST; usb_control_msg(dev->udev, usb_sndctrlpipe(dev->udev, 0), USB_CDC_SET_ETHERNET_PACKET_FILTER, USB_TYPE_CLASS | USB_RECIP_INTERFACE, cdc_filter, dev->intf->cur_altsetting->desc.bInterfaceNumber, NULL, 0, USB_CTRL_SET_TIMEOUT ); } EXPORT_SYMBOL_GPL(usbnet_cdc_update_filter); /* We need to override usbnet_*_link_ksettings in bind() */ static const struct ethtool_ops cdc_ether_ethtool_ops = { .get_link = usbnet_get_link, .nway_reset = usbnet_nway_reset, .get_drvinfo = usbnet_get_drvinfo, .get_msglevel = usbnet_get_msglevel, .set_msglevel = usbnet_set_msglevel, .get_ts_info = ethtool_op_get_ts_info, .get_link_ksettings = usbnet_get_link_ksettings_internal, .set_link_ksettings = NULL, }; /* probes control interface, claims data interface, collects the bulk * endpoints, activates data interface (if needed), maybe sets MTU. * all pure cdc, except for certain firmware workarounds, and knowing * that rndis uses one different rule.
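 *
 * A hedged sketch of how a usbnet minidriver typically hooks this up (the
 * struct below is illustrative, not one of the real driver_info tables)::
 *
 *     static const struct driver_info example_cdc_info = {
 *             .description = "CDC Ethernet Device",
 *             .flags       = FLAG_ETHER | FLAG_POINTTOPOINT,
 *             .bind        = usbnet_ether_cdc_bind,
 *             .set_rx_mode = usbnet_cdc_update_filter,
 *     };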
*/ int usbnet_generic_cdc_bind(struct usbnet *dev, struct usb_interface *intf) { u8 *buf = intf->cur_altsetting->extra; int len = intf->cur_altsetting->extralen; struct usb_interface_descriptor *d; struct cdc_state *info = (void *) &dev->data; int status; int rndis; bool android_rndis_quirk = false; struct usb_driver *driver = driver_of(intf); struct usb_cdc_parsed_header header; if (sizeof(dev->data) < sizeof(*info)) return -EDOM; /* expect strict spec conformance for the descriptors, but * cope with firmware which stores them in the wrong place */ if (len == 0 && dev->udev->actconfig->extralen) { /* Motorola SB4100 (and others: Brad Hards says it's * from a Broadcom design) put CDC descriptors here */ buf = dev->udev->actconfig->extra; len = dev->udev->actconfig->extralen; dev_dbg(&intf->dev, "CDC descriptors on config\n"); } /* Maybe CDC descriptors are after the endpoint? This bug has * been seen on some 2Wire Inc RNDIS-ish products. */ if (len == 0) { struct usb_host_endpoint *hep; hep = intf->cur_altsetting->endpoint; if (hep) { buf = hep->extra; len = hep->extralen; } if (len) dev_dbg(&intf->dev, "CDC descriptors on endpoint\n"); } /* this assumes that if there's a non-RNDIS vendor variant * of cdc-acm, it'll fail RNDIS requests cleanly. */ rndis = (is_rndis(&intf->cur_altsetting->desc) || is_activesync(&intf->cur_altsetting->desc) || is_wireless_rndis(&intf->cur_altsetting->desc) || is_novatel_rndis(&intf->cur_altsetting->desc)); memset(info, 0, sizeof(*info)); info->control = intf; cdc_parse_cdc_header(&header, intf, buf, len); info->u = header.usb_cdc_union_desc; info->header = header.usb_cdc_header_desc; info->ether = header.usb_cdc_ether_desc; if (!info->u) { if (rndis) goto skip; else /* in that case a quirk is mandatory */ goto bad_desc; } /* we need a master/control interface (what we're * probed with) and a slave/data interface; union * descriptors sort this all out. */ info->control = usb_ifnum_to_if(dev->udev, info->u->bMasterInterface0); info->data = usb_ifnum_to_if(dev->udev, info->u->bSlaveInterface0); if (!info->control || !info->data) { dev_dbg(&intf->dev, "master #%u/%p slave #%u/%p\n", info->u->bMasterInterface0, info->control, info->u->bSlaveInterface0, info->data); /* fall back to hard-wiring for RNDIS */ if (rndis) { android_rndis_quirk = true; goto skip; } goto bad_desc; } if (info->control != intf) { dev_dbg(&intf->dev, "bogus CDC Union\n"); /* Ambit USB Cable Modem (and maybe others) * interchanges master and slave interface. */ if (info->data == intf) { info->data = info->control; info->control = intf; } else goto bad_desc; } /* some devices merge these - skip class check */ if (info->control == info->data) goto skip; /* a data interface altsetting does the real i/o */ d = &info->data->cur_altsetting->desc; if (d->bInterfaceClass != USB_CLASS_CDC_DATA) { dev_dbg(&intf->dev, "slave class %u\n", d->bInterfaceClass); goto bad_desc; } skip: /* Communication class functions with bmCapabilities are not * RNDIS. But some Wireless class RNDIS functions use * bmCapabilities for their own purpose. The failsafe is * therefore applied only to Communication class RNDIS * functions. The rndis test is redundant, but a cheap * optimization. 
*/ if (rndis && is_rndis(&intf->cur_altsetting->desc) && header.usb_cdc_acm_descriptor && header.usb_cdc_acm_descriptor->bmCapabilities) { dev_dbg(&intf->dev, "ACM capabilities %02x, not really RNDIS?\n", header.usb_cdc_acm_descriptor->bmCapabilities); goto bad_desc; } if (header.usb_cdc_ether_desc && info->ether->wMaxSegmentSize) { dev->hard_mtu = le16_to_cpu(info->ether->wMaxSegmentSize); /* because of Zaurus, we may be ignoring the host * side link address we were given. */ } if (header.usb_cdc_mdlm_desc && memcmp(header.usb_cdc_mdlm_desc->bGUID, mbm_guid, 16)) { dev_dbg(&intf->dev, "GUID doesn't match\n"); goto bad_desc; } if (header.usb_cdc_mdlm_detail_desc && header.usb_cdc_mdlm_detail_desc->bLength < (sizeof(struct usb_cdc_mdlm_detail_desc) + 1)) { dev_dbg(&intf->dev, "Descriptor too short\n"); goto bad_desc; } /* Microsoft ActiveSync based and some regular RNDIS devices lack the * CDC descriptors, so we'll hard-wire the interfaces and not check * for descriptors. * * Some Android RNDIS devices have a CDC Union descriptor pointing * to non-existing interfaces. Ignore that and attempt the same * hard-wired 0 and 1 interfaces. */ if (rndis && (!info->u || android_rndis_quirk)) { info->control = usb_ifnum_to_if(dev->udev, 0); info->data = usb_ifnum_to_if(dev->udev, 1); if (!info->control || !info->data || info->control != intf) { dev_dbg(&intf->dev, "rndis: master #0/%p slave #1/%p\n", info->control, info->data); goto bad_desc; } } else if (!info->header || (!rndis && !info->ether)) { dev_dbg(&intf->dev, "missing cdc %s%s%sdescriptor\n", info->header ? "" : "header ", info->u ? "" : "union ", info->ether ? "" : "ether "); goto bad_desc; } /* claim data interface and set it up ... with side effects. * network traffic can't flow until an altsetting is enabled. */ if (info->data != info->control) { status = usb_driver_claim_interface(driver, info->data, dev); if (status < 0) return status; } status = usbnet_get_endpoints(dev, info->data); if (status < 0) { /* ensure immediate exit from usbnet_disconnect */ usb_set_intfdata(info->data, NULL); if (info->data != info->control) usb_driver_release_interface(driver, info->data); return status; } /* status endpoint: optional for CDC Ethernet, not RNDIS (or ACM) */ if (info->data != info->control) dev->status = NULL; if (info->control->cur_altsetting->desc.bNumEndpoints == 1) { struct usb_endpoint_descriptor *desc; dev->status = &info->control->cur_altsetting->endpoint[0]; desc = &dev->status->desc; if (!usb_endpoint_is_int_in(desc) || (le16_to_cpu(desc->wMaxPacketSize) < sizeof(struct usb_cdc_notification)) || !desc->bInterval) { dev_dbg(&intf->dev, "bad notification endpoint\n"); dev->status = NULL; } } if (rndis && !dev->status) { dev_dbg(&intf->dev, "missing RNDIS status endpoint\n"); usb_set_intfdata(info->data, NULL); usb_driver_release_interface(driver, info->data); return -ENODEV; } /* override ethtool_ops */ dev->net->ethtool_ops = &cdc_ether_ethtool_ops; return 0; bad_desc: dev_info(&dev->udev->dev, "bad CDC descriptors\n"); return -ENODEV; } EXPORT_SYMBOL_GPL(usbnet_generic_cdc_bind); /* like usbnet_generic_cdc_bind() but handles filter initialization * correctly */ int usbnet_ether_cdc_bind(struct usbnet *dev, struct usb_interface *intf) { int rv; rv = usbnet_generic_cdc_bind(dev, intf); if (rv < 0) goto bail_out; /* Some devices don't initialise properly. In particular * the packet filter is not reset. There are devices that * don't do reset all the way. So the packet filter should * be set to a sane initial value. 
*/ usbnet_cdc_update_filter(dev); bail_out: return rv; } EXPORT_SYMBOL_GPL(usbnet_ether_cdc_bind); void usbnet_cdc_unbind(struct usbnet *dev, struct usb_interface *intf) { struct cdc_state *info = (void *) &dev->data; struct usb_driver *driver = driver_of(intf); /* combined interface - nothing to do */ if (info->data == info->control) return; /* disconnect master --> disconnect slave */ if (intf == info->control && info->data) { /* ensure immediate exit from usbnet_disconnect */ usb_set_intfdata(info->data, NULL); usb_driver_release_interface(driver, info->data); info->data = NULL; } /* and vice versa (just in case) */ else if (intf == info->data && info->control) { /* ensure immediate exit from usbnet_disconnect */ usb_set_intfdata(info->control, NULL); usb_driver_release_interface(driver, info->control); info->control = NULL; } } EXPORT_SYMBOL_GPL(usbnet_cdc_unbind); /* Communications Device Class, Ethernet Control model * * Takes two interfaces. The DATA interface is inactive till an altsetting * is selected. Configuration data includes class descriptors. There's * an optional status endpoint on the control interface. * * This should interop with whatever the 2.4 "CDCEther.c" driver * (by Brad Hards) talked with, with more functionality. */ static void speed_change(struct usbnet *dev, __le32 *speeds) { dev->tx_speed = __le32_to_cpu(speeds[0]); dev->rx_speed = __le32_to_cpu(speeds[1]); } void usbnet_cdc_status(struct usbnet *dev, struct urb *urb) { struct usb_cdc_notification *event; if (urb->actual_length < sizeof(*event)) return; /* SPEED_CHANGE can get split into two 8-byte packets */ if (test_and_clear_bit(EVENT_STS_SPLIT, &dev->flags)) { speed_change(dev, (__le32 *) urb->transfer_buffer); return; } event = urb->transfer_buffer; switch (event->bNotificationType) { case USB_CDC_NOTIFY_NETWORK_CONNECTION: netif_dbg(dev, timer, dev->net, "CDC: carrier %s\n", event->wValue ? "on" : "off"); if (netif_carrier_ok(dev->net) != !!event->wValue) usbnet_link_change(dev, !!event->wValue, 0); break; case USB_CDC_NOTIFY_SPEED_CHANGE: /* tx/rx rates */ netif_dbg(dev, timer, dev->net, "CDC: speed change (len %d)\n", urb->actual_length); if (urb->actual_length != (sizeof(*event) + 8)) set_bit(EVENT_STS_SPLIT, &dev->flags); else speed_change(dev, (__le32 *) &event[1]); break; /* USB_CDC_NOTIFY_RESPONSE_AVAILABLE can happen too (e.g. RNDIS), * but there are no standard formats for the response data. 
*/ default: netdev_err(dev->net, "CDC: unexpected notification %02x!\n", event->bNotificationType); break; } } EXPORT_SYMBOL_GPL(usbnet_cdc_status); int usbnet_cdc_bind(struct usbnet *dev, struct usb_interface *intf) { int status; struct cdc_state *info = (void *) &dev->data; BUILD_BUG_ON((sizeof(((struct usbnet *)0)->data) < sizeof(struct cdc_state))); status = usbnet_ether_cdc_bind(dev, intf); if (status < 0) return status; status = usbnet_get_ethernet_addr(dev, info->ether->iMACAddress); if (status < 0) { usb_set_intfdata(info->data, NULL); usb_driver_release_interface(driver_of(intf), info->data); return status; } return 0; } EXPORT_SYMBOL_GPL(usbnet_cdc_bind); static int usbnet_cdc_zte_bind(struct usbnet *dev, struct usb_interface *intf) { int status = usbnet_cdc_bind(dev, intf); if (!status && (dev->net->dev_addr[0] & 0x02)) eth_hw_addr_random(dev->net); return status; } /* Make sure packets have correct destination MAC address * * A firmware bug observed on some devices (ZTE MF823/831/910) is that the * device sends packets with a static, bogus, random MAC address (event if * device MAC address has been updated). Always set MAC address to that of the * device. */ int usbnet_cdc_zte_rx_fixup(struct usbnet *dev, struct sk_buff *skb) { if (skb->len < ETH_HLEN || !(skb->data[0] & 0x02)) return 1; skb_reset_mac_header(skb); ether_addr_copy(eth_hdr(skb)->h_dest, dev->net->dev_addr); return 1; } EXPORT_SYMBOL_GPL(usbnet_cdc_zte_rx_fixup); /* Ensure correct link state * * Some devices (ZTE MF823/831/910) export two carrier on notifications when * connected. This causes the link state to be incorrect. Work around this by * always setting the state to off, then on. */ static void usbnet_cdc_zte_status(struct usbnet *dev, struct urb *urb) { struct usb_cdc_notification *event; if (urb->actual_length < sizeof(*event)) return; event = urb->transfer_buffer; if (event->bNotificationType != USB_CDC_NOTIFY_NETWORK_CONNECTION) { usbnet_cdc_status(dev, urb); return; } netif_dbg(dev, timer, dev->net, "CDC: carrier %s\n", event->wValue ? 
"on" : "off"); if (event->wValue && netif_carrier_ok(dev->net)) netif_carrier_off(dev->net); usbnet_link_change(dev, !!event->wValue, 0); } static const struct driver_info cdc_info = { .description = "CDC Ethernet Device", .flags = FLAG_ETHER | FLAG_POINTTOPOINT, .bind = usbnet_cdc_bind, .unbind = usbnet_cdc_unbind, .status = usbnet_cdc_status, .set_rx_mode = usbnet_cdc_update_filter, .manage_power = usbnet_manage_power, }; static const struct driver_info zte_cdc_info = { .description = "ZTE CDC Ethernet Device", .flags = FLAG_ETHER | FLAG_POINTTOPOINT, .bind = usbnet_cdc_zte_bind, .unbind = usbnet_cdc_unbind, .status = usbnet_cdc_zte_status, .set_rx_mode = usbnet_cdc_update_filter, .manage_power = usbnet_manage_power, .rx_fixup = usbnet_cdc_zte_rx_fixup, }; static const struct driver_info wwan_info = { .description = "Mobile Broadband Network Device", .flags = FLAG_WWAN, .bind = usbnet_cdc_bind, .unbind = usbnet_cdc_unbind, .status = usbnet_cdc_status, .set_rx_mode = usbnet_cdc_update_filter, .manage_power = usbnet_manage_power, }; /*-------------------------------------------------------------------------*/ #define HUAWEI_VENDOR_ID 0x12D1 #define NOVATEL_VENDOR_ID 0x1410 #define ZTE_VENDOR_ID 0x19D2 #define DELL_VENDOR_ID 0x413C #define REALTEK_VENDOR_ID 0x0bda #define SAMSUNG_VENDOR_ID 0x04e8 #define LENOVO_VENDOR_ID 0x17ef #define LINKSYS_VENDOR_ID 0x13b1 #define NVIDIA_VENDOR_ID 0x0955 #define HP_VENDOR_ID 0x03f0 #define MICROSOFT_VENDOR_ID 0x045e #define UBLOX_VENDOR_ID 0x1546 #define TPLINK_VENDOR_ID 0x2357 #define AQUANTIA_VENDOR_ID 0x2eca #define ASIX_VENDOR_ID 0x0b95 static const struct usb_device_id products[] = { /* BLACKLIST !! * * First blacklist any products that are egregiously nonconformant * with the CDC Ethernet specs. Minor braindamage we cope with; when * they're not even trying, needing a separate driver is only the first * of the differences to show up. */ #define ZAURUS_MASTER_INTERFACE \ .bInterfaceClass = USB_CLASS_COMM, \ .bInterfaceSubClass = USB_CDC_SUBCLASS_ETHERNET, \ .bInterfaceProtocol = USB_CDC_PROTO_NONE #define ZAURUS_FAKE_INTERFACE \ .bInterfaceClass = USB_CLASS_COMM, \ .bInterfaceSubClass = USB_CDC_SUBCLASS_MDLM, \ .bInterfaceProtocol = USB_CDC_PROTO_NONE /* SA-1100 based Sharp Zaurus ("collie"), or compatible; * wire-incompatible with true CDC Ethernet implementations. * (And, it seems, needlessly so...) */ { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x8004, ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, /* PXA-25x based Sharp Zaurii. Note that it seems some of these * (later models especially) may have shipped only with firmware * advertising false "CDC MDLM" compatibility ... but we're not * clear which models did that, so for now let's assume the worst. 
*/ { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x8005, /* A-300 */ ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x8005, /* A-300 */ ZAURUS_FAKE_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x8006, /* B-500/SL-5600 */ ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x8006, /* B-500/SL-5600 */ ZAURUS_FAKE_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x8007, /* C-700 */ ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x8007, /* C-700 */ ZAURUS_FAKE_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x9031, /* C-750 C-760 */ ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x9032, /* SL-6000 */ ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, .idProduct = 0x9032, /* SL-6000 */ ZAURUS_FAKE_INTERFACE, .driver_info = 0, }, { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x04DD, /* reported with some C860 units */ .idProduct = 0x9050, /* C-860 */ ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, /* Olympus has some models with a Zaurus-compatible option. 
* R-1000 uses a FreeScale i.MXL cpu (ARMv4T) */ { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_DEVICE, .idVendor = 0x07B4, .idProduct = 0x0F02, /* R-1000 */ ZAURUS_MASTER_INTERFACE, .driver_info = 0, }, /* LG Electronics VL600 wants additional headers on every frame */ { USB_DEVICE_AND_INTERFACE_INFO(0x1004, 0x61aa, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Logitech Harmony 900 - uses the pseudo-MDLM (BLAN) driver */ { USB_DEVICE_AND_INTERFACE_INFO(0x046d, 0xc11f, USB_CLASS_COMM, USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Novatel USB551L and MC551 - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(NOVATEL_VENDOR_ID, 0xB001, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Novatel E362 - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(NOVATEL_VENDOR_ID, 0x9010, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Dell Wireless 5800 (Novatel E362) - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x8195, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Dell Wireless 5800 (Novatel E362) - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x8196, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Dell Wireless 5804 (Novatel E371) - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x819b, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Novatel Expedite E371 - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(NOVATEL_VENDOR_ID, 0x9011, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* HP lt2523 (Novatel E371) - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(HP_VENDOR_ID, 0x421d, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* AnyDATA ADU960S - handled by qmi_wwan */ { USB_DEVICE_AND_INTERFACE_INFO(0x16d5, 0x650a, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Huawei E1820 - handled by qmi_wwan */ { USB_DEVICE_INTERFACE_NUMBER(HUAWEI_VENDOR_ID, 0x14ac, 1), .driver_info = 0, }, /* Realtek RTL8153 Based USB 3.0 Ethernet Adapters */ { USB_DEVICE_AND_INTERFACE_INFO(REALTEK_VENDOR_ID, 0x8153, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Lenovo Powered USB-C Travel Hub (4X90S92381, based on Realtek RTL8153) */ { USB_DEVICE_AND_INTERFACE_INFO(LENOVO_VENDOR_ID, 0x721e, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Lenovo ThinkPad Hybrid USB-C with USB-A Dock (40af0135eu, based on Realtek RTL8153) */ { USB_DEVICE_AND_INTERFACE_INFO(LENOVO_VENDOR_ID, 0xa359, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* Aquantia AQtion USB to 5GbE Controller (based on AQC111U) */ { USB_DEVICE_AND_INTERFACE_INFO(AQUANTIA_VENDOR_ID, 0xc101, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* ASIX USB 3.1 Gen1 to 5G Multi-Gigabit Ethernet Adapter(based on AQC111U) */ { USB_DEVICE_AND_INTERFACE_INFO(ASIX_VENDOR_ID, 0x2790, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* ASIX USB 3.1 Gen1 to 2.5G Multi-Gigabit Ethernet Adapter(based on AQC112U) */ { USB_DEVICE_AND_INTERFACE_INFO(ASIX_VENDOR_ID, 0x2791, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, 
USB_CDC_PROTO_NONE), .driver_info = 0, }, /* USB-C 3.1 to 5GBASE-T Ethernet Adapter (based on AQC111U) */ { USB_DEVICE_AND_INTERFACE_INFO(0x20f4, 0xe05a, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* QNAP QNA-UC5G1T USB to 5GbE Adapter (based on AQC111U) */ { USB_DEVICE_AND_INTERFACE_INFO(0x1c04, 0x0015, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = 0, }, /* WHITELIST!!! * * CDC Ether uses two interfaces, not necessarily consecutive. * We match the main interface, ignoring the optional device * class so we could handle devices that aren't exclusively * CDC ether. * * NOTE: this match must come AFTER entries blacklisting devices * because of bugs/quirks in a given product (like Zaurus, above). */ { /* ZTE (Vodafone) K3805-Z */ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1003, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* ZTE (Vodafone) K3806-Z */ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1015, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* ZTE (Vodafone) K4510-Z */ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1173, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* ZTE (Vodafone) K3770-Z */ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1177, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* ZTE (Vodafone) K3772-Z */ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x1181, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* Telit modules */ USB_VENDOR_AND_INTERFACE_INFO(0x1bc7, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (kernel_ulong_t) &wwan_info, }, { /* Dell DW5580 modules */ USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x81ba, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (kernel_ulong_t)&wwan_info, }, { /* Huawei ME906 and ME909 */ USB_DEVICE_AND_INTERFACE_INFO(HUAWEI_VENDOR_ID, 0x15c1, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* ZTE modules */ USB_VENDOR_AND_INTERFACE_INFO(ZTE_VENDOR_ID, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&zte_cdc_info, }, { /* U-blox TOBY-L2 */ USB_DEVICE_AND_INTERFACE_INFO(UBLOX_VENDOR_ID, 0x1143, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* U-blox SARA-U2 */ USB_DEVICE_AND_INTERFACE_INFO(UBLOX_VENDOR_ID, 0x1104, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* U-blox LARA-R6 01B */ USB_DEVICE_AND_INTERFACE_INFO(UBLOX_VENDOR_ID, 0x1313, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* U-blox LARA-L6 */ USB_DEVICE_AND_INTERFACE_INFO(UBLOX_VENDOR_ID, 0x1343, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* Cinterion PLS8 modem by GEMALTO */ USB_DEVICE_AND_INTERFACE_INFO(0x1e2d, 0x0061, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* Cinterion AHS3 modem by GEMALTO */ USB_DEVICE_AND_INTERFACE_INFO(0x1e2d, 0x0055, USB_CLASS_COMM, 
USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* Cinterion PLS62-W modem by GEMALTO/THALES */ USB_DEVICE_AND_INTERFACE_INFO(0x1e2d, 0x005b, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* Cinterion PLS83/PLS63 modem by GEMALTO/THALES */ USB_DEVICE_AND_INTERFACE_INFO(0x1e2d, 0x0069, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { USB_INTERFACE_INFO(USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE), .driver_info = (unsigned long) &cdc_info, }, { USB_INTERFACE_INFO(USB_CLASS_COMM, USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)&wwan_info, }, { /* Various Huawei modems with a network port like the UMG1831 */ USB_VENDOR_AND_INTERFACE_INFO(HUAWEI_VENDOR_ID, USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET, 255), .driver_info = (unsigned long)&wwan_info, }, { }, /* END */ }; MODULE_DEVICE_TABLE(usb, products); static struct usb_driver cdc_driver = { .name = "cdc_ether", .id_table = products, .probe = usbnet_probe, .disconnect = usbnet_disconnect, .suspend = usbnet_suspend, .resume = usbnet_resume, .reset_resume = usbnet_resume, .supports_autosuspend = 1, .disable_hub_initiated_lpm = 1, }; module_usb_driver(cdc_driver); MODULE_AUTHOR("David Brownell"); MODULE_DESCRIPTION("USB CDC Ethernet devices"); MODULE_LICENSE("GPL"); |
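/*
 * Illustrative sketch only -- not part of cdc_ether.c.  It shows how another
 * usbnet minidriver could reuse the helpers exported above (usbnet_cdc_bind,
 * usbnet_cdc_unbind, usbnet_cdc_status, usbnet_cdc_update_filter) for a plain
 * CDC Ethernet function, mirroring the cdc_info/products pattern.  The
 * example_* names and the 0x1234/0x5678 vendor/product IDs are hypothetical.
 */
#include <linux/module.h>
#include <linux/usb.h>
#include <linux/usb/cdc.h>
#include <linux/usb/usbnet.h>

static const struct driver_info example_cdc_info = {
	.description	= "Example CDC Ethernet Device",
	.flags		= FLAG_ETHER | FLAG_POINTTOPOINT,
	.bind		= usbnet_cdc_bind,		/* exported by cdc_ether */
	.unbind		= usbnet_cdc_unbind,
	.status		= usbnet_cdc_status,
	.set_rx_mode	= usbnet_cdc_update_filter,
	.manage_power	= usbnet_manage_power,
};

static const struct usb_device_id example_products[] = {
	{	/* match the CDC control interface of one hypothetical device */
		USB_DEVICE_AND_INTERFACE_INFO(0x1234, 0x5678, USB_CLASS_COMM,
					      USB_CDC_SUBCLASS_ETHERNET,
					      USB_CDC_PROTO_NONE),
		.driver_info = (unsigned long)&example_cdc_info,
	},
	{ },	/* END */
};
MODULE_DEVICE_TABLE(usb, example_products);

static struct usb_driver example_cdc_driver = {
	.name		= "example_cdc",
	.id_table	= example_products,
	.probe		= usbnet_probe,
	.disconnect	= usbnet_disconnect,
	.suspend	= usbnet_suspend,
	.resume		= usbnet_resume,
};
module_usb_driver(example_cdc_driver);

MODULE_LICENSE("GPL");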
/* SPDX-License-Identifier: GPL-2.0 */ #ifndef _LINUX_USER_NAMESPACE_H #define _LINUX_USER_NAMESPACE_H #include <linux/kref.h> #include <linux/nsproxy.h> #include <linux/ns_common.h> #include <linux/rculist_nulls.h> #include <linux/sched.h> #include <linux/workqueue.h> #include <linux/rcuref.h> #include <linux/rwsem.h> #include <linux/sysctl.h> #include <linux/err.h> #define UID_GID_MAP_MAX_BASE_EXTENTS 5 #define UID_GID_MAP_MAX_EXTENTS 340 struct uid_gid_extent { u32 first; u32 lower_first; u32 count; }; struct uid_gid_map { /* 64 bytes -- 1 cache line */ u32 nr_extents; union { struct uid_gid_extent extent[UID_GID_MAP_MAX_BASE_EXTENTS]; struct { struct uid_gid_extent *forward; struct uid_gid_extent *reverse; }; }; }; #define USERNS_SETGROUPS_ALLOWED 1UL #define USERNS_INIT_FLAGS USERNS_SETGROUPS_ALLOWED struct ucounts; enum ucount_type { UCOUNT_USER_NAMESPACES, UCOUNT_PID_NAMESPACES, UCOUNT_UTS_NAMESPACES, UCOUNT_IPC_NAMESPACES, UCOUNT_NET_NAMESPACES, UCOUNT_MNT_NAMESPACES, UCOUNT_CGROUP_NAMESPACES, UCOUNT_TIME_NAMESPACES, #ifdef CONFIG_INOTIFY_USER UCOUNT_INOTIFY_INSTANCES, UCOUNT_INOTIFY_WATCHES, #endif #ifdef CONFIG_FANOTIFY UCOUNT_FANOTIFY_GROUPS, UCOUNT_FANOTIFY_MARKS, #endif UCOUNT_COUNTS, }; enum rlimit_type { UCOUNT_RLIMIT_NPROC, UCOUNT_RLIMIT_MSGQUEUE, UCOUNT_RLIMIT_SIGPENDING, UCOUNT_RLIMIT_MEMLOCK, UCOUNT_RLIMIT_COUNTS, }; #if IS_ENABLED(CONFIG_BINFMT_MISC) struct binfmt_misc; #endif struct user_namespace { struct uid_gid_map uid_map; struct uid_gid_map gid_map; struct uid_gid_map projid_map; struct user_namespace *parent; int level; kuid_t owner; kgid_t group; struct ns_common ns; unsigned long flags; /* parent_could_setfcap: true if the creator of this ns had CAP_SETFCAP * in its effective capability set at the child ns creation time. */ bool parent_could_setfcap; #ifdef CONFIG_KEYS /* List of joinable keyrings in this namespace. Modification access of * these pointers is controlled by keyring_sem. Once * user_keyring_register is set, it won't be changed, so it can be * accessed directly with READ_ONCE().
*/ struct list_head keyring_name_list; struct key *user_keyring_register; struct rw_semaphore keyring_sem; #endif /* Register of per-UID persistent keyrings for this namespace */ #ifdef CONFIG_PERSISTENT_KEYRINGS struct key *persistent_keyring_register; #endif struct work_struct work; #ifdef CONFIG_SYSCTL struct ctl_table_set set; struct ctl_table_header *sysctls; #endif struct ucounts *ucounts; long ucount_max[UCOUNT_COUNTS]; long rlimit_max[UCOUNT_RLIMIT_COUNTS]; #if IS_ENABLED(CONFIG_BINFMT_MISC) struct binfmt_misc *binfmt_misc; #endif } __randomize_layout; struct ucounts { struct hlist_nulls_node node; struct user_namespace *ns; kuid_t uid; struct rcu_head rcu; rcuref_t count; atomic_long_t ucount[UCOUNT_COUNTS]; atomic_long_t rlimit[UCOUNT_RLIMIT_COUNTS]; }; extern struct user_namespace init_user_ns; extern struct ucounts init_ucounts; bool setup_userns_sysctls(struct user_namespace *ns); void retire_userns_sysctls(struct user_namespace *ns); struct ucounts *inc_ucount(struct user_namespace *ns, kuid_t uid, enum ucount_type type); void dec_ucount(struct ucounts *ucounts, enum ucount_type type); struct ucounts *alloc_ucounts(struct user_namespace *ns, kuid_t uid); void put_ucounts(struct ucounts *ucounts); static inline struct ucounts * __must_check get_ucounts(struct ucounts *ucounts) { if (rcuref_get(&ucounts->count)) return ucounts; return NULL; } static inline long get_rlimit_value(struct ucounts *ucounts, enum rlimit_type type) { return atomic_long_read(&ucounts->rlimit[type]); } long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v); bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v); long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type, bool override_rlimit); void dec_rlimit_put_ucounts(struct ucounts *ucounts, enum rlimit_type type); bool is_rlimit_overlimit(struct ucounts *ucounts, enum rlimit_type type, unsigned long max); static inline long get_userns_rlimit_max(struct user_namespace *ns, enum rlimit_type type) { return READ_ONCE(ns->rlimit_max[type]); } static inline void set_userns_rlimit_max(struct user_namespace *ns, enum rlimit_type type, unsigned long max) { ns->rlimit_max[type] = max <= LONG_MAX ? 
max : LONG_MAX; } #ifdef CONFIG_USER_NS static inline struct user_namespace *get_user_ns(struct user_namespace *ns) { if (ns) refcount_inc(&ns->ns.count); return ns; } extern int create_user_ns(struct cred *new); extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred); extern void __put_user_ns(struct user_namespace *ns); static inline void put_user_ns(struct user_namespace *ns) { if (ns && refcount_dec_and_test(&ns->ns.count)) __put_user_ns(ns); } struct seq_operations; extern const struct seq_operations proc_uid_seq_operations; extern const struct seq_operations proc_gid_seq_operations; extern const struct seq_operations proc_projid_seq_operations; extern ssize_t proc_uid_map_write(struct file *, const char __user *, size_t, loff_t *); extern ssize_t proc_gid_map_write(struct file *, const char __user *, size_t, loff_t *); extern ssize_t proc_projid_map_write(struct file *, const char __user *, size_t, loff_t *); extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t, loff_t *); extern int proc_setgroups_show(struct seq_file *m, void *v); extern bool userns_may_setgroups(const struct user_namespace *ns); extern bool in_userns(const struct user_namespace *ancestor, const struct user_namespace *child); extern bool current_in_userns(const struct user_namespace *target_ns); struct ns_common *ns_get_owner(struct ns_common *ns); #else static inline struct user_namespace *get_user_ns(struct user_namespace *ns) { return &init_user_ns; } static inline int create_user_ns(struct cred *new) { return -EINVAL; } static inline int unshare_userns(unsigned long unshare_flags, struct cred **new_cred) { if (unshare_flags & CLONE_NEWUSER) return -EINVAL; return 0; } static inline void put_user_ns(struct user_namespace *ns) { } static inline bool userns_may_setgroups(const struct user_namespace *ns) { return true; } static inline bool in_userns(const struct user_namespace *ancestor, const struct user_namespace *child) { return true; } static inline bool current_in_userns(const struct user_namespace *target_ns) { return true; } static inline struct ns_common *ns_get_owner(struct ns_common *ns) { return ERR_PTR(-EPERM); } #endif #endif /* _LINUX_USER_H */ |
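/*
 * Illustrative sketch only -- not part of user_namespace.h.  It shows how the
 * uid_gid_extent fields above are interpreted: each extent maps the id range
 * [first, first + count) inside the namespace onto
 * [lower_first, lower_first + count) in the parent namespace.  The helper
 * name example_map_id_down() is hypothetical; the kernel's real lookup also
 * handles the sorted forward/reverse arrays used once nr_extents exceeds
 * UID_GID_MAP_MAX_BASE_EXTENTS.
 */
#include <linux/user_namespace.h>

static u32 example_map_id_down(const struct uid_gid_map *map, u32 id)
{
	u32 idx;

	/* Only the small, linearly scanned representation is shown here. */
	if (map->nr_extents > UID_GID_MAP_MAX_BASE_EXTENTS)
		return (u32) -1;

	for (idx = 0; idx < map->nr_extents; idx++) {
		const struct uid_gid_extent *e = &map->extent[idx];

		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}

	return (u32) -1;	/* no mapping for this id */
}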
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ #ifndef _BTRFS_CTREE_H_ #define _BTRFS_CTREE_H_ #include <linux/btrfs.h> #include <linux/types.h> #ifdef __KERNEL__ #include <linux/stddef.h> #else #include <stddef.h> #endif /* ASCII for _BHRfS_M, no terminating nul */ #define BTRFS_MAGIC 0x4D5F53665248425FULL #define BTRFS_MAX_LEVEL 8 /* * We can actually store much bigger names, but lets not confuse the rest of * linux. */ #define BTRFS_NAME_LEN 255 /* * Theoretical limit is larger, but we keep this down to a sane value. That * should limit greatly the possibility of collisions on inode ref items. */ #define BTRFS_LINK_MAX 65535U /* * This header contains the structure definitions and constants used * by file system objects that can be retrieved using * the BTRFS_IOC_SEARCH_TREE ioctl. That means basically anything that * is needed to describe a leaf node's key or item contents. */ /* holds pointers to all of the tree roots */ #define BTRFS_ROOT_TREE_OBJECTID 1ULL /* stores information about which extents are in use, and reference counts */ #define BTRFS_EXTENT_TREE_OBJECTID 2ULL /* * chunk tree stores translations from logical -> physical block numbering * the super block points to the chunk tree */ #define BTRFS_CHUNK_TREE_OBJECTID 3ULL /* * stores information about which areas of a given device are in use. * one per device.
The tree of tree roots points to the device tree */ #define BTRFS_DEV_TREE_OBJECTID 4ULL /* one per subvolume, storing files and directories */ #define BTRFS_FS_TREE_OBJECTID 5ULL /* directory objectid inside the root tree */ #define BTRFS_ROOT_TREE_DIR_OBJECTID 6ULL /* holds checksums of all the data extents */ #define BTRFS_CSUM_TREE_OBJECTID 7ULL /* holds quota configuration and tracking */ #define BTRFS_QUOTA_TREE_OBJECTID 8ULL /* for storing items that use the BTRFS_UUID_KEY* types */ #define BTRFS_UUID_TREE_OBJECTID 9ULL /* tracks free space in block groups. */ #define BTRFS_FREE_SPACE_TREE_OBJECTID 10ULL /* Holds the block group items for extent tree v2. */ #define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL /* Tracks RAID stripes in block groups. */ #define BTRFS_RAID_STRIPE_TREE_OBJECTID 12ULL /* device stats in the device tree */ #define BTRFS_DEV_STATS_OBJECTID 0ULL /* for storing balance parameters in the root tree */ #define BTRFS_BALANCE_OBJECTID -4ULL /* orphan objectid for tracking unlinked/truncated files */ #define BTRFS_ORPHAN_OBJECTID -5ULL /* does write ahead logging to speed up fsyncs */ #define BTRFS_TREE_LOG_OBJECTID -6ULL #define BTRFS_TREE_LOG_FIXUP_OBJECTID -7ULL /* for space balancing */ #define BTRFS_TREE_RELOC_OBJECTID -8ULL #define BTRFS_DATA_RELOC_TREE_OBJECTID -9ULL /* * extent checksums all have this objectid * this allows them to share the logging tree * for fsyncs */ #define BTRFS_EXTENT_CSUM_OBJECTID -10ULL /* For storing free space cache */ #define BTRFS_FREE_SPACE_OBJECTID -11ULL /* * The inode number assigned to the special inode for storing * free ino cache */ #define BTRFS_FREE_INO_OBJECTID -12ULL /* dummy objectid represents multiple objectids */ #define BTRFS_MULTIPLE_OBJECTIDS -255ULL /* * All files have objectids in this range. */ #define BTRFS_FIRST_FREE_OBJECTID 256ULL #define BTRFS_LAST_FREE_OBJECTID -256ULL #define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL /* * the device items go into the chunk tree. The key is in the form * [ 1 BTRFS_DEV_ITEM_KEY device_id ] */ #define BTRFS_DEV_ITEMS_OBJECTID 1ULL #define BTRFS_BTREE_INODE_OBJECTID 1 #define BTRFS_EMPTY_SUBVOL_DIR_OBJECTID 2 #define BTRFS_DEV_REPLACE_DEVID 0ULL /* * inode items have the data typically returned from stat and store other * info about object characteristics. There is one for every file and dir in * the FS */ #define BTRFS_INODE_ITEM_KEY 1 #define BTRFS_INODE_REF_KEY 12 #define BTRFS_INODE_EXTREF_KEY 13 #define BTRFS_XATTR_ITEM_KEY 24 /* * fs verity items are stored under two different key types on disk. * The descriptor items: * [ inode objectid, BTRFS_VERITY_DESC_ITEM_KEY, offset ] * * At offset 0, we store a btrfs_verity_descriptor_item which tracks the size * of the descriptor item and some extra data for encryption. * Starting at offset 1, these hold the generic fs verity descriptor. The * latter are opaque to btrfs, we just read and write them as a blob for the * higher level verity code. The most common descriptor size is 256 bytes. * * The merkle tree items: * [ inode objectid, BTRFS_VERITY_MERKLE_ITEM_KEY, offset ] * * These also start at offset 0, and correspond to the merkle tree bytes. When * fsverity asks for page 0 of the merkle tree, we pull up one page starting at * offset 0 for this key type. These are also opaque to btrfs, we're blindly * storing whatever fsverity sends down. 
*/ #define BTRFS_VERITY_DESC_ITEM_KEY 36 #define BTRFS_VERITY_MERKLE_ITEM_KEY 37 #define BTRFS_ORPHAN_ITEM_KEY 48 /* reserve 2-15 close to the inode for later flexibility */ /* * dir items are the name -> inode pointers in a directory. There is one * for every name in a directory. BTRFS_DIR_LOG_ITEM_KEY is no longer used * but it's still defined here for documentation purposes and to help avoid * having its numerical value reused in the future. */ #define BTRFS_DIR_LOG_ITEM_KEY 60 #define BTRFS_DIR_LOG_INDEX_KEY 72 #define BTRFS_DIR_ITEM_KEY 84 #define BTRFS_DIR_INDEX_KEY 96 /* * extent data is for file data */ #define BTRFS_EXTENT_DATA_KEY 108 /* * extent csums are stored in a separate tree and hold csums for * an entire extent on disk. */ #define BTRFS_EXTENT_CSUM_KEY 128 /* * root items point to tree roots. They are typically in the root * tree used by the super block to find all the other trees */ #define BTRFS_ROOT_ITEM_KEY 132 /* * root backrefs tie subvols and snapshots to the directory entries that * reference them */ #define BTRFS_ROOT_BACKREF_KEY 144 /* * root refs make a fast index for listing all of the snapshots and * subvolumes referenced by a given root. They point directly to the * directory item in the root that references the subvol */ #define BTRFS_ROOT_REF_KEY 156 /* * extent items are in the extent map tree. These record which blocks * are used, and how many references there are to each block */ #define BTRFS_EXTENT_ITEM_KEY 168 /* * The same as the BTRFS_EXTENT_ITEM_KEY, except it's metadata we already know * the length, so we save the level in key->offset instead of the length. */ #define BTRFS_METADATA_ITEM_KEY 169 /* * Special inline ref key which stores the id of the subvolume which originally * created the extent. This subvolume owns the extent permanently from the * perspective of simple quotas. Needed to know which subvolume to free quota * usage from when the extent is deleted. * * Stored as an inline ref rather to avoid wasting space on a separate item on * top of the existing extent item. However, unlike the other inline refs, * there is one one owner ref per extent rather than one per extent. * * Because of this, it goes at the front of the list of inline refs, and thus * must have a lower type value than any other inline ref type (to satisfy the * disk format rule that inline refs have non-decreasing type). */ #define BTRFS_EXTENT_OWNER_REF_KEY 172 #define BTRFS_TREE_BLOCK_REF_KEY 176 #define BTRFS_EXTENT_DATA_REF_KEY 178 /* * Obsolete key. Defintion removed in 6.6, value may be reused in the future. * * #define BTRFS_EXTENT_REF_V0_KEY 180 */ #define BTRFS_SHARED_BLOCK_REF_KEY 182 #define BTRFS_SHARED_DATA_REF_KEY 184 /* * block groups give us hints into the extent allocation trees. Which * blocks are free etc etc */ #define BTRFS_BLOCK_GROUP_ITEM_KEY 192 /* * Every block group is represented in the free space tree by a free space info * item, which stores some accounting information. It is keyed on * (block_group_start, FREE_SPACE_INFO, block_group_length). */ #define BTRFS_FREE_SPACE_INFO_KEY 198 /* * A free space extent tracks an extent of space that is free in a block group. * It is keyed on (start, FREE_SPACE_EXTENT, length). */ #define BTRFS_FREE_SPACE_EXTENT_KEY 199 /* * When a block group becomes very fragmented, we convert it to use bitmaps * instead of extents. A free space bitmap is keyed on * (start, FREE_SPACE_BITMAP, length); the corresponding item is a bitmap with * (length / sectorsize) bits. 
*/ #define BTRFS_FREE_SPACE_BITMAP_KEY 200 #define BTRFS_DEV_EXTENT_KEY 204 #define BTRFS_DEV_ITEM_KEY 216 #define BTRFS_CHUNK_ITEM_KEY 228 #define BTRFS_RAID_STRIPE_KEY 230 /* * Records the overall state of the qgroups. * There's only one instance of this key present, * (0, BTRFS_QGROUP_STATUS_KEY, 0) */ #define BTRFS_QGROUP_STATUS_KEY 240 /* * Records the currently used space of the qgroup. * One key per qgroup, (0, BTRFS_QGROUP_INFO_KEY, qgroupid). */ #define BTRFS_QGROUP_INFO_KEY 242 /* * Contains the user configured limits for the qgroup. * One key per qgroup, (0, BTRFS_QGROUP_LIMIT_KEY, qgroupid). */ #define BTRFS_QGROUP_LIMIT_KEY 244 /* * Records the child-parent relationship of qgroups. For * each relation, 2 keys are present: * (childid, BTRFS_QGROUP_RELATION_KEY, parentid) * (parentid, BTRFS_QGROUP_RELATION_KEY, childid) */ #define BTRFS_QGROUP_RELATION_KEY 246 /* * Obsolete name, see BTRFS_TEMPORARY_ITEM_KEY. */ #define BTRFS_BALANCE_ITEM_KEY 248 /* * The key type for tree items that are stored persistently, but do not need to * exist for extended period of time. The items can exist in any tree. * * [subtype, BTRFS_TEMPORARY_ITEM_KEY, data] * * Existing items: * * - balance status item * (BTRFS_BALANCE_OBJECTID, BTRFS_TEMPORARY_ITEM_KEY, 0) */ #define BTRFS_TEMPORARY_ITEM_KEY 248 /* * Obsolete name, see BTRFS_PERSISTENT_ITEM_KEY */ #define BTRFS_DEV_STATS_KEY 249 /* * The key type for tree items that are stored persistently and usually exist * for a long period, eg. filesystem lifetime. The item kinds can be status * information, stats or preference values. The item can exist in any tree. * * [subtype, BTRFS_PERSISTENT_ITEM_KEY, data] * * Existing items: * * - device statistics, store IO stats in the device tree, one key for all * stats * (BTRFS_DEV_STATS_OBJECTID, BTRFS_DEV_STATS_KEY, 0) */ #define BTRFS_PERSISTENT_ITEM_KEY 249 /* * Persistently stores the device replace state in the device tree. * The key is built like this: (0, BTRFS_DEV_REPLACE_KEY, 0). */ #define BTRFS_DEV_REPLACE_KEY 250 /* * Stores items that allow to quickly map UUIDs to something else. * These items are part of the filesystem UUID tree. * The key is built like this: * (UUID_upper_64_bits, BTRFS_UUID_KEY*, UUID_lower_64_bits). */ #if BTRFS_UUID_SIZE != 16 #error "UUID items require BTRFS_UUID_SIZE == 16!" #endif #define BTRFS_UUID_KEY_SUBVOL 251 /* for UUIDs assigned to subvols */ #define BTRFS_UUID_KEY_RECEIVED_SUBVOL 252 /* for UUIDs assigned to * received subvols */ /* * string items are for debugging. They just store a short string of * data in the FS */ #define BTRFS_STRING_ITEM_KEY 253 /* Maximum metadata block size (nodesize) */ #define BTRFS_MAX_METADATA_BLOCKSIZE 65536 /* 32 bytes in various csum fields */ #define BTRFS_CSUM_SIZE 32 /* csum types */ enum btrfs_csum_type { BTRFS_CSUM_TYPE_CRC32 = 0, BTRFS_CSUM_TYPE_XXHASH = 1, BTRFS_CSUM_TYPE_SHA256 = 2, BTRFS_CSUM_TYPE_BLAKE2 = 3, }; /* * flags definitions for directory entry item type * * Used by: * struct btrfs_dir_item.type * * Values 0..7 must match common file type values in fs_types.h. 
*/ #define BTRFS_FT_UNKNOWN 0 #define BTRFS_FT_REG_FILE 1 #define BTRFS_FT_DIR 2 #define BTRFS_FT_CHRDEV 3 #define BTRFS_FT_BLKDEV 4 #define BTRFS_FT_FIFO 5 #define BTRFS_FT_SOCK 6 #define BTRFS_FT_SYMLINK 7 #define BTRFS_FT_XATTR 8 #define BTRFS_FT_MAX 9 /* Directory contains encrypted data */ #define BTRFS_FT_ENCRYPTED 0x80 static inline __u8 btrfs_dir_flags_to_ftype(__u8 flags) { return flags & ~BTRFS_FT_ENCRYPTED; } /* * Inode flags */ #define BTRFS_INODE_NODATASUM (1U << 0) #define BTRFS_INODE_NODATACOW (1U << 1) #define BTRFS_INODE_READONLY (1U << 2) #define BTRFS_INODE_NOCOMPRESS (1U << 3) #define BTRFS_INODE_PREALLOC (1U << 4) #define BTRFS_INODE_SYNC (1U << 5) #define BTRFS_INODE_IMMUTABLE (1U << 6) #define BTRFS_INODE_APPEND (1U << 7) #define BTRFS_INODE_NODUMP (1U << 8) #define BTRFS_INODE_NOATIME (1U << 9) #define BTRFS_INODE_DIRSYNC (1U << 10) #define BTRFS_INODE_COMPRESS (1U << 11) #define BTRFS_INODE_ROOT_ITEM_INIT (1U << 31) #define BTRFS_INODE_FLAG_MASK \ (BTRFS_INODE_NODATASUM | \ BTRFS_INODE_NODATACOW | \ BTRFS_INODE_READONLY | \ BTRFS_INODE_NOCOMPRESS | \ BTRFS_INODE_PREALLOC | \ BTRFS_INODE_SYNC | \ BTRFS_INODE_IMMUTABLE | \ BTRFS_INODE_APPEND | \ BTRFS_INODE_NODUMP | \ BTRFS_INODE_NOATIME | \ BTRFS_INODE_DIRSYNC | \ BTRFS_INODE_COMPRESS | \ BTRFS_INODE_ROOT_ITEM_INIT) #define BTRFS_INODE_RO_VERITY (1U << 0) #define BTRFS_INODE_RO_FLAG_MASK (BTRFS_INODE_RO_VERITY) /* * The key defines the order in the tree, and so it also defines (optimal) * block layout. * * objectid corresponds to the inode number. * * type tells us things about the object, and is a kind of stream selector. * so for a given inode, keys with type of 1 might refer to the inode data, * type of 2 may point to file data in the btree and type == 3 may point to * extents. * * offset is the starting byte offset for this key in the stream. * * btrfs_disk_key is in disk byte order. struct btrfs_key is always * in cpu native order. Otherwise they are identical and their sizes * should be the same (ie both packed) */ struct btrfs_disk_key { __le64 objectid; __u8 type; __le64 offset; } __attribute__ ((__packed__)); struct btrfs_key { __u64 objectid; __u8 type; __u64 offset; } __attribute__ ((__packed__)); /* * Every tree block (leaf or node) starts with this header. */ struct btrfs_header { /* These first four must match the super block */ __u8 csum[BTRFS_CSUM_SIZE]; /* FS specific uuid */ __u8 fsid[BTRFS_FSID_SIZE]; /* Which block this node is supposed to live in */ __le64 bytenr; __le64 flags; /* Allowed to be different from the super from here on down */ __u8 chunk_tree_uuid[BTRFS_UUID_SIZE]; __le64 generation; __le64 owner; __le32 nritems; __u8 level; } __attribute__ ((__packed__)); /* * This is a very generous portion of the super block, giving us room to * translate 14 chunks with 3 stripes each. */ #define BTRFS_SYSTEM_CHUNK_ARRAY_SIZE 2048 /* * Just in case we somehow lose the roots and are not able to mount, we store * an array of the roots from previous transactions in the super. 
*/ #define BTRFS_NUM_BACKUP_ROOTS 4 struct btrfs_root_backup { __le64 tree_root; __le64 tree_root_gen; __le64 chunk_root; __le64 chunk_root_gen; __le64 extent_root; __le64 extent_root_gen; __le64 fs_root; __le64 fs_root_gen; __le64 dev_root; __le64 dev_root_gen; __le64 csum_root; __le64 csum_root_gen; __le64 total_bytes; __le64 bytes_used; __le64 num_devices; /* future */ __le64 unused_64[4]; __u8 tree_root_level; __u8 chunk_root_level; __u8 extent_root_level; __u8 fs_root_level; __u8 dev_root_level; __u8 csum_root_level; /* future and to align */ __u8 unused_8[10]; } __attribute__ ((__packed__)); /* * A leaf is full of items. offset and size tell us where to find the item in * the leaf (relative to the start of the data area) */ struct btrfs_item { struct btrfs_disk_key key; __le32 offset; __le32 size; } __attribute__ ((__packed__)); /* * Leaves have an item area and a data area: * [item0, item1....itemN] [free space] [dataN...data1, data0] * * The data is separate from the items to get the keys closer together during * searches. */ struct btrfs_leaf { struct btrfs_header header; struct btrfs_item items[]; } __attribute__ ((__packed__)); /* * All non-leaf blocks are nodes, they hold only keys and pointers to other * blocks. */ struct btrfs_key_ptr { struct btrfs_disk_key key; __le64 blockptr; __le64 generation; } __attribute__ ((__packed__)); struct btrfs_node { struct btrfs_header header; struct btrfs_key_ptr ptrs[]; } __attribute__ ((__packed__)); struct btrfs_dev_item { /* the internal btrfs device id */ __le64 devid; /* size of the device */ __le64 total_bytes; /* bytes used */ __le64 bytes_used; /* optimal io alignment for this device */ __le32 io_align; /* optimal io width for this device */ __le32 io_width; /* minimal io size for this device */ __le32 sector_size; /* type and info about this device */ __le64 type; /* expected generation for this device */ __le64 generation; /* * starting byte of this partition on the device, * to allow for stripe alignment in the future */ __le64 start_offset; /* grouping information for allocation decisions */ __le32 dev_group; /* seek speed 0-100 where 100 is fastest */ __u8 seek_speed; /* bandwidth 0-100 where 100 is fastest */ __u8 bandwidth; /* btrfs generated uuid for this device */ __u8 uuid[BTRFS_UUID_SIZE]; /* uuid of FS who owns this device */ __u8 fsid[BTRFS_UUID_SIZE]; } __attribute__ ((__packed__)); struct btrfs_stripe { __le64 devid; __le64 offset; __u8 dev_uuid[BTRFS_UUID_SIZE]; } __attribute__ ((__packed__)); struct btrfs_chunk { /* size of this chunk in bytes */ __le64 length; /* objectid of the root referencing this chunk */ __le64 owner; __le64 stripe_len; __le64 type; /* optimal io alignment for this chunk */ __le32 io_align; /* optimal io width for this chunk */ __le32 io_width; /* minimal io size for this chunk */ __le32 sector_size; /* 2^16 stripes is quite a lot, a second limit is the size of a single * item in the btree */ __le16 num_stripes; /* sub stripes only matter for raid10 */ __le16 sub_stripes; struct btrfs_stripe stripe; /* additional stripes go here */ } __attribute__ ((__packed__)); /* * The super block basically lists the main trees of the FS. 
*/ struct btrfs_super_block { /* The first 4 fields must match struct btrfs_header */ __u8 csum[BTRFS_CSUM_SIZE]; /* FS specific UUID, visible to user */ __u8 fsid[BTRFS_FSID_SIZE]; /* This block number */ __le64 bytenr; __le64 flags; /* Allowed to be different from the btrfs_header from here own down */ __le64 magic; __le64 generation; __le64 root; __le64 chunk_root; __le64 log_root; /* * This member has never been utilized since the very beginning, thus * it's always 0 regardless of kernel version. We always use * generation + 1 to read log tree root. So here we mark it deprecated. */ __le64 __unused_log_root_transid; __le64 total_bytes; __le64 bytes_used; __le64 root_dir_objectid; __le64 num_devices; __le32 sectorsize; __le32 nodesize; __le32 __unused_leafsize; __le32 stripesize; __le32 sys_chunk_array_size; __le64 chunk_root_generation; __le64 compat_flags; __le64 compat_ro_flags; __le64 incompat_flags; __le16 csum_type; __u8 root_level; __u8 chunk_root_level; __u8 log_root_level; struct btrfs_dev_item dev_item; char label[BTRFS_LABEL_SIZE]; __le64 cache_generation; __le64 uuid_tree_generation; /* The UUID written into btree blocks */ __u8 metadata_uuid[BTRFS_FSID_SIZE]; __u64 nr_global_roots; /* Future expansion */ __le64 reserved[27]; __u8 sys_chunk_array[BTRFS_SYSTEM_CHUNK_ARRAY_SIZE]; struct btrfs_root_backup super_roots[BTRFS_NUM_BACKUP_ROOTS]; /* Padded to 4096 bytes */ __u8 padding[565]; } __attribute__ ((__packed__)); #define BTRFS_FREE_SPACE_EXTENT 1 #define BTRFS_FREE_SPACE_BITMAP 2 struct btrfs_free_space_entry { __le64 offset; __le64 bytes; __u8 type; } __attribute__ ((__packed__)); struct btrfs_free_space_header { struct btrfs_disk_key location; __le64 generation; __le64 num_entries; __le64 num_bitmaps; } __attribute__ ((__packed__)); struct btrfs_raid_stride { /* The id of device this raid extent lives on. */ __le64 devid; /* The physical location on disk. */ __le64 physical; } __attribute__ ((__packed__)); struct btrfs_stripe_extent { /* An array of raid strides this stripe is composed of. */ __DECLARE_FLEX_ARRAY(struct btrfs_raid_stride, strides); } __attribute__ ((__packed__)); #define BTRFS_HEADER_FLAG_WRITTEN (1ULL << 0) #define BTRFS_HEADER_FLAG_RELOC (1ULL << 1) /* Super block flags */ /* Errors detected */ #define BTRFS_SUPER_FLAG_ERROR (1ULL << 2) #define BTRFS_SUPER_FLAG_SEEDING (1ULL << 32) #define BTRFS_SUPER_FLAG_METADUMP (1ULL << 33) #define BTRFS_SUPER_FLAG_METADUMP_V2 (1ULL << 34) #define BTRFS_SUPER_FLAG_CHANGING_FSID (1ULL << 35) #define BTRFS_SUPER_FLAG_CHANGING_FSID_V2 (1ULL << 36) /* * Those are temporaray flags utilized by btrfs-progs to do offline conversion. * They are rejected by kernel. * But still keep them all here to avoid conflicts. 
*/ #define BTRFS_SUPER_FLAG_CHANGING_BG_TREE (1ULL << 38) #define BTRFS_SUPER_FLAG_CHANGING_DATA_CSUM (1ULL << 39) #define BTRFS_SUPER_FLAG_CHANGING_META_CSUM (1ULL << 40) /* * items in the extent btree are used to record the objectid of the * owner of the block and the number of references */ struct btrfs_extent_item { __le64 refs; __le64 generation; __le64 flags; } __attribute__ ((__packed__)); struct btrfs_extent_item_v0 { __le32 refs; } __attribute__ ((__packed__)); #define BTRFS_EXTENT_FLAG_DATA (1ULL << 0) #define BTRFS_EXTENT_FLAG_TREE_BLOCK (1ULL << 1) /* following flags only apply to tree blocks */ /* use full backrefs for extent pointers in the block */ #define BTRFS_BLOCK_FLAG_FULL_BACKREF (1ULL << 8) #define BTRFS_BACKREF_REV_MAX 256 #define BTRFS_BACKREF_REV_SHIFT 56 #define BTRFS_BACKREF_REV_MASK (((u64)BTRFS_BACKREF_REV_MAX - 1) << \ BTRFS_BACKREF_REV_SHIFT) #define BTRFS_OLD_BACKREF_REV 0 #define BTRFS_MIXED_BACKREF_REV 1 /* * this flag is only used internally by scrub and may be changed at any time * it is only declared here to avoid collisions */ #define BTRFS_EXTENT_FLAG_SUPER (1ULL << 48) struct btrfs_tree_block_info { struct btrfs_disk_key key; __u8 level; } __attribute__ ((__packed__)); struct btrfs_extent_data_ref { __le64 root; __le64 objectid; __le64 offset; __le32 count; } __attribute__ ((__packed__)); struct btrfs_shared_data_ref { __le32 count; } __attribute__ ((__packed__)); struct btrfs_extent_owner_ref { __le64 root_id; } __attribute__ ((__packed__)); struct btrfs_extent_inline_ref { __u8 type; __le64 offset; } __attribute__ ((__packed__)); /* dev extents record free space on individual devices. The owner * field points back to the chunk allocation mapping tree that allocated * the extent. The chunk tree uuid field is a way to double check the owner */ struct btrfs_dev_extent { __le64 chunk_tree; __le64 chunk_objectid; __le64 chunk_offset; __le64 length; __u8 chunk_tree_uuid[BTRFS_UUID_SIZE]; } __attribute__ ((__packed__)); struct btrfs_inode_ref { __le64 index; __le16 name_len; /* name goes here */ } __attribute__ ((__packed__)); struct btrfs_inode_extref { __le64 parent_objectid; __le64 index; __le16 name_len; __u8 name[]; /* name goes here */ } __attribute__ ((__packed__)); struct btrfs_timespec { __le64 sec; __le32 nsec; } __attribute__ ((__packed__)); struct btrfs_inode_item { /* nfs style generation number */ __le64 generation; /* transid that last touched this inode */ __le64 transid; __le64 size; __le64 nbytes; __le64 block_group; __le32 nlink; __le32 uid; __le32 gid; __le32 mode; __le64 rdev; __le64 flags; /* modification sequence number for NFS */ __le64 sequence; /* * a little future expansion, for more than this we can * just grow the inode item and version it */ __le64 reserved[4]; struct btrfs_timespec atime; struct btrfs_timespec ctime; struct btrfs_timespec mtime; struct btrfs_timespec otime; } __attribute__ ((__packed__)); struct btrfs_dir_log_item { __le64 end; } __attribute__ ((__packed__)); struct btrfs_dir_item { struct btrfs_disk_key location; __le64 transid; __le16 data_len; __le16 name_len; __u8 type; } __attribute__ ((__packed__)); #define BTRFS_ROOT_SUBVOL_RDONLY (1ULL << 0) /* * Internal in-memory flag that a subvolume has been marked for deletion but * still visible as a directory */ #define BTRFS_ROOT_SUBVOL_DEAD (1ULL << 48) struct btrfs_root_item { struct btrfs_inode_item inode; __le64 generation; __le64 root_dirid; __le64 bytenr; __le64 byte_limit; __le64 bytes_used; __le64 last_snapshot; __le64 flags; __le32 refs; struct 
btrfs_disk_key drop_progress; __u8 drop_level; __u8 level; /* * The following fields appear after subvol_uuids+subvol_times * were introduced. */ /* * This generation number is used to test if the new fields are valid * and up to date while reading the root item. Every time the root item * is written out, the "generation" field is copied into this field. If * anyone ever mounted the fs with an older kernel, we will have * mismatching generation values here and thus must invalidate the * new fields. See btrfs_update_root and btrfs_find_last_root for * details. * the offset of generation_v2 is also used as the start for the memset * when invalidating the fields. */ __le64 generation_v2; __u8 uuid[BTRFS_UUID_SIZE]; __u8 parent_uuid[BTRFS_UUID_SIZE]; __u8 received_uuid[BTRFS_UUID_SIZE]; __le64 ctransid; /* updated when an inode changes */ __le64 otransid; /* trans when created */ __le64 stransid; /* trans when sent. non-zero for received subvol */ __le64 rtransid; /* trans when received. non-zero for received subvol */ struct btrfs_timespec ctime; struct btrfs_timespec otime; struct btrfs_timespec stime; struct btrfs_timespec rtime; __le64 reserved[8]; /* for future */ } __attribute__ ((__packed__)); /* * Btrfs root item used to be smaller than current size. The old format ends * at where member generation_v2 is. */ static inline __u32 btrfs_legacy_root_item_size(void) { return offsetof(struct btrfs_root_item, generation_v2); } /* * this is used for both forward and backward root refs */ struct btrfs_root_ref { __le64 dirid; __le64 sequence; __le16 name_len; } __attribute__ ((__packed__)); struct btrfs_disk_balance_args { /* * profiles to operate on, single is denoted by * BTRFS_AVAIL_ALLOC_BIT_SINGLE */ __le64 profiles; /* * usage filter * BTRFS_BALANCE_ARGS_USAGE with a single value means '0..N' * BTRFS_BALANCE_ARGS_USAGE_RANGE - range syntax, min..max */ union { __le64 usage; struct { __le32 usage_min; __le32 usage_max; }; }; /* devid filter */ __le64 devid; /* devid subset filter [pstart..pend) */ __le64 pstart; __le64 pend; /* btrfs virtual address space subset filter [vstart..vend) */ __le64 vstart; __le64 vend; /* * profile to convert to, single is denoted by * BTRFS_AVAIL_ALLOC_BIT_SINGLE */ __le64 target; /* BTRFS_BALANCE_ARGS_* */ __le64 flags; /* * BTRFS_BALANCE_ARGS_LIMIT with value 'limit' * BTRFS_BALANCE_ARGS_LIMIT_RANGE - the extend version can use minimum * and maximum */ union { __le64 limit; struct { __le32 limit_min; __le32 limit_max; }; }; /* * Process chunks that cross stripes_min..stripes_max devices, * BTRFS_BALANCE_ARGS_STRIPES_RANGE */ __le32 stripes_min; __le32 stripes_max; __le64 unused[6]; } __attribute__ ((__packed__)); /* * store balance parameters to disk so that balance can be properly * resumed after crash or unmount */ struct btrfs_balance_item { /* BTRFS_BALANCE_* */ __le64 flags; struct btrfs_disk_balance_args data; struct btrfs_disk_balance_args meta; struct btrfs_disk_balance_args sys; __le64 unused[4]; } __attribute__ ((__packed__)); enum { BTRFS_FILE_EXTENT_INLINE = 0, BTRFS_FILE_EXTENT_REG = 1, BTRFS_FILE_EXTENT_PREALLOC = 2, BTRFS_NR_FILE_EXTENT_TYPES = 3, }; struct btrfs_file_extent_item { /* * transaction id that created this extent */ __le64 generation; /* * max number of bytes to hold this extent in ram * when we split a compressed extent we can't know how big * each of the resulting pieces will be. So, this is * an upper limit on the size of the extent in ram instead of * an exact limit. 
*/ __le64 ram_bytes; /* * 32 bits for the various ways we might encode the data, * including compression and encryption. If any of these * are set to something a given disk format doesn't understand * it is treated like an incompat flag for reading and writing, * but not for stat. */ __u8 compression; __u8 encryption; __le16 other_encoding; /* spare for later use */ /* are we inline data or a real extent? */ __u8 type; /* * disk space consumed by the extent, checksum blocks are included * in these numbers * * At this offset in the structure, the inline extent data start. */ __le64 disk_bytenr; __le64 disk_num_bytes; /* * the logical offset in file blocks (no csums) * this extent record is for. This allows a file extent to point * into the middle of an existing extent on disk, sharing it * between two snapshots (useful if some bytes in the middle of the * extent have changed */ __le64 offset; /* * the logical number of file blocks (no csums included). This * always reflects the size uncompressed and without encoding. */ __le64 num_bytes; } __attribute__ ((__packed__)); struct btrfs_csum_item { __u8 csum; } __attribute__ ((__packed__)); struct btrfs_dev_stats_item { /* * grow this item struct at the end for future enhancements and keep * the existing values unchanged */ __le64 values[BTRFS_DEV_STAT_VALUES_MAX]; } __attribute__ ((__packed__)); #define BTRFS_DEV_REPLACE_ITEM_CONT_READING_FROM_SRCDEV_MODE_ALWAYS 0 #define BTRFS_DEV_REPLACE_ITEM_CONT_READING_FROM_SRCDEV_MODE_AVOID 1 struct btrfs_dev_replace_item { /* * grow this item struct at the end for future enhancements and keep * the existing values unchanged */ __le64 src_devid; __le64 cursor_left; __le64 cursor_right; __le64 cont_reading_from_srcdev_mode; __le64 replace_state; __le64 time_started; __le64 time_stopped; __le64 num_write_errors; __le64 num_uncorrectable_read_errors; } __attribute__ ((__packed__)); /* different types of block groups (and chunks) */ #define BTRFS_BLOCK_GROUP_DATA (1ULL << 0) #define BTRFS_BLOCK_GROUP_SYSTEM (1ULL << 1) #define BTRFS_BLOCK_GROUP_METADATA (1ULL << 2) #define BTRFS_BLOCK_GROUP_RAID0 (1ULL << 3) #define BTRFS_BLOCK_GROUP_RAID1 (1ULL << 4) #define BTRFS_BLOCK_GROUP_DUP (1ULL << 5) #define BTRFS_BLOCK_GROUP_RAID10 (1ULL << 6) #define BTRFS_BLOCK_GROUP_RAID5 (1ULL << 7) #define BTRFS_BLOCK_GROUP_RAID6 (1ULL << 8) #define BTRFS_BLOCK_GROUP_RAID1C3 (1ULL << 9) #define BTRFS_BLOCK_GROUP_RAID1C4 (1ULL << 10) #define BTRFS_BLOCK_GROUP_RESERVED (BTRFS_AVAIL_ALLOC_BIT_SINGLE | \ BTRFS_SPACE_INFO_GLOBAL_RSV) #define BTRFS_BLOCK_GROUP_TYPE_MASK (BTRFS_BLOCK_GROUP_DATA | \ BTRFS_BLOCK_GROUP_SYSTEM | \ BTRFS_BLOCK_GROUP_METADATA) #define BTRFS_BLOCK_GROUP_PROFILE_MASK (BTRFS_BLOCK_GROUP_RAID0 | \ BTRFS_BLOCK_GROUP_RAID1 | \ BTRFS_BLOCK_GROUP_RAID1C3 | \ BTRFS_BLOCK_GROUP_RAID1C4 | \ BTRFS_BLOCK_GROUP_RAID5 | \ BTRFS_BLOCK_GROUP_RAID6 | \ BTRFS_BLOCK_GROUP_DUP | \ BTRFS_BLOCK_GROUP_RAID10) #define BTRFS_BLOCK_GROUP_RAID56_MASK (BTRFS_BLOCK_GROUP_RAID5 | \ BTRFS_BLOCK_GROUP_RAID6) #define BTRFS_BLOCK_GROUP_RAID1_MASK (BTRFS_BLOCK_GROUP_RAID1 | \ BTRFS_BLOCK_GROUP_RAID1C3 | \ BTRFS_BLOCK_GROUP_RAID1C4) /* * We need a bit for restriper to be able to tell when chunks of type * SINGLE are available. This "extended" profile format is used in * fs_info->avail_*_alloc_bits (in-memory) and balance item fields * (on-disk). The corresponding on-disk bit in chunk.type is reserved * to avoid remappings between two formats in future. 
*/ #define BTRFS_AVAIL_ALLOC_BIT_SINGLE (1ULL << 48) /* * A fake block group type that is used to communicate global block reserve * size to userspace via the SPACE_INFO ioctl. */ #define BTRFS_SPACE_INFO_GLOBAL_RSV (1ULL << 49) #define BTRFS_EXTENDED_PROFILE_MASK (BTRFS_BLOCK_GROUP_PROFILE_MASK | \ BTRFS_AVAIL_ALLOC_BIT_SINGLE) static inline __u64 chunk_to_extended(__u64 flags) { if ((flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0) flags |= BTRFS_AVAIL_ALLOC_BIT_SINGLE; return flags; } static inline __u64 extended_to_chunk(__u64 flags) { return flags & ~BTRFS_AVAIL_ALLOC_BIT_SINGLE; } struct btrfs_block_group_item { __le64 used; __le64 chunk_objectid; __le64 flags; } __attribute__ ((__packed__)); struct btrfs_free_space_info { __le32 extent_count; __le32 flags; } __attribute__ ((__packed__)); #define BTRFS_FREE_SPACE_USING_BITMAPS (1ULL << 0) #define BTRFS_QGROUP_LEVEL_SHIFT 48 static inline __u16 btrfs_qgroup_level(__u64 qgroupid) { return (__u16)(qgroupid >> BTRFS_QGROUP_LEVEL_SHIFT); } /* * is subvolume quota turned on? */ #define BTRFS_QGROUP_STATUS_FLAG_ON (1ULL << 0) /* * RESCAN is set during the initialization phase */ #define BTRFS_QGROUP_STATUS_FLAG_RESCAN (1ULL << 1) /* * Some qgroup entries are known to be out of date, * either because the configuration has changed in a way that * makes a rescan necessary, or because the fs has been mounted * with a non-qgroup-aware version. * Turning quota off and on again makes it inconsistent, too. */ #define BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT (1ULL << 2) /* * Whether or not this filesystem is using simple quotas. Not exactly the * incompat bit, because we support using simple quotas, disabling it, then * going back to full qgroup quotas. */ #define BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE (1ULL << 3) #define BTRFS_QGROUP_STATUS_FLAGS_MASK (BTRFS_QGROUP_STATUS_FLAG_ON | \ BTRFS_QGROUP_STATUS_FLAG_RESCAN | \ BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT | \ BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE) #define BTRFS_QGROUP_STATUS_VERSION 1 struct btrfs_qgroup_status_item { __le64 version; /* * the generation is updated during every commit. As older * versions of btrfs are not aware of qgroups, it will be * possible to detect inconsistencies by checking the * generation at mount time */ __le64 generation; /* flag definitions see above */ __le64 flags; /* * only used during scanning to record the progress * of the scan. It contains a logical address */ __le64 rescan; /* * The generation when quotas were last enabled. Used by simple quotas to * avoid decrementing when freeing an extent that was written before * enable. * * Set only if flags contain BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE. */ __le64 enable_gen; } __attribute__ ((__packed__)); struct btrfs_qgroup_info_item { __le64 generation; __le64 rfer; __le64 rfer_cmpr; __le64 excl; __le64 excl_cmpr; } __attribute__ ((__packed__)); struct btrfs_qgroup_limit_item { /* * only updated when any of the other values change */ __le64 flags; __le64 max_rfer; __le64 max_excl; __le64 rsv_rfer; __le64 rsv_excl; } __attribute__ ((__packed__)); struct btrfs_verity_descriptor_item { /* Size of the verity descriptor in bytes */ __le64 size; /* * When we implement support for fscrypt, we will need to encrypt the * Merkle tree for encrypted verity files. These 128 bits are for the * eventual storage of an fscrypt initialization vector. */ __le64 reserved[2]; __u8 encryption; } __attribute__ ((__packed__)); #endif /* _BTRFS_CTREE_H_ */
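/*
 * Editor's note: a minimal illustrative sketch, not part of the original
 * header. It shows how two of the on-disk conventions defined above are
 * meant to be used: a qgroup id packs the level into the top 16 bits
 * (BTRFS_QGROUP_LEVEL_SHIFT) with the subvolume/qgroup object id in the
 * low 48 bits, and a chunk item grows by one struct btrfs_stripe for every
 * stripe beyond the first one embedded in struct btrfs_chunk. The
 * "example_" helpers are hypothetical names, not kernel API.
 */
static inline __u64 example_qgroupid_subvolid(__u64 qgroupid)
{
        /* Low 48 bits carry the subvolume/qgroup object id. */
        return qgroupid & ((1ULL << BTRFS_QGROUP_LEVEL_SHIFT) - 1);
}

static inline __u64 example_make_qgroupid(__u16 level, __u64 objectid)
{
        /* Inverse of btrfs_qgroup_level(): combine level and object id. */
        return ((__u64)level << BTRFS_QGROUP_LEVEL_SHIFT) | objectid;
}

static inline __u32 example_chunk_item_size(__u16 num_stripes)
{
        /*
         * struct btrfs_chunk already embeds one struct btrfs_stripe;
         * the remaining stripes follow it directly in the item.
         */
        return sizeof(struct btrfs_chunk) +
               (num_stripes - 1) * sizeof(struct btrfs_stripe);
}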
// SPDX-License-Identifier: GPL-2.0-or-later /* * Roccat Kone[+] driver for Linux * * Copyright (c) 2010 Stefan Achatz <erazor_de@users.sourceforge.net> */ /* * Roccat Kone[+] is an updated/improved version of the Kone with more memory * and functionality and without the non-standard behaviours the Kone had. * KoneXTD has the same capabilities but an updated sensor.
*/ #include <linux/device.h> #include <linux/input.h> #include <linux/hid.h> #include <linux/module.h> #include <linux/slab.h> #include <linux/hid-roccat.h> #include "hid-ids.h" #include "hid-roccat-common.h" #include "hid-roccat-koneplus.h" static uint profile_numbers[5] = {0, 1, 2, 3, 4}; static void koneplus_profile_activated(struct koneplus_device *koneplus, uint new_profile) { koneplus->actual_profile = new_profile; } static int koneplus_send_control(struct usb_device *usb_dev, uint value, enum koneplus_control_requests request) { struct roccat_common2_control control; if ((request == KONEPLUS_CONTROL_REQUEST_PROFILE_SETTINGS || request == KONEPLUS_CONTROL_REQUEST_PROFILE_BUTTONS) && value > 4) return -EINVAL; control.command = ROCCAT_COMMON_COMMAND_CONTROL; control.value = value; control.request = request; return roccat_common2_send_with_status(usb_dev, ROCCAT_COMMON_COMMAND_CONTROL, &control, sizeof(struct roccat_common2_control)); } /* retval is 0-4 on success, < 0 on error */ static int koneplus_get_actual_profile(struct usb_device *usb_dev) { struct koneplus_actual_profile buf; int retval; retval = roccat_common2_receive(usb_dev, KONEPLUS_COMMAND_ACTUAL_PROFILE, &buf, KONEPLUS_SIZE_ACTUAL_PROFILE); return retval ? retval : buf.actual_profile; } static int koneplus_set_actual_profile(struct usb_device *usb_dev, int new_profile) { struct koneplus_actual_profile buf; buf.command = KONEPLUS_COMMAND_ACTUAL_PROFILE; buf.size = KONEPLUS_SIZE_ACTUAL_PROFILE; buf.actual_profile = new_profile; return roccat_common2_send_with_status(usb_dev, KONEPLUS_COMMAND_ACTUAL_PROFILE, &buf, KONEPLUS_SIZE_ACTUAL_PROFILE); } static ssize_t koneplus_sysfs_read(struct file *fp, struct kobject *kobj, char *buf, loff_t off, size_t count, size_t real_size, uint command) { struct device *dev = kobj_to_dev(kobj)->parent->parent; struct koneplus_device *koneplus = hid_get_drvdata(dev_get_drvdata(dev)); struct usb_device *usb_dev = interface_to_usbdev(to_usb_interface(dev)); int retval; if (off >= real_size) return 0; if (off != 0 || count != real_size) return -EINVAL; mutex_lock(&koneplus->koneplus_lock); retval = roccat_common2_receive(usb_dev, command, buf, real_size); mutex_unlock(&koneplus->koneplus_lock); if (retval) return retval; return real_size; } static ssize_t koneplus_sysfs_write(struct file *fp, struct kobject *kobj, void const *buf, loff_t off, size_t count, size_t real_size, uint command) { struct device *dev = kobj_to_dev(kobj)->parent->parent; struct koneplus_device *koneplus = hid_get_drvdata(dev_get_drvdata(dev)); struct usb_device *usb_dev = interface_to_usbdev(to_usb_interface(dev)); int retval; if (off != 0 || count != real_size) return -EINVAL; mutex_lock(&koneplus->koneplus_lock); retval = roccat_common2_send_with_status(usb_dev, command, buf, real_size); mutex_unlock(&koneplus->koneplus_lock); if (retval) return retval; return real_size; } #define KONEPLUS_SYSFS_W(thingy, THINGY) \ static ssize_t koneplus_sysfs_write_ ## thingy(struct file *fp, \ struct kobject *kobj, const struct bin_attribute *attr, \ char *buf, loff_t off, size_t count) \ { \ return koneplus_sysfs_write(fp, kobj, buf, off, count, \ KONEPLUS_SIZE_ ## THINGY, KONEPLUS_COMMAND_ ## THINGY); \ } #define KONEPLUS_SYSFS_R(thingy, THINGY) \ static ssize_t koneplus_sysfs_read_ ## thingy(struct file *fp, \ struct kobject *kobj, const struct bin_attribute *attr, \ char *buf, loff_t off, size_t count) \ { \ return koneplus_sysfs_read(fp, kobj, buf, off, count, \ KONEPLUS_SIZE_ ## THINGY, KONEPLUS_COMMAND_ ## THINGY); \ } #define 
KONEPLUS_SYSFS_RW(thingy, THINGY) \ KONEPLUS_SYSFS_W(thingy, THINGY) \ KONEPLUS_SYSFS_R(thingy, THINGY) #define KONEPLUS_BIN_ATTRIBUTE_RW(thingy, THINGY) \ KONEPLUS_SYSFS_RW(thingy, THINGY); \ static const struct bin_attribute bin_attr_##thingy = { \ .attr = { .name = #thingy, .mode = 0660 }, \ .size = KONEPLUS_SIZE_ ## THINGY, \ .read_new = koneplus_sysfs_read_ ## thingy, \ .write_new = koneplus_sysfs_write_ ## thingy \ } #define KONEPLUS_BIN_ATTRIBUTE_R(thingy, THINGY) \ KONEPLUS_SYSFS_R(thingy, THINGY); \ static const struct bin_attribute bin_attr_##thingy = { \ .attr = { .name = #thingy, .mode = 0440 }, \ .size = KONEPLUS_SIZE_ ## THINGY, \ .read_new = koneplus_sysfs_read_ ## thingy, \ } #define KONEPLUS_BIN_ATTRIBUTE_W(thingy, THINGY) \ KONEPLUS_SYSFS_W(thingy, THINGY); \ static const struct bin_attribute bin_attr_##thingy = { \ .attr = { .name = #thingy, .mode = 0220 }, \ .size = KONEPLUS_SIZE_ ## THINGY, \ .write_new = koneplus_sysfs_write_ ## thingy \ } KONEPLUS_BIN_ATTRIBUTE_W(control, CONTROL); KONEPLUS_BIN_ATTRIBUTE_W(talk, TALK); KONEPLUS_BIN_ATTRIBUTE_W(macro, MACRO); KONEPLUS_BIN_ATTRIBUTE_R(tcu_image, TCU_IMAGE); KONEPLUS_BIN_ATTRIBUTE_RW(info, INFO); KONEPLUS_BIN_ATTRIBUTE_RW(sensor, SENSOR); KONEPLUS_BIN_ATTRIBUTE_RW(tcu, TCU); KONEPLUS_BIN_ATTRIBUTE_RW(profile_settings, PROFILE_SETTINGS); KONEPLUS_BIN_ATTRIBUTE_RW(profile_buttons, PROFILE_BUTTONS); static ssize_t koneplus_sysfs_read_profilex_settings(struct file *fp, struct kobject *kobj, const struct bin_attribute *attr, char *buf, loff_t off, size_t count) { struct device *dev = kobj_to_dev(kobj)->parent->parent; struct usb_device *usb_dev = interface_to_usbdev(to_usb_interface(dev)); ssize_t retval; retval = koneplus_send_control(usb_dev, *(uint *)(attr->private), KONEPLUS_CONTROL_REQUEST_PROFILE_SETTINGS); if (retval) return retval; return koneplus_sysfs_read(fp, kobj, buf, off, count, KONEPLUS_SIZE_PROFILE_SETTINGS, KONEPLUS_COMMAND_PROFILE_SETTINGS); } static ssize_t koneplus_sysfs_read_profilex_buttons(struct file *fp, struct kobject *kobj, const struct bin_attribute *attr, char *buf, loff_t off, size_t count) { struct device *dev = kobj_to_dev(kobj)->parent->parent; struct usb_device *usb_dev = interface_to_usbdev(to_usb_interface(dev)); ssize_t retval; retval = koneplus_send_control(usb_dev, *(uint *)(attr->private), KONEPLUS_CONTROL_REQUEST_PROFILE_BUTTONS); if (retval) return retval; return koneplus_sysfs_read(fp, kobj, buf, off, count, KONEPLUS_SIZE_PROFILE_BUTTONS, KONEPLUS_COMMAND_PROFILE_BUTTONS); } #define PROFILE_ATTR(number) \ static const struct bin_attribute bin_attr_profile##number##_settings = { \ .attr = { .name = "profile" #number "_settings", .mode = 0440 }, \ .size = KONEPLUS_SIZE_PROFILE_SETTINGS, \ .read_new = koneplus_sysfs_read_profilex_settings, \ .private = &profile_numbers[number-1], \ }; \ static const struct bin_attribute bin_attr_profile##number##_buttons = { \ .attr = { .name = "profile" #number "_buttons", .mode = 0440 }, \ .size = KONEPLUS_SIZE_PROFILE_BUTTONS, \ .read_new = koneplus_sysfs_read_profilex_buttons, \ .private = &profile_numbers[number-1], \ }; PROFILE_ATTR(1); PROFILE_ATTR(2); PROFILE_ATTR(3); PROFILE_ATTR(4); PROFILE_ATTR(5); static ssize_t koneplus_sysfs_show_actual_profile(struct device *dev, struct device_attribute *attr, char *buf) { struct koneplus_device *koneplus = hid_get_drvdata(dev_get_drvdata(dev->parent->parent)); return sysfs_emit(buf, "%d\n", koneplus->actual_profile); } static ssize_t koneplus_sysfs_set_actual_profile(struct device *dev, struct 
device_attribute *attr, char const *buf, size_t size) { struct koneplus_device *koneplus; struct usb_device *usb_dev; unsigned long profile; int retval; struct koneplus_roccat_report roccat_report; dev = dev->parent->parent; koneplus = hid_get_drvdata(dev_get_drvdata(dev)); usb_dev = interface_to_usbdev(to_usb_interface(dev)); retval = kstrtoul(buf, 10, &profile); if (retval) return retval; if (profile > 4) return -EINVAL; mutex_lock(&koneplus->koneplus_lock); retval = koneplus_set_actual_profile(usb_dev, profile); if (retval) { mutex_unlock(&koneplus->koneplus_lock); return retval; } koneplus_profile_activated(koneplus, profile); roccat_report.type = KONEPLUS_MOUSE_REPORT_BUTTON_TYPE_PROFILE; roccat_report.data1 = profile + 1; roccat_report.data2 = 0; roccat_report.profile = profile + 1; roccat_report_event(koneplus->chrdev_minor, (uint8_t const *)&roccat_report); mutex_unlock(&koneplus->koneplus_lock); return size; } static DEVICE_ATTR(actual_profile, 0660, koneplus_sysfs_show_actual_profile, koneplus_sysfs_set_actual_profile); static DEVICE_ATTR(startup_profile, 0660, koneplus_sysfs_show_actual_profile, koneplus_sysfs_set_actual_profile); static ssize_t koneplus_sysfs_show_firmware_version(struct device *dev, struct device_attribute *attr, char *buf) { struct koneplus_device *koneplus; struct usb_device *usb_dev; struct koneplus_info info; dev = dev->parent->parent; koneplus = hid_get_drvdata(dev_get_drvdata(dev)); usb_dev = interface_to_usbdev(to_usb_interface(dev)); mutex_lock(&koneplus->koneplus_lock); roccat_common2_receive(usb_dev, KONEPLUS_COMMAND_INFO, &info, KONEPLUS_SIZE_INFO); mutex_unlock(&koneplus->koneplus_lock); return sysfs_emit(buf, "%d\n", info.firmware_version); } static DEVICE_ATTR(firmware_version, 0440, koneplus_sysfs_show_firmware_version, NULL); static struct attribute *koneplus_attrs[] = { &dev_attr_actual_profile.attr, &dev_attr_startup_profile.attr, &dev_attr_firmware_version.attr, NULL, }; static const struct bin_attribute *const koneplus_bin_attributes[] = { &bin_attr_control, &bin_attr_talk, &bin_attr_macro, &bin_attr_tcu_image, &bin_attr_info, &bin_attr_sensor, &bin_attr_tcu, &bin_attr_profile_settings, &bin_attr_profile_buttons, &bin_attr_profile1_settings, &bin_attr_profile2_settings, &bin_attr_profile3_settings, &bin_attr_profile4_settings, &bin_attr_profile5_settings, &bin_attr_profile1_buttons, &bin_attr_profile2_buttons, &bin_attr_profile3_buttons, &bin_attr_profile4_buttons, &bin_attr_profile5_buttons, NULL, }; static const struct attribute_group koneplus_group = { .attrs = koneplus_attrs, .bin_attrs_new = koneplus_bin_attributes, }; static const struct attribute_group *koneplus_groups[] = { &koneplus_group, NULL, }; static const struct class koneplus_class = { .name = "koneplus", .dev_groups = koneplus_groups, }; static int koneplus_init_koneplus_device_struct(struct usb_device *usb_dev, struct koneplus_device *koneplus) { int retval; mutex_init(&koneplus->koneplus_lock); retval = koneplus_get_actual_profile(usb_dev); if (retval < 0) return retval; koneplus_profile_activated(koneplus, retval); return 0; } static int koneplus_init_specials(struct hid_device *hdev) { struct usb_interface *intf = to_usb_interface(hdev->dev.parent); struct usb_device *usb_dev = interface_to_usbdev(intf); struct koneplus_device *koneplus; int retval; if (intf->cur_altsetting->desc.bInterfaceProtocol == USB_INTERFACE_PROTOCOL_MOUSE) { koneplus = kzalloc(sizeof(*koneplus), GFP_KERNEL); if (!koneplus) { hid_err(hdev, "can't alloc device descriptor\n"); return -ENOMEM; } 
hid_set_drvdata(hdev, koneplus); retval = koneplus_init_koneplus_device_struct(usb_dev, koneplus); if (retval) { hid_err(hdev, "couldn't init struct koneplus_device\n"); goto exit_free; } retval = roccat_connect(&koneplus_class, hdev, sizeof(struct koneplus_roccat_report)); if (retval < 0) { hid_err(hdev, "couldn't init char dev\n"); } else { koneplus->chrdev_minor = retval; koneplus->roccat_claimed = 1; } } else { hid_set_drvdata(hdev, NULL); } return 0; exit_free: kfree(koneplus); return retval; } static void koneplus_remove_specials(struct hid_device *hdev) { struct usb_interface *intf = to_usb_interface(hdev->dev.parent); struct koneplus_device *koneplus; if (intf->cur_altsetting->desc.bInterfaceProtocol == USB_INTERFACE_PROTOCOL_MOUSE) { koneplus = hid_get_drvdata(hdev); if (koneplus->roccat_claimed) roccat_disconnect(koneplus->chrdev_minor); kfree(koneplus); } } static int koneplus_probe(struct hid_device *hdev, const struct hid_device_id *id) { int retval; if (!hid_is_usb(hdev)) return -EINVAL; retval = hid_parse(hdev); if (retval) { hid_err(hdev, "parse failed\n"); goto exit; } retval = hid_hw_start(hdev, HID_CONNECT_DEFAULT); if (retval) { hid_err(hdev, "hw start failed\n"); goto exit; } retval = koneplus_init_specials(hdev); if (retval) { hid_err(hdev, "couldn't install mouse\n"); goto exit_stop; } return 0; exit_stop: hid_hw_stop(hdev); exit: return retval; } static void koneplus_remove(struct hid_device *hdev) { koneplus_remove_specials(hdev); hid_hw_stop(hdev); } static void koneplus_keep_values_up_to_date(struct koneplus_device *koneplus, u8 const *data) { struct koneplus_mouse_report_button const *button_report; switch (data[0]) { case KONEPLUS_MOUSE_REPORT_NUMBER_BUTTON: button_report = (struct koneplus_mouse_report_button const *)data; switch (button_report->type) { case KONEPLUS_MOUSE_REPORT_BUTTON_TYPE_PROFILE: koneplus_profile_activated(koneplus, button_report->data1 - 1); break; } break; } } static void koneplus_report_to_chrdev(struct koneplus_device const *koneplus, u8 const *data) { struct koneplus_roccat_report roccat_report; struct koneplus_mouse_report_button const *button_report; if (data[0] != KONEPLUS_MOUSE_REPORT_NUMBER_BUTTON) return; button_report = (struct koneplus_mouse_report_button const *)data; if ((button_report->type == KONEPLUS_MOUSE_REPORT_BUTTON_TYPE_QUICKLAUNCH || button_report->type == KONEPLUS_MOUSE_REPORT_BUTTON_TYPE_TIMER) && button_report->data2 != KONEPLUS_MOUSE_REPORT_BUTTON_ACTION_PRESS) return; roccat_report.type = button_report->type; roccat_report.data1 = button_report->data1; roccat_report.data2 = button_report->data2; roccat_report.profile = koneplus->actual_profile + 1; roccat_report_event(koneplus->chrdev_minor, (uint8_t const *)&roccat_report); } static int koneplus_raw_event(struct hid_device *hdev, struct hid_report *report, u8 *data, int size) { struct usb_interface *intf = to_usb_interface(hdev->dev.parent); struct koneplus_device *koneplus = hid_get_drvdata(hdev); if (intf->cur_altsetting->desc.bInterfaceProtocol != USB_INTERFACE_PROTOCOL_MOUSE) return 0; if (koneplus == NULL) return 0; koneplus_keep_values_up_to_date(koneplus, data); if (koneplus->roccat_claimed) koneplus_report_to_chrdev(koneplus, data); return 0; } static const struct hid_device_id koneplus_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_ROCCAT, USB_DEVICE_ID_ROCCAT_KONEPLUS) }, { HID_USB_DEVICE(USB_VENDOR_ID_ROCCAT, USB_DEVICE_ID_ROCCAT_KONEXTD) }, { } }; MODULE_DEVICE_TABLE(hid, koneplus_devices); static struct hid_driver koneplus_driver = { .name = 
"koneplus", .id_table = koneplus_devices, .probe = koneplus_probe, .remove = koneplus_remove, .raw_event = koneplus_raw_event }; static int __init koneplus_init(void) { int retval; /* class name has to be same as driver name */ retval = class_register(&koneplus_class); if (retval) return retval; retval = hid_register_driver(&koneplus_driver); if (retval) class_unregister(&koneplus_class); return retval; } static void __exit koneplus_exit(void) { hid_unregister_driver(&koneplus_driver); class_unregister(&koneplus_class); } module_init(koneplus_init); module_exit(koneplus_exit); MODULE_AUTHOR("Stefan Achatz"); MODULE_DESCRIPTION("USB Roccat Kone[+]/XTD driver"); MODULE_LICENSE("GPL v2"); |
// SPDX-License-Identifier: GPL-2.0 /* * linux/fs/stat.c * * Copyright (C) 1991, 1992 Linus Torvalds */ #include <linux/blkdev.h> #include <linux/export.h> #include <linux/mm.h> #include <linux/errno.h> #include <linux/file.h> #include <linux/highuid.h> #include <linux/fs.h> #include <linux/namei.h> #include <linux/security.h> #include <linux/cred.h> #include <linux/syscalls.h> #include <linux/pagemap.h> #include <linux/compat.h> #include <linux/iversion.h> #include <linux/uaccess.h> #include <asm/unistd.h> #include <trace/events/timestamp.h> #include "internal.h" #include "mount.h" /** * fill_mg_cmtime - Fill in the mtime and ctime and flag ctime as QUERIED * @stat: where to store the resulting values * @request_mask: STATX_* values requested * @inode: inode from which to grab the c/mtime * * Given @inode, grab the ctime and mtime out of it and store the result * in @stat. When fetching the value, flag it as QUERIED (if not already) * so the next write will record a distinct timestamp. * * NB: The QUERIED flag is tracked in the ctime, but we set it there even * if only the mtime was requested, as that ensures that the next mtime * change will be distinct. */ void fill_mg_cmtime(struct kstat *stat, u32 request_mask, struct inode *inode) { atomic_t *pcn = (atomic_t *)&inode->i_ctime_nsec; /* If neither time was requested, then don't report them */ if (!(request_mask & (STATX_CTIME|STATX_MTIME))) { stat->result_mask &= ~(STATX_CTIME|STATX_MTIME); return; } stat->mtime = inode_get_mtime(inode); stat->ctime.tv_sec = inode->i_ctime_sec; stat->ctime.tv_nsec = (u32)atomic_read(pcn); if (!(stat->ctime.tv_nsec & I_CTIME_QUERIED)) stat->ctime.tv_nsec = ((u32)atomic_fetch_or(I_CTIME_QUERIED, pcn)); stat->ctime.tv_nsec &= ~I_CTIME_QUERIED; trace_fill_mg_cmtime(inode, &stat->ctime, &stat->mtime); } EXPORT_SYMBOL(fill_mg_cmtime); /** * generic_fillattr - Fill in the basic attributes from the inode struct * @idmap: idmap of the mount the inode was found from * @request_mask: statx request_mask * @inode: Inode to use as the source * @stat: Where to fill in the attributes * * Fill in the basic attributes in the kstat structure from data that's to be * found on the VFS inode structure. This is the default if no getattr inode * operation is supplied. * * If the inode has been found through an idmapped mount the idmap of * the vfsmount must be passed through @idmap. This function will then * take care to map the inode according to @idmap before filling in the * uid and gid fields. On non-idmapped mounts or if permission checking is to be * performed on the raw inode simply pass @nop_mnt_idmap.
*/ void generic_fillattr(struct mnt_idmap *idmap, u32 request_mask, struct inode *inode, struct kstat *stat) { vfsuid_t vfsuid = i_uid_into_vfsuid(idmap, inode); vfsgid_t vfsgid = i_gid_into_vfsgid(idmap, inode); stat->dev = inode->i_sb->s_dev; stat->ino = inode->i_ino; stat->mode = inode->i_mode; stat->nlink = inode->i_nlink; stat->uid = vfsuid_into_kuid(vfsuid); stat->gid = vfsgid_into_kgid(vfsgid); stat->rdev = inode->i_rdev; stat->size = i_size_read(inode); stat->atime = inode_get_atime(inode); if (is_mgtime(inode)) { fill_mg_cmtime(stat, request_mask, inode); } else { stat->ctime = inode_get_ctime(inode); stat->mtime = inode_get_mtime(inode); } stat->blksize = i_blocksize(inode); stat->blocks = inode->i_blocks; if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { stat->result_mask |= STATX_CHANGE_COOKIE; stat->change_cookie = inode_query_iversion(inode); } } EXPORT_SYMBOL(generic_fillattr); /** * generic_fill_statx_attr - Fill in the statx attributes from the inode flags * @inode: Inode to use as the source * @stat: Where to fill in the attribute flags * * Fill in the STATX_ATTR_* flags in the kstat structure for properties of the * inode that are published on i_flags and enforced by the VFS. */ void generic_fill_statx_attr(struct inode *inode, struct kstat *stat) { if (inode->i_flags & S_IMMUTABLE) stat->attributes |= STATX_ATTR_IMMUTABLE; if (inode->i_flags & S_APPEND) stat->attributes |= STATX_ATTR_APPEND; stat->attributes_mask |= KSTAT_ATTR_VFS_FLAGS; } EXPORT_SYMBOL(generic_fill_statx_attr); /** * generic_fill_statx_atomic_writes - Fill in atomic writes statx attributes * @stat: Where to fill in the attribute flags * @unit_min: Minimum supported atomic write length in bytes * @unit_max: Maximum supported atomic write length in bytes * @unit_max_opt: Optimised maximum supported atomic write length in bytes * * Fill in the STATX{_ATTR}_WRITE_ATOMIC flags in the kstat structure from * atomic write unit_min and unit_max values. */ void generic_fill_statx_atomic_writes(struct kstat *stat, unsigned int unit_min, unsigned int unit_max, unsigned int unit_max_opt) { /* Confirm that the request type is known */ stat->result_mask |= STATX_WRITE_ATOMIC; /* Confirm that the file attribute type is known */ stat->attributes_mask |= STATX_ATTR_WRITE_ATOMIC; if (unit_min) { stat->atomic_write_unit_min = unit_min; stat->atomic_write_unit_max = unit_max; stat->atomic_write_unit_max_opt = unit_max_opt; /* Initially only allow 1x segment */ stat->atomic_write_segments_max = 1; /* Confirm atomic writes are actually supported */ stat->attributes |= STATX_ATTR_WRITE_ATOMIC; } } EXPORT_SYMBOL_GPL(generic_fill_statx_atomic_writes); /** * vfs_getattr_nosec - getattr without security checks * @path: file to get attributes from * @stat: structure to return attributes in * @request_mask: STATX_xxx flags indicating what the caller wants * @query_flags: Query mode (AT_STATX_SYNC_TYPE) * * Get attributes without calling security_inode_getattr. * * Currently the only caller other than vfs_getattr is internal to the * filehandle lookup code, which uses only the inode number and returns no * attributes to any user. Any other code probably wants vfs_getattr. 
*/ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags) { struct mnt_idmap *idmap; struct inode *inode = d_backing_inode(path->dentry); memset(stat, 0, sizeof(*stat)); stat->result_mask |= STATX_BASIC_STATS; query_flags &= AT_STATX_SYNC_TYPE; /* allow the fs to override these if it really wants to */ /* SB_NOATIME means filesystem supplies dummy atime value */ if (inode->i_sb->s_flags & SB_NOATIME) stat->result_mask &= ~STATX_ATIME; /* * Note: If you add another clause to set an attribute flag, please * update attributes_mask below. */ if (IS_AUTOMOUNT(inode)) stat->attributes |= STATX_ATTR_AUTOMOUNT; if (IS_DAX(inode)) stat->attributes |= STATX_ATTR_DAX; stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | STATX_ATTR_DAX); idmap = mnt_idmap(path->mnt); if (inode->i_op->getattr) { int ret; ret = inode->i_op->getattr(idmap, path, stat, request_mask, query_flags); if (ret) return ret; } else { generic_fillattr(idmap, request_mask, inode, stat); } /* * If this is a block device inode, override the filesystem attributes * with the block device specific parameters that need to be obtained * from the bdev backing inode. */ if (S_ISBLK(stat->mode)) bdev_statx(path, stat, request_mask); return 0; } EXPORT_SYMBOL(vfs_getattr_nosec); /* * vfs_getattr - Get the enhanced basic attributes of a file * @path: The file of interest * @stat: Where to return the statistics * @request_mask: STATX_xxx flags indicating what the caller wants * @query_flags: Query mode (AT_STATX_SYNC_TYPE) * * Ask the filesystem for a file's attributes. The caller must indicate in * request_mask and query_flags to indicate what they want. * * If the file is remote, the filesystem can be forced to update the attributes * from the backing store by passing AT_STATX_FORCE_SYNC in query_flags or can * suppress the update by passing AT_STATX_DONT_SYNC. * * Bits must have been set in request_mask to indicate which attributes the * caller wants retrieving. Any such attribute not requested may be returned * anyway, but the value may be approximate, and, if remote, may not have been * synchronised with the server. * * 0 will be returned on success, and a -ve error code if unsuccessful. */ int vfs_getattr(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags) { int retval; retval = security_inode_getattr(path); if (unlikely(retval)) return retval; return vfs_getattr_nosec(path, stat, request_mask, query_flags); } EXPORT_SYMBOL(vfs_getattr); /** * vfs_fstat - Get the basic attributes by file descriptor * @fd: The file descriptor referring to the file of interest * @stat: The result structure to fill in. * * This function is a wrapper around vfs_getattr(). The main difference is * that it uses a file descriptor to determine the file location. * * 0 will be returned on success, and a -ve error code if unsuccessful. 
*/ int vfs_fstat(int fd, struct kstat *stat) { CLASS(fd_raw, f)(fd); if (fd_empty(f)) return -EBADF; return vfs_getattr(&fd_file(f)->f_path, stat, STATX_BASIC_STATS, 0); } static int statx_lookup_flags(int flags) { int lookup_flags = 0; if (!(flags & AT_SYMLINK_NOFOLLOW)) lookup_flags |= LOOKUP_FOLLOW; if (!(flags & AT_NO_AUTOMOUNT)) lookup_flags |= LOOKUP_AUTOMOUNT; return lookup_flags; } static int vfs_statx_path(struct path *path, int flags, struct kstat *stat, u32 request_mask) { int error = vfs_getattr(path, stat, request_mask, flags); if (error) return error; if (request_mask & STATX_MNT_ID_UNIQUE) { stat->mnt_id = real_mount(path->mnt)->mnt_id_unique; stat->result_mask |= STATX_MNT_ID_UNIQUE; } else { stat->mnt_id = real_mount(path->mnt)->mnt_id; stat->result_mask |= STATX_MNT_ID; } if (path_mounted(path)) stat->attributes |= STATX_ATTR_MOUNT_ROOT; stat->attributes_mask |= STATX_ATTR_MOUNT_ROOT; return 0; } static int vfs_statx_fd(int fd, int flags, struct kstat *stat, u32 request_mask) { CLASS(fd_raw, f)(fd); if (fd_empty(f)) return -EBADF; return vfs_statx_path(&fd_file(f)->f_path, flags, stat, request_mask); } /** * vfs_statx - Get basic and extra attributes by filename * @dfd: A file descriptor representing the base dir for a relative filename * @filename: The name of the file of interest * @flags: Flags to control the query * @stat: The result structure to fill in. * @request_mask: STATX_xxx flags indicating what the caller wants * * This function is a wrapper around vfs_getattr(). The main difference is * that it uses a filename and base directory to determine the file location. * Additionally, the use of AT_SYMLINK_NOFOLLOW in flags will prevent a symlink * at the given name from being referenced. * * 0 will be returned on success, and a -ve error code if unsuccessful. */ static int vfs_statx(int dfd, struct filename *filename, int flags, struct kstat *stat, u32 request_mask) { struct path path; unsigned int lookup_flags = statx_lookup_flags(flags); int error; if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT | AT_EMPTY_PATH | AT_STATX_SYNC_TYPE)) return -EINVAL; retry: error = filename_lookup(dfd, filename, lookup_flags, &path, NULL); if (error) return error; error = vfs_statx_path(&path, flags, stat, request_mask); path_put(&path); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } return error; } int vfs_fstatat(int dfd, const char __user *filename, struct kstat *stat, int flags) { int ret; int statx_flags = flags | AT_NO_AUTOMOUNT; struct filename *name = getname_maybe_null(filename, flags); if (!name && dfd >= 0) return vfs_fstat(dfd, stat); ret = vfs_statx(dfd, name, statx_flags, stat, STATX_BASIC_STATS); putname(name); return ret; } #ifdef __ARCH_WANT_OLD_STAT /* * For backward compatibility? Maybe this should be moved * into arch/i386 instead? */ static int cp_old_stat(struct kstat *stat, struct __old_kernel_stat __user * statbuf) { static int warncount = 5; struct __old_kernel_stat tmp; if (warncount > 0) { warncount--; printk(KERN_WARNING "VFS: Warning: %s using old stat() call. Recompile your binary.\n", current->comm); } else if (warncount < 0) { /* it's laughable, but... 
*/ warncount = 0; } memset(&tmp, 0, sizeof(struct __old_kernel_stat)); tmp.st_dev = old_encode_dev(stat->dev); tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; if (tmp.st_nlink != stat->nlink) return -EOVERFLOW; SET_UID(tmp.st_uid, from_kuid_munged(current_user_ns(), stat->uid)); SET_GID(tmp.st_gid, from_kgid_munged(current_user_ns(), stat->gid)); tmp.st_rdev = old_encode_dev(stat->rdev); #if BITS_PER_LONG == 32 if (stat->size > MAX_NON_LFS) return -EOVERFLOW; #endif tmp.st_size = stat->size; tmp.st_atime = stat->atime.tv_sec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_ctime = stat->ctime.tv_sec; return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? -EFAULT : 0; } SYSCALL_DEFINE2(stat, const char __user *, filename, struct __old_kernel_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_stat(filename, &stat); if (unlikely(error)) return error; return cp_old_stat(&stat, statbuf); } SYSCALL_DEFINE2(lstat, const char __user *, filename, struct __old_kernel_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_lstat(filename, &stat); if (unlikely(error)) return error; return cp_old_stat(&stat, statbuf); } SYSCALL_DEFINE2(fstat, unsigned int, fd, struct __old_kernel_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_fstat(fd, &stat); if (unlikely(error)) return error; return cp_old_stat(&stat, statbuf); } #endif /* __ARCH_WANT_OLD_STAT */ #ifdef __ARCH_WANT_NEW_STAT #ifndef INIT_STRUCT_STAT_PADDING # define INIT_STRUCT_STAT_PADDING(st) memset(&st, 0, sizeof(st)) #endif static int cp_new_stat(struct kstat *stat, struct stat __user *statbuf) { struct stat tmp; if (sizeof(tmp.st_dev) < 4 && !old_valid_dev(stat->dev)) return -EOVERFLOW; if (sizeof(tmp.st_rdev) < 4 && !old_valid_dev(stat->rdev)) return -EOVERFLOW; #if BITS_PER_LONG == 32 if (stat->size > MAX_NON_LFS) return -EOVERFLOW; #endif INIT_STRUCT_STAT_PADDING(tmp); tmp.st_dev = new_encode_dev(stat->dev); tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; if (tmp.st_nlink != stat->nlink) return -EOVERFLOW; SET_UID(tmp.st_uid, from_kuid_munged(current_user_ns(), stat->uid)); SET_GID(tmp.st_gid, from_kgid_munged(current_user_ns(), stat->gid)); tmp.st_rdev = new_encode_dev(stat->rdev); tmp.st_size = stat->size; tmp.st_atime = stat->atime.tv_sec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_ctime = stat->ctime.tv_sec; #ifdef STAT_HAVE_NSEC tmp.st_atime_nsec = stat->atime.tv_nsec; tmp.st_mtime_nsec = stat->mtime.tv_nsec; tmp.st_ctime_nsec = stat->ctime.tv_nsec; #endif tmp.st_blocks = stat->blocks; tmp.st_blksize = stat->blksize; return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? 
-EFAULT : 0; } SYSCALL_DEFINE2(newstat, const char __user *, filename, struct stat __user *, statbuf) { struct kstat stat; int error; error = vfs_stat(filename, &stat); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } SYSCALL_DEFINE2(newlstat, const char __user *, filename, struct stat __user *, statbuf) { struct kstat stat; int error; error = vfs_lstat(filename, &stat); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } #if !defined(__ARCH_WANT_STAT64) || defined(__ARCH_WANT_SYS_NEWFSTATAT) SYSCALL_DEFINE4(newfstatat, int, dfd, const char __user *, filename, struct stat __user *, statbuf, int, flag) { struct kstat stat; int error; error = vfs_fstatat(dfd, filename, &stat, flag); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } #endif SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct stat __user *, statbuf) { struct kstat stat; int error; error = vfs_fstat(fd, &stat); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } #endif static int do_readlinkat(int dfd, const char __user *pathname, char __user *buf, int bufsiz) { struct path path; struct filename *name; int error; unsigned int lookup_flags = LOOKUP_EMPTY; if (bufsiz <= 0) return -EINVAL; retry: name = getname_flags(pathname, lookup_flags); error = filename_lookup(dfd, name, lookup_flags, &path, NULL); if (unlikely(error)) { putname(name); return error; } /* * AFS mountpoints allow readlink(2) but are not symlinks */ if (d_is_symlink(path.dentry) || d_backing_inode(path.dentry)->i_op->readlink) { error = security_inode_readlink(path.dentry); if (!error) { touch_atime(&path); error = vfs_readlink(path.dentry, buf, bufsiz); } } else { error = (name->name[0] == '\0') ? -ENOENT : -EINVAL; } path_put(&path); putname(name); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } return error; } SYSCALL_DEFINE4(readlinkat, int, dfd, const char __user *, pathname, char __user *, buf, int, bufsiz) { return do_readlinkat(dfd, pathname, buf, bufsiz); } SYSCALL_DEFINE3(readlink, const char __user *, path, char __user *, buf, int, bufsiz) { return do_readlinkat(AT_FDCWD, path, buf, bufsiz); } /* ---------- LFS-64 ----------- */ #if defined(__ARCH_WANT_STAT64) || defined(__ARCH_WANT_COMPAT_STAT64) #ifndef INIT_STRUCT_STAT64_PADDING # define INIT_STRUCT_STAT64_PADDING(st) memset(&st, 0, sizeof(st)) #endif static long cp_new_stat64(struct kstat *stat, struct stat64 __user *statbuf) { struct stat64 tmp; INIT_STRUCT_STAT64_PADDING(tmp); #ifdef CONFIG_MIPS /* mips has weird padding, so we don't get 64 bits there */ tmp.st_dev = new_encode_dev(stat->dev); tmp.st_rdev = new_encode_dev(stat->rdev); #else tmp.st_dev = huge_encode_dev(stat->dev); tmp.st_rdev = huge_encode_dev(stat->rdev); #endif tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; #ifdef STAT64_HAS_BROKEN_ST_INO tmp.__st_ino = stat->ino; #endif tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; tmp.st_uid = from_kuid_munged(current_user_ns(), stat->uid); tmp.st_gid = from_kgid_munged(current_user_ns(), stat->gid); tmp.st_atime = stat->atime.tv_sec; tmp.st_atime_nsec = stat->atime.tv_nsec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_mtime_nsec = stat->mtime.tv_nsec; tmp.st_ctime = stat->ctime.tv_sec; tmp.st_ctime_nsec = stat->ctime.tv_nsec; tmp.st_size = stat->size; tmp.st_blocks = stat->blocks; tmp.st_blksize = stat->blksize; return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? 
-EFAULT : 0; } SYSCALL_DEFINE2(stat64, const char __user *, filename, struct stat64 __user *, statbuf) { struct kstat stat; int error = vfs_stat(filename, &stat); if (!error) error = cp_new_stat64(&stat, statbuf); return error; } SYSCALL_DEFINE2(lstat64, const char __user *, filename, struct stat64 __user *, statbuf) { struct kstat stat; int error = vfs_lstat(filename, &stat); if (!error) error = cp_new_stat64(&stat, statbuf); return error; } SYSCALL_DEFINE2(fstat64, unsigned long, fd, struct stat64 __user *, statbuf) { struct kstat stat; int error = vfs_fstat(fd, &stat); if (!error) error = cp_new_stat64(&stat, statbuf); return error; } SYSCALL_DEFINE4(fstatat64, int, dfd, const char __user *, filename, struct stat64 __user *, statbuf, int, flag) { struct kstat stat; int error; error = vfs_fstatat(dfd, filename, &stat, flag); if (error) return error; return cp_new_stat64(&stat, statbuf); } #endif /* __ARCH_WANT_STAT64 || __ARCH_WANT_COMPAT_STAT64 */ static noinline_for_stack int cp_statx(const struct kstat *stat, struct statx __user *buffer) { struct statx tmp; memset(&tmp, 0, sizeof(tmp)); /* STATX_CHANGE_COOKIE is kernel-only for now */ tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE; tmp.stx_blksize = stat->blksize; /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */ tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC; tmp.stx_nlink = stat->nlink; tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); tmp.stx_mode = stat->mode; tmp.stx_ino = stat->ino; tmp.stx_size = stat->size; tmp.stx_blocks = stat->blocks; tmp.stx_attributes_mask = stat->attributes_mask; tmp.stx_atime.tv_sec = stat->atime.tv_sec; tmp.stx_atime.tv_nsec = stat->atime.tv_nsec; tmp.stx_btime.tv_sec = stat->btime.tv_sec; tmp.stx_btime.tv_nsec = stat->btime.tv_nsec; tmp.stx_ctime.tv_sec = stat->ctime.tv_sec; tmp.stx_ctime.tv_nsec = stat->ctime.tv_nsec; tmp.stx_mtime.tv_sec = stat->mtime.tv_sec; tmp.stx_mtime.tv_nsec = stat->mtime.tv_nsec; tmp.stx_rdev_major = MAJOR(stat->rdev); tmp.stx_rdev_minor = MINOR(stat->rdev); tmp.stx_dev_major = MAJOR(stat->dev); tmp.stx_dev_minor = MINOR(stat->dev); tmp.stx_mnt_id = stat->mnt_id; tmp.stx_dio_mem_align = stat->dio_mem_align; tmp.stx_dio_offset_align = stat->dio_offset_align; tmp.stx_dio_read_offset_align = stat->dio_read_offset_align; tmp.stx_subvol = stat->subvol; tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min; tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max; tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max; tmp.stx_atomic_write_unit_max_opt = stat->atomic_write_unit_max_opt; return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0; } int do_statx(int dfd, struct filename *filename, unsigned int flags, unsigned int mask, struct statx __user *buffer) { struct kstat stat; int error; if (mask & STATX__RESERVED) return -EINVAL; if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; /* * STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests * from userland. */ mask &= ~STATX_CHANGE_COOKIE; error = vfs_statx(dfd, filename, flags, &stat, mask); if (error) return error; return cp_statx(&stat, buffer); } int do_statx_fd(int fd, unsigned int flags, unsigned int mask, struct statx __user *buffer) { struct kstat stat; int error; if (mask & STATX__RESERVED) return -EINVAL; if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; /* * STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests * from userland. 
*/ mask &= ~STATX_CHANGE_COOKIE; error = vfs_statx_fd(fd, flags, &stat, mask); if (error) return error; return cp_statx(&stat, buffer); } /** * sys_statx - System call to get enhanced stats * @dfd: Base directory to pathwalk from *or* fd to stat. * @filename: File to stat or either NULL or "" with AT_EMPTY_PATH * @flags: AT_* flags to control pathwalk. * @mask: Parts of statx struct actually required. * @buffer: Result buffer. * * Note that fstat() can be emulated by setting dfd to the fd of interest, * supplying "" (or preferably NULL) as the filename and setting AT_EMPTY_PATH * in the flags. */ SYSCALL_DEFINE5(statx, int, dfd, const char __user *, filename, unsigned, flags, unsigned int, mask, struct statx __user *, buffer) { int ret; struct filename *name = getname_maybe_null(filename, flags); if (!name && dfd >= 0) return do_statx_fd(dfd, flags & ~AT_NO_AUTOMOUNT, mask, buffer); ret = do_statx(dfd, name, flags, mask, buffer); putname(name); return ret; } #if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_STAT) static int cp_compat_stat(struct kstat *stat, struct compat_stat __user *ubuf) { struct compat_stat tmp; if (sizeof(tmp.st_dev) < 4 && !old_valid_dev(stat->dev)) return -EOVERFLOW; if (sizeof(tmp.st_rdev) < 4 && !old_valid_dev(stat->rdev)) return -EOVERFLOW; memset(&tmp, 0, sizeof(tmp)); tmp.st_dev = new_encode_dev(stat->dev); tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; if (tmp.st_nlink != stat->nlink) return -EOVERFLOW; SET_UID(tmp.st_uid, from_kuid_munged(current_user_ns(), stat->uid)); SET_GID(tmp.st_gid, from_kgid_munged(current_user_ns(), stat->gid)); tmp.st_rdev = new_encode_dev(stat->rdev); if ((u64) stat->size > MAX_NON_LFS) return -EOVERFLOW; tmp.st_size = stat->size; tmp.st_atime = stat->atime.tv_sec; tmp.st_atime_nsec = stat->atime.tv_nsec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_mtime_nsec = stat->mtime.tv_nsec; tmp.st_ctime = stat->ctime.tv_sec; tmp.st_ctime_nsec = stat->ctime.tv_nsec; tmp.st_blocks = stat->blocks; tmp.st_blksize = stat->blksize; return copy_to_user(ubuf, &tmp, sizeof(tmp)) ? -EFAULT : 0; } COMPAT_SYSCALL_DEFINE2(newstat, const char __user *, filename, struct compat_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_stat(filename, &stat); if (error) return error; return cp_compat_stat(&stat, statbuf); } COMPAT_SYSCALL_DEFINE2(newlstat, const char __user *, filename, struct compat_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_lstat(filename, &stat); if (error) return error; return cp_compat_stat(&stat, statbuf); } #ifndef __ARCH_WANT_STAT64 COMPAT_SYSCALL_DEFINE4(newfstatat, unsigned int, dfd, const char __user *, filename, struct compat_stat __user *, statbuf, int, flag) { struct kstat stat; int error; error = vfs_fstatat(dfd, filename, &stat, flag); if (error) return error; return cp_compat_stat(&stat, statbuf); } #endif COMPAT_SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct compat_stat __user *, statbuf) { struct kstat stat; int error = vfs_fstat(fd, &stat); if (!error) error = cp_compat_stat(&stat, statbuf); return error; } #endif /* Caller is here responsible for sufficient locking (ie. 
inode->i_lock) */ void __inode_add_bytes(struct inode *inode, loff_t bytes) { inode->i_blocks += bytes >> 9; bytes &= 511; inode->i_bytes += bytes; if (inode->i_bytes >= 512) { inode->i_blocks++; inode->i_bytes -= 512; } } EXPORT_SYMBOL(__inode_add_bytes); void inode_add_bytes(struct inode *inode, loff_t bytes) { spin_lock(&inode->i_lock); __inode_add_bytes(inode, bytes); spin_unlock(&inode->i_lock); } EXPORT_SYMBOL(inode_add_bytes); void __inode_sub_bytes(struct inode *inode, loff_t bytes) { inode->i_blocks -= bytes >> 9; bytes &= 511; if (inode->i_bytes < bytes) { inode->i_blocks--; inode->i_bytes += 512; } inode->i_bytes -= bytes; } EXPORT_SYMBOL(__inode_sub_bytes); void inode_sub_bytes(struct inode *inode, loff_t bytes) { spin_lock(&inode->i_lock); __inode_sub_bytes(inode, bytes); spin_unlock(&inode->i_lock); } EXPORT_SYMBOL(inode_sub_bytes); loff_t inode_get_bytes(struct inode *inode) { loff_t ret; spin_lock(&inode->i_lock); ret = __inode_get_bytes(inode); spin_unlock(&inode->i_lock); return ret; } EXPORT_SYMBOL(inode_get_bytes); void inode_set_bytes(struct inode *inode, loff_t bytes) { /* Caller is here responsible for sufficient locking * (ie. inode->i_lock) */ inode->i_blocks = bytes >> 9; inode->i_bytes = bytes & 511; } EXPORT_SYMBOL(inode_set_bytes);
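/*
 * Illustrative sketch, not part of the original file: the i_blocks/i_bytes
 * pair maintained above stores an inode's space usage as "whole 512-byte
 * sectors plus a remainder below 512".  The stand-alone model below (all
 * names hypothetical) mirrors the carry logic of __inode_add_bytes() so the
 * arithmetic can be checked in isolation.
 */
struct byte_counter {
	unsigned long long blocks;	/* number of complete 512-byte sectors */
	unsigned int bytes;		/* leftover bytes, always < 512 */
};

static void byte_counter_add(struct byte_counter *c, unsigned long long nbytes)
{
	c->blocks += nbytes >> 9;	/* whole sectors */
	c->bytes += nbytes & 511;	/* remainder, may now reach 1022 */
	if (c->bytes >= 512) {		/* carry the overflow into blocks */
		c->blocks++;
		c->bytes -= 512;
	}
}
/*
 * Example: adding 1000 bytes twice gives blocks = 3, bytes = 464,
 * i.e. 3 * 512 + 464 = 2000, matching the total that was added.
 */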
// SPDX-License-Identifier: GPL-2.0-or-later /* * Network event notifiers * * Authors: * Tom Tucker <tom@opengridcomputing.com> * Steve Wise <swise@opengridcomputing.com> * * Fixes: */ #include <linux/rtnetlink.h> #include <linux/notifier.h> #include <linux/export.h> #include <net/netevent.h> static ATOMIC_NOTIFIER_HEAD(netevent_notif_chain); /** * register_netevent_notifier - register a netevent notifier block * @nb: notifier * * Register a notifier to be called when a netevent occurs. * The notifier passed is linked into the kernel structures and must * not be reused until it has been unregistered. A negative errno code * is returned on a failure. */ int register_netevent_notifier(struct notifier_block *nb) { return atomic_notifier_chain_register(&netevent_notif_chain, nb); } EXPORT_SYMBOL_GPL(register_netevent_notifier); /** * unregister_netevent_notifier - unregister a netevent notifier block * @nb: notifier * * Unregister a notifier previously registered by * register_netevent_notifier(). The notifier is unlinked from the * kernel structures and may then be reused. A negative errno code * is returned on a failure. */ int unregister_netevent_notifier(struct notifier_block *nb) { return atomic_notifier_chain_unregister(&netevent_notif_chain, nb); } EXPORT_SYMBOL_GPL(unregister_netevent_notifier); /** * call_netevent_notifiers - call all netevent notifier blocks * @val: value passed unmodified to notifier function * @v: pointer passed unmodified to notifier function * * Call all netevent notifier blocks. Parameters and return value * are as for notifier_call_chain(). */ int call_netevent_notifiers(unsigned long val, void *v) { return atomic_notifier_call_chain(&netevent_notif_chain, val, v); } EXPORT_SYMBOL_GPL(call_netevent_notifiers);
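/*
 * Illustrative sketch, not part of the original file: a minimal consumer of
 * the netevent chain implemented above.  The callback and variable names are
 * hypothetical; register_netevent_notifier(), unregister_netevent_notifier()
 * and NETEVENT_NEIGH_UPDATE are the real API from this file and
 * <net/netevent.h>.
 */
#include <linux/notifier.h>
#include <linux/printk.h>
#include <net/netevent.h>

static int demo_netevent_cb(struct notifier_block *nb, unsigned long event,
			    void *ptr)
{
	if (event == NETEVENT_NEIGH_UPDATE)
		pr_debug("netevent: neighbour entry updated\n");
	return NOTIFY_DONE;
}

static struct notifier_block demo_netevent_nb = {
	.notifier_call = demo_netevent_cb,
};

/*
 * A module would call register_netevent_notifier(&demo_netevent_nb) from its
 * init path and unregister_netevent_notifier(&demo_netevent_nb) on exit.
 */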
// SPDX-License-Identifier: GPL-2.0-or-later /* client.c: NFS client sharing and management code * * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
* Written by David Howells (dhowells@redhat.com) */ #include <linux/module.h> #include <linux/init.h> #include <linux/sched.h> #include <linux/time.h> #include <linux/kernel.h> #include <linux/mm.h> #include <linux/string.h> #include <linux/stat.h> #include <linux/errno.h> #include <linux/unistd.h> #include <linux/sunrpc/addr.h> #include <linux/sunrpc/clnt.h> #include <linux/sunrpc/stats.h> #include <linux/sunrpc/metrics.h> #include <linux/sunrpc/xprtsock.h> #include <linux/sunrpc/xprtrdma.h> #include <linux/nfs_fs.h> #include <linux/nfs_mount.h> #include <linux/nfs4_mount.h> #include <linux/lockd/bind.h> #include <linux/seq_file.h> #include <linux/mount.h> #include <linux/vfs.h> #include <linux/inet.h> #include <linux/in6.h> #include <linux/slab.h> #include <linux/idr.h> #include <net/ipv6.h> #include <linux/nfs_xdr.h> #include <linux/sunrpc/bc_xprt.h> #include <linux/nsproxy.h> #include <linux/pid_namespace.h> #include <linux/nfslocalio.h> #include "nfs4_fs.h" #include "callback.h" #include "delegation.h" #include "iostat.h" #include "internal.h" #include "fscache.h" #include "pnfs.h" #include "nfs.h" #include "netns.h" #include "sysfs.h" #include "nfs42.h" #define NFSDBG_FACILITY NFSDBG_CLIENT static DECLARE_WAIT_QUEUE_HEAD(nfs_client_active_wq); static DEFINE_RWLOCK(nfs_version_lock); static struct nfs_subversion *nfs_version_mods[5] = { [2] = NULL, [3] = NULL, [4] = NULL, }; /* * RPC cruft for NFS */ static const struct rpc_version *nfs_version[5] = { [2] = NULL, [3] = NULL, [4] = NULL, }; const struct rpc_program nfs_program = { .name = "nfs", .number = NFS_PROGRAM, .nrvers = ARRAY_SIZE(nfs_version), .version = nfs_version, .pipe_dir_name = NFS_PIPE_DIRNAME, }; static struct nfs_subversion *__find_nfs_version(unsigned int version) { struct nfs_subversion *nfs; read_lock(&nfs_version_lock); nfs = nfs_version_mods[version]; read_unlock(&nfs_version_lock); return nfs; } struct nfs_subversion *find_nfs_version(unsigned int version) { struct nfs_subversion *nfs = __find_nfs_version(version); if (!nfs && request_module("nfsv%d", version) == 0) nfs = __find_nfs_version(version); if (!nfs) return ERR_PTR(-EPROTONOSUPPORT); if (!get_nfs_version(nfs)) return ERR_PTR(-EAGAIN); return nfs; } int get_nfs_version(struct nfs_subversion *nfs) { return try_module_get(nfs->owner); } EXPORT_SYMBOL_GPL(get_nfs_version); void put_nfs_version(struct nfs_subversion *nfs) { module_put(nfs->owner); } void register_nfs_version(struct nfs_subversion *nfs) { write_lock(&nfs_version_lock); nfs_version_mods[nfs->rpc_ops->version] = nfs; nfs_version[nfs->rpc_ops->version] = nfs->rpc_vers; write_unlock(&nfs_version_lock); } EXPORT_SYMBOL_GPL(register_nfs_version); void unregister_nfs_version(struct nfs_subversion *nfs) { write_lock(&nfs_version_lock); nfs_version[nfs->rpc_ops->version] = NULL; nfs_version_mods[nfs->rpc_ops->version] = NULL; write_unlock(&nfs_version_lock); } EXPORT_SYMBOL_GPL(unregister_nfs_version); /* * Allocate a shared client record * * Since these are allocated/deallocated very rarely, we don't * bother putting them in a slab cache... 
*/ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init) { struct nfs_client *clp; int err = -ENOMEM; if ((clp = kzalloc(sizeof(*clp), GFP_KERNEL)) == NULL) goto error_0; clp->cl_minorversion = cl_init->minorversion; clp->cl_nfs_mod = cl_init->nfs_mod; if (!get_nfs_version(clp->cl_nfs_mod)) goto error_dealloc; clp->rpc_ops = clp->cl_nfs_mod->rpc_ops; refcount_set(&clp->cl_count, 1); clp->cl_cons_state = NFS_CS_INITING; memcpy(&clp->cl_addr, cl_init->addr, cl_init->addrlen); clp->cl_addrlen = cl_init->addrlen; if (cl_init->hostname) { err = -ENOMEM; clp->cl_hostname = kstrdup(cl_init->hostname, GFP_KERNEL); if (!clp->cl_hostname) goto error_cleanup; } INIT_LIST_HEAD(&clp->cl_superblocks); clp->cl_rpcclient = ERR_PTR(-EINVAL); clp->cl_flags = cl_init->init_flags; clp->cl_proto = cl_init->proto; clp->cl_nconnect = cl_init->nconnect; clp->cl_max_connect = cl_init->max_connect ? cl_init->max_connect : 1; clp->cl_net = get_net_track(cl_init->net, &clp->cl_ns_tracker, GFP_KERNEL); #if IS_ENABLED(CONFIG_NFS_LOCALIO) seqlock_init(&clp->cl_boot_lock); ktime_get_real_ts64(&clp->cl_nfssvc_boot); nfs_uuid_init(&clp->cl_uuid); INIT_WORK(&clp->cl_local_probe_work, nfs_local_probe_async_work); #endif /* CONFIG_NFS_LOCALIO */ clp->cl_principal = "*"; clp->cl_xprtsec = cl_init->xprtsec; return clp; error_cleanup: put_nfs_version(clp->cl_nfs_mod); error_dealloc: kfree(clp); error_0: return ERR_PTR(err); } EXPORT_SYMBOL_GPL(nfs_alloc_client); #if IS_ENABLED(CONFIG_NFS_V4) static void nfs_cleanup_cb_ident_idr(struct net *net) { struct nfs_net *nn = net_generic(net, nfs_net_id); idr_destroy(&nn->cb_ident_idr); } /* nfs_client_lock held */ static void nfs_cb_idr_remove_locked(struct nfs_client *clp) { struct nfs_net *nn = net_generic(clp->cl_net, nfs_net_id); if (clp->cl_cb_ident) idr_remove(&nn->cb_ident_idr, clp->cl_cb_ident); } static void pnfs_init_server(struct nfs_server *server) { rpc_init_wait_queue(&server->roc_rpcwaitq, "pNFS ROC"); } #else static void nfs_cleanup_cb_ident_idr(struct net *net) { } static void nfs_cb_idr_remove_locked(struct nfs_client *clp) { } static void pnfs_init_server(struct nfs_server *server) { } #endif /* CONFIG_NFS_V4 */ /* * Destroy a shared client record */ void nfs_free_client(struct nfs_client *clp) { nfs_localio_disable_client(clp); /* -EIO all pending I/O */ if (!IS_ERR(clp->cl_rpcclient)) rpc_shutdown_client(clp->cl_rpcclient); put_net_track(clp->cl_net, &clp->cl_ns_tracker); put_nfs_version(clp->cl_nfs_mod); kfree(clp->cl_hostname); kfree(clp->cl_acceptor); kfree_rcu(clp, rcu); } EXPORT_SYMBOL_GPL(nfs_free_client); /* * Release a reference to a shared client record */ void nfs_put_client(struct nfs_client *clp) { struct nfs_net *nn; if (!clp) return; nn = net_generic(clp->cl_net, nfs_net_id); if (refcount_dec_and_lock(&clp->cl_count, &nn->nfs_client_lock)) { list_del(&clp->cl_share_link); nfs_cb_idr_remove_locked(clp); spin_unlock(&nn->nfs_client_lock); WARN_ON_ONCE(!list_empty(&clp->cl_superblocks)); clp->rpc_ops->free_client(clp); } } EXPORT_SYMBOL_GPL(nfs_put_client); /* * Find an nfs_client on the list that matches the initialisation data * that is supplied. 
*/ static struct nfs_client *nfs_match_client(const struct nfs_client_initdata *data) { struct nfs_client *clp; const struct sockaddr *sap = (struct sockaddr *)data->addr; struct nfs_net *nn = net_generic(data->net, nfs_net_id); int error; again: list_for_each_entry(clp, &nn->nfs_client_list, cl_share_link) { const struct sockaddr *clap = (struct sockaddr *)&clp->cl_addr; /* Don't match clients that failed to initialise properly */ if (clp->cl_cons_state < 0) continue; /* If a client is still initializing then we need to wait */ if (clp->cl_cons_state > NFS_CS_READY) { refcount_inc(&clp->cl_count); spin_unlock(&nn->nfs_client_lock); error = nfs_wait_client_init_complete(clp); nfs_put_client(clp); spin_lock(&nn->nfs_client_lock); if (error < 0) return ERR_PTR(error); goto again; } /* Different NFS versions cannot share the same nfs_client */ if (clp->rpc_ops != data->nfs_mod->rpc_ops) continue; if (clp->cl_proto != data->proto) continue; /* Match nfsv4 minorversion */ if (clp->cl_minorversion != data->minorversion) continue; /* Match request for a dedicated DS */ if (test_bit(NFS_CS_DS, &data->init_flags) != test_bit(NFS_CS_DS, &clp->cl_flags)) continue; /* Match the full socket address */ if (!rpc_cmp_addr_port(sap, clap)) /* Match all xprt_switch full socket addresses */ if (IS_ERR(clp->cl_rpcclient) || !rpc_clnt_xprt_switch_has_addr(clp->cl_rpcclient, sap)) continue; /* Match the xprt security policy */ if (clp->cl_xprtsec.policy != data->xprtsec.policy) continue; refcount_inc(&clp->cl_count); return clp; } return NULL; } /* * Return true if @clp is done initializing, false if still working on it. * * Use nfs_client_init_status to check if it was successful. */ bool nfs_client_init_is_complete(const struct nfs_client *clp) { return clp->cl_cons_state <= NFS_CS_READY; } EXPORT_SYMBOL_GPL(nfs_client_init_is_complete); /* * Return 0 if @clp was successfully initialized, -errno otherwise. * * This must be called *after* nfs_client_init_is_complete() returns true, * otherwise it will pop WARN_ON_ONCE and return -EINVAL */ int nfs_client_init_status(const struct nfs_client *clp) { /* called without checking nfs_client_init_is_complete */ if (clp->cl_cons_state > NFS_CS_READY) { WARN_ON_ONCE(1); return -EINVAL; } return clp->cl_cons_state; } EXPORT_SYMBOL_GPL(nfs_client_init_status); int nfs_wait_client_init_complete(const struct nfs_client *clp) { return wait_event_killable(nfs_client_active_wq, nfs_client_init_is_complete(clp)); } EXPORT_SYMBOL_GPL(nfs_wait_client_init_complete); /* * Found an existing client. Make sure it's ready before returning. 
*/ static struct nfs_client * nfs_found_client(const struct nfs_client_initdata *cl_init, struct nfs_client *clp) { int error; error = nfs_wait_client_init_complete(clp); if (error < 0) { nfs_put_client(clp); return ERR_PTR(-ERESTARTSYS); } if (clp->cl_cons_state < NFS_CS_READY) { error = clp->cl_cons_state; nfs_put_client(clp); return ERR_PTR(error); } smp_rmb(); return clp; } /* * Look up a client by IP address and protocol version * - creates a new record if one doesn't yet exist */ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) { struct nfs_client *clp, *new = NULL; struct nfs_net *nn = net_generic(cl_init->net, nfs_net_id); const struct nfs_rpc_ops *rpc_ops = cl_init->nfs_mod->rpc_ops; if (cl_init->hostname == NULL) { WARN_ON(1); return ERR_PTR(-EINVAL); } /* see if the client already exists */ do { spin_lock(&nn->nfs_client_lock); clp = nfs_match_client(cl_init); if (clp) { spin_unlock(&nn->nfs_client_lock); if (new) new->rpc_ops->free_client(new); if (IS_ERR(clp)) return clp; return nfs_found_client(cl_init, clp); } if (new) { list_add_tail(&new->cl_share_link, &nn->nfs_client_list); spin_unlock(&nn->nfs_client_lock); new = rpc_ops->init_client(new, cl_init); if (!IS_ERR(new)) nfs_local_probe_async(new); return new; } spin_unlock(&nn->nfs_client_lock); new = rpc_ops->alloc_client(cl_init); } while (!IS_ERR(new)); return new; } EXPORT_SYMBOL_GPL(nfs_get_client); /* * Mark a server as ready or failed */ void nfs_mark_client_ready(struct nfs_client *clp, int state) { smp_wmb(); clp->cl_cons_state = state; wake_up_all(&nfs_client_active_wq); } EXPORT_SYMBOL_GPL(nfs_mark_client_ready); /* * Initialise the timeout values for a connection */ void nfs_init_timeout_values(struct rpc_timeout *to, int proto, int timeo, int retrans) { to->to_initval = timeo * HZ / 10; to->to_retries = retrans; switch (proto) { case XPRT_TRANSPORT_TCP: case XPRT_TRANSPORT_TCP_TLS: case XPRT_TRANSPORT_RDMA: if (retrans == NFS_UNSPEC_RETRANS) to->to_retries = NFS_DEF_TCP_RETRANS; if (timeo == NFS_UNSPEC_TIMEO || to->to_initval == 0) to->to_initval = NFS_DEF_TCP_TIMEO * HZ / 10; if (to->to_initval > NFS_MAX_TCP_TIMEOUT) to->to_initval = NFS_MAX_TCP_TIMEOUT; to->to_increment = to->to_initval; to->to_maxval = to->to_initval + (to->to_increment * to->to_retries); if (to->to_maxval > NFS_MAX_TCP_TIMEOUT) to->to_maxval = NFS_MAX_TCP_TIMEOUT; if (to->to_maxval < to->to_initval) to->to_maxval = to->to_initval; to->to_exponential = 0; break; case XPRT_TRANSPORT_UDP: if (retrans == NFS_UNSPEC_RETRANS) to->to_retries = NFS_DEF_UDP_RETRANS; if (timeo == NFS_UNSPEC_TIMEO || to->to_initval == 0) to->to_initval = NFS_DEF_UDP_TIMEO * HZ / 10; if (to->to_initval > NFS_MAX_UDP_TIMEOUT) to->to_initval = NFS_MAX_UDP_TIMEOUT; to->to_maxval = NFS_MAX_UDP_TIMEOUT; to->to_exponential = 1; break; default: BUG(); } } EXPORT_SYMBOL_GPL(nfs_init_timeout_values); /* * Create an RPC client handle */ int nfs_create_rpc_client(struct nfs_client *clp, const struct nfs_client_initdata *cl_init, rpc_authflavor_t flavor) { struct nfs_net *nn = net_generic(clp->cl_net, nfs_net_id); struct rpc_clnt *clnt = NULL; struct rpc_create_args args = { .net = clp->cl_net, .protocol = clp->cl_proto, .nconnect = clp->cl_nconnect, .address = (struct sockaddr *)&clp->cl_addr, .addrsize = clp->cl_addrlen, .timeout = cl_init->timeparms, .servername = clp->cl_hostname, .nodename = cl_init->nodename, .program = &nfs_program, .stats = &nn->rpcstats, .version = clp->rpc_ops->version, .authflavor = flavor, .cred = cl_init->cred, .xprtsec = 
cl_init->xprtsec, .connect_timeout = cl_init->connect_timeout, .reconnect_timeout = cl_init->reconnect_timeout, }; if (test_bit(NFS_CS_DISCRTRY, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_DISCRTRY; if (test_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT; if (test_bit(NFS_CS_NORESVPORT, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_NONPRIVPORT; if (test_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_INFINITE_SLOTS; if (test_bit(NFS_CS_NOPING, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_NOPING; if (test_bit(NFS_CS_REUSEPORT, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_REUSEPORT; if (test_bit(NFS_CS_NETUNREACH_FATAL, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_NETUNREACH_FATAL; if (!IS_ERR(clp->cl_rpcclient)) return 0; clnt = rpc_create(&args); if (IS_ERR(clnt)) { dprintk("%s: cannot create RPC client. Error = %ld\n", __func__, PTR_ERR(clnt)); return PTR_ERR(clnt); } clnt->cl_principal = clp->cl_principal; clp->cl_rpcclient = clnt; clnt->cl_max_connect = clp->cl_max_connect; return 0; } EXPORT_SYMBOL_GPL(nfs_create_rpc_client); /* * Version 2 or 3 client destruction */ static void nfs_destroy_server(struct nfs_server *server) { if (server->nlm_host) nlmclnt_done(server->nlm_host); } /* * Version 2 or 3 lockd setup */ static int nfs_start_lockd(struct nfs_server *server) { struct nlm_host *host; struct nfs_client *clp = server->nfs_client; struct nlmclnt_initdata nlm_init = { .hostname = clp->cl_hostname, .address = (struct sockaddr *)&clp->cl_addr, .addrlen = clp->cl_addrlen, .nfs_version = clp->rpc_ops->version, .noresvport = server->flags & NFS_MOUNT_NORESVPORT ? 1 : 0, .net = clp->cl_net, .nlmclnt_ops = clp->cl_nfs_mod->rpc_ops->nlmclnt_ops, .cred = server->cred, }; if (nlm_init.nfs_version > 3) return 0; if ((server->flags & NFS_MOUNT_LOCAL_FLOCK) && (server->flags & NFS_MOUNT_LOCAL_FCNTL)) return 0; switch (clp->cl_proto) { default: nlm_init.protocol = IPPROTO_TCP; break; #ifndef CONFIG_NFS_DISABLE_UDP_SUPPORT case XPRT_TRANSPORT_UDP: nlm_init.protocol = IPPROTO_UDP; #endif } host = nlmclnt_init(&nlm_init); if (IS_ERR(host)) return PTR_ERR(host); server->nlm_host = host; server->destroy = nfs_destroy_server; nfs_sysfs_link_rpc_client(server, nlmclnt_rpc_clnt(host), NULL); return 0; } /* * Create a general RPC client */ int nfs_init_server_rpcclient(struct nfs_server *server, const struct rpc_timeout *timeo, rpc_authflavor_t pseudoflavour) { struct nfs_client *clp = server->nfs_client; server->client = rpc_clone_client_set_auth(clp->cl_rpcclient, pseudoflavour); if (IS_ERR(server->client)) { dprintk("%s: couldn't create rpc_client!\n", __func__); return PTR_ERR(server->client); } memcpy(&server->client->cl_timeout_default, timeo, sizeof(server->client->cl_timeout_default)); server->client->cl_timeout = &server->client->cl_timeout_default; server->client->cl_softrtry = 0; if (server->flags & NFS_MOUNT_SOFTERR) server->client->cl_softerr = 1; if (server->flags & NFS_MOUNT_SOFT) server->client->cl_softrtry = 1; nfs_sysfs_link_rpc_client(server, server->client, NULL); return 0; } EXPORT_SYMBOL_GPL(nfs_init_server_rpcclient); /** * nfs_init_client - Initialise an NFS2 or NFS3 client * * @clp: nfs_client to initialise * @cl_init: Initialisation parameters * * Returns pointer to an NFS client, or an ERR_PTR value. 
*/ struct nfs_client *nfs_init_client(struct nfs_client *clp, const struct nfs_client_initdata *cl_init) { int error; /* the client is already initialised */ if (clp->cl_cons_state == NFS_CS_READY) return clp; /* * Create a client RPC handle for doing FSSTAT with UNIX auth only * - RFC 2623, sec 2.3.2 */ error = nfs_create_rpc_client(clp, cl_init, RPC_AUTH_UNIX); nfs_mark_client_ready(clp, error == 0 ? NFS_CS_READY : error); if (error < 0) { nfs_put_client(clp); clp = ERR_PTR(error); } return clp; } EXPORT_SYMBOL_GPL(nfs_init_client); /* * Create a version 2 or 3 client */ static int nfs_init_server(struct nfs_server *server, const struct fs_context *fc) { const struct nfs_fs_context *ctx = nfs_fc2context(fc); struct rpc_timeout timeparms; struct nfs_client_initdata cl_init = { .hostname = ctx->nfs_server.hostname, .addr = &ctx->nfs_server._address, .addrlen = ctx->nfs_server.addrlen, .nfs_mod = ctx->nfs_mod, .proto = ctx->nfs_server.protocol, .net = fc->net_ns, .timeparms = &timeparms, .cred = server->cred, .nconnect = ctx->nfs_server.nconnect, .init_flags = (1UL << NFS_CS_REUSEPORT), .xprtsec = ctx->xprtsec, }; struct nfs_client *clp; int error; nfs_init_timeout_values(&timeparms, ctx->nfs_server.protocol, ctx->timeo, ctx->retrans); if (ctx->flags & NFS_MOUNT_NORESVPORT) set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags); if (ctx->flags & NFS_MOUNT_NETUNREACH_FATAL) __set_bit(NFS_CS_NETUNREACH_FATAL, &cl_init.init_flags); /* Allocate or find a client reference we can use */ clp = nfs_get_client(&cl_init); if (IS_ERR(clp)) return PTR_ERR(clp); server->nfs_client = clp; nfs_sysfs_add_server(server); nfs_sysfs_link_rpc_client(server, clp->cl_rpcclient, "_state"); /* Initialise the client representation from the mount data */ server->flags = ctx->flags; server->options = ctx->options; server->caps |= NFS_CAP_HARDLINKS | NFS_CAP_SYMLINKS; switch (clp->rpc_ops->version) { case 2: server->fattr_valid = NFS_ATTR_FATTR_V2; break; case 3: server->fattr_valid = NFS_ATTR_FATTR_V3; break; default: server->fattr_valid = NFS_ATTR_FATTR_V4; } if (ctx->rsize) server->rsize = nfs_io_size(ctx->rsize, clp->cl_proto); if (ctx->wsize) server->wsize = nfs_io_size(ctx->wsize, clp->cl_proto); server->acregmin = ctx->acregmin * HZ; server->acregmax = ctx->acregmax * HZ; server->acdirmin = ctx->acdirmin * HZ; server->acdirmax = ctx->acdirmax * HZ; /* Start lockd here, before we might error out */ error = nfs_start_lockd(server); if (error < 0) goto error; server->port = ctx->nfs_server.port; server->auth_info = ctx->auth_info; error = nfs_init_server_rpcclient(server, &timeparms, ctx->selected_flavor); if (error < 0) goto error; /* Preserve the values of mount_server-related mount options */ if (ctx->mount_server.addrlen) { memcpy(&server->mountd_address, &ctx->mount_server.address, ctx->mount_server.addrlen); server->mountd_addrlen = ctx->mount_server.addrlen; } server->mountd_version = ctx->mount_server.version; server->mountd_port = ctx->mount_server.port; server->mountd_protocol = ctx->mount_server.protocol; server->namelen = ctx->namlen; return 0; error: server->nfs_client = NULL; nfs_put_client(clp); return error; } /* * Load up the server record from information gained in an fsinfo record */ static void nfs_server_set_fsinfo(struct nfs_server *server, struct nfs_fsinfo *fsinfo) { struct nfs_client *clp = server->nfs_client; unsigned long max_rpc_payload, raw_max_rpc_payload; /* Work out a lot of parameters */ if (server->rsize == 0) server->rsize = nfs_io_size(fsinfo->rtpref, clp->cl_proto); if (server->wsize 
== 0) server->wsize = nfs_io_size(fsinfo->wtpref, clp->cl_proto); if (fsinfo->rtmax >= 512 && server->rsize > fsinfo->rtmax) server->rsize = nfs_io_size(fsinfo->rtmax, clp->cl_proto); if (fsinfo->wtmax >= 512 && server->wsize > fsinfo->wtmax) server->wsize = nfs_io_size(fsinfo->wtmax, clp->cl_proto); raw_max_rpc_payload = rpc_max_payload(server->client); max_rpc_payload = nfs_block_size(raw_max_rpc_payload, NULL); if (server->rsize > max_rpc_payload) server->rsize = max_rpc_payload; if (server->rsize > NFS_MAX_FILE_IO_SIZE) server->rsize = NFS_MAX_FILE_IO_SIZE; server->rpages = (server->rsize + PAGE_SIZE - 1) >> PAGE_SHIFT; if (server->wsize > max_rpc_payload) server->wsize = max_rpc_payload; if (server->wsize > NFS_MAX_FILE_IO_SIZE) server->wsize = NFS_MAX_FILE_IO_SIZE; server->wpages = (server->wsize + PAGE_SIZE - 1) >> PAGE_SHIFT; server->wtmult = nfs_block_bits(fsinfo->wtmult, NULL); server->dtsize = nfs_block_size(fsinfo->dtpref, NULL); if (server->dtsize > NFS_MAX_FILE_IO_SIZE) server->dtsize = NFS_MAX_FILE_IO_SIZE; if (server->dtsize > server->rsize) server->dtsize = server->rsize; if (server->flags & NFS_MOUNT_NOAC) { server->acregmin = server->acregmax = 0; server->acdirmin = server->acdirmax = 0; } server->maxfilesize = fsinfo->maxfilesize; server->time_delta = fsinfo->time_delta; server->change_attr_type = fsinfo->change_attr_type; server->clone_blksize = fsinfo->clone_blksize; /* We're airborne Set socket buffersize */ rpc_setbufsize(server->client, server->wsize + 100, server->rsize + 100); #ifdef CONFIG_NFS_V4_2 /* * Defaults until limited by the session parameters. */ server->gxasize = min_t(unsigned int, raw_max_rpc_payload, XATTR_SIZE_MAX); server->sxasize = min_t(unsigned int, raw_max_rpc_payload, XATTR_SIZE_MAX); server->lxasize = min_t(unsigned int, raw_max_rpc_payload, nfs42_listxattr_xdrsize(XATTR_LIST_MAX)); if (fsinfo->xattr_support) server->caps |= NFS_CAP_XATTR; #endif } /* * Probe filesystem information, including the FSID on v2/v3 */ static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, struct nfs_fattr *fattr) { struct nfs_fsinfo fsinfo; struct nfs_client *clp = server->nfs_client; int error; if (clp->rpc_ops->set_capabilities != NULL) { error = clp->rpc_ops->set_capabilities(server, mntfh); if (error < 0) return error; } fsinfo.fattr = fattr; fsinfo.nlayouttypes = 0; memset(fsinfo.layouttype, 0, sizeof(fsinfo.layouttype)); error = clp->rpc_ops->fsinfo(server, mntfh, &fsinfo); if (error < 0) return error; nfs_server_set_fsinfo(server, &fsinfo); /* Get some general file system info */ if (server->namelen == 0) { struct nfs_pathconf pathinfo; pathinfo.fattr = fattr; nfs_fattr_init(fattr); if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0) server->namelen = pathinfo.max_namelen; } if (clp->rpc_ops->discover_trunking != NULL && (server->caps & NFS_CAP_FS_LOCATIONS && (server->flags & NFS_MOUNT_TRUNK_DISCOVERY))) { error = clp->rpc_ops->discover_trunking(server, mntfh); if (error < 0) return error; } return 0; } /* * Grab the destination's particulars, including lease expiry time. * * Returns zero if probe succeeded and retrieved FSID matches the FSID * we have cached. */ int nfs_probe_server(struct nfs_server *server, struct nfs_fh *mntfh) { struct nfs_fattr *fattr; int error; fattr = nfs_alloc_fattr(); if (fattr == NULL) return -ENOMEM; /* Sanity: the probe won't work if the destination server * does not recognize the migrated FH. 
*/ error = nfs_probe_fsinfo(server, mntfh, fattr); nfs_free_fattr(fattr); return error; } EXPORT_SYMBOL_GPL(nfs_probe_server); /* * Copy useful information when duplicating a server record */ void nfs_server_copy_userdata(struct nfs_server *target, struct nfs_server *source) { target->flags = source->flags; target->rsize = source->rsize; target->wsize = source->wsize; target->acregmin = source->acregmin; target->acregmax = source->acregmax; target->acdirmin = source->acdirmin; target->acdirmax = source->acdirmax; target->caps = source->caps; target->options = source->options; target->auth_info = source->auth_info; target->port = source->port; } EXPORT_SYMBOL_GPL(nfs_server_copy_userdata); void nfs_server_insert_lists(struct nfs_server *server) { struct nfs_client *clp = server->nfs_client; struct nfs_net *nn = net_generic(clp->cl_net, nfs_net_id); spin_lock(&nn->nfs_client_lock); list_add_tail_rcu(&server->client_link, &clp->cl_superblocks); list_add_tail(&server->master_link, &nn->nfs_volume_list); clear_bit(NFS_CS_STOP_RENEW, &clp->cl_res_state); spin_unlock(&nn->nfs_client_lock); } EXPORT_SYMBOL_GPL(nfs_server_insert_lists); void nfs_server_remove_lists(struct nfs_server *server) { struct nfs_client *clp = server->nfs_client; struct nfs_net *nn; if (clp == NULL) return; nn = net_generic(clp->cl_net, nfs_net_id); spin_lock(&nn->nfs_client_lock); list_del_rcu(&server->client_link); if (list_empty(&clp->cl_superblocks)) set_bit(NFS_CS_STOP_RENEW, &clp->cl_res_state); list_del(&server->master_link); spin_unlock(&nn->nfs_client_lock); synchronize_rcu(); } EXPORT_SYMBOL_GPL(nfs_server_remove_lists); static DEFINE_IDA(s_sysfs_ids); /* * Allocate and initialise a server record */ struct nfs_server *nfs_alloc_server(void) { struct nfs_server *server; server = kzalloc(sizeof(struct nfs_server), GFP_KERNEL); if (!server) return NULL; server->s_sysfs_id = ida_alloc(&s_sysfs_ids, GFP_KERNEL); if (server->s_sysfs_id < 0) { kfree(server); return NULL; } server->client = server->client_acl = ERR_PTR(-EINVAL); /* Zero out the NFS state stuff */ INIT_LIST_HEAD(&server->client_link); INIT_LIST_HEAD(&server->master_link); INIT_LIST_HEAD(&server->delegations); INIT_LIST_HEAD(&server->layouts); INIT_LIST_HEAD(&server->state_owners_lru); INIT_LIST_HEAD(&server->ss_copies); INIT_LIST_HEAD(&server->ss_src_copies); atomic_set(&server->active, 0); server->io_stats = nfs_alloc_iostats(); if (!server->io_stats) { kfree(server); return NULL; } server->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED; init_waitqueue_head(&server->write_congestion_wait); atomic_long_set(&server->writeback, 0); atomic64_set(&server->owner_ctr, 0); pnfs_init_server(server); rpc_init_wait_queue(&server->uoc_rpcwaitq, "NFS UOC"); return server; } EXPORT_SYMBOL_GPL(nfs_alloc_server); static void delayed_free(struct rcu_head *p) { struct nfs_server *server = container_of(p, struct nfs_server, rcu); nfs_free_iostats(server->io_stats); kfree(server); } /* * Free up a server record */ void nfs_free_server(struct nfs_server *server) { nfs_server_remove_lists(server); if (server->destroy != NULL) server->destroy(server); if (!IS_ERR(server->client_acl)) rpc_shutdown_client(server->client_acl); if (!IS_ERR(server->client)) rpc_shutdown_client(server->client); nfs_put_client(server->nfs_client); if (server->kobj.state_initialized) { nfs_sysfs_remove_server(server); kobject_put(&server->kobj); } ida_free(&s_sysfs_ids, server->s_sysfs_id); put_cred(server->cred); nfs_release_automount_timer(); call_rcu(&server->rcu, delayed_free); } 
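/*
 * Worked example, not part of the original file, for the
 * nfs_init_timeout_values() helper earlier in this file: timeo is given in
 * tenths of a second and retrans is a retransmit count.  For a TCP mount
 * with timeo=600 and retrans=2:
 *
 *   to_initval   = 600 * HZ / 10         = 60 * HZ   (60 s for the first try)
 *   to_increment = to_initval            = 60 * HZ
 *   to_maxval    = 60 * HZ + 60 * HZ * 2 = 180 * HZ
 *
 * Both values are then clamped to NFS_MAX_TCP_TIMEOUT, and to_exponential
 * stays 0, so the per-retransmission timeout grows linearly rather than
 * doubling as it does for UDP.
 */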
EXPORT_SYMBOL_GPL(nfs_free_server); /* * Create a version 2 or 3 volume record * - keyed on server and FSID */ struct nfs_server *nfs_create_server(struct fs_context *fc) { struct nfs_fs_context *ctx = nfs_fc2context(fc); struct nfs_server *server; struct nfs_fattr *fattr; int error; server = nfs_alloc_server(); if (!server) return ERR_PTR(-ENOMEM); server->cred = get_cred(fc->cred); error = -ENOMEM; fattr = nfs_alloc_fattr(); if (fattr == NULL) goto error; /* Get a client representation */ error = nfs_init_server(server, fc); if (error < 0) goto error; /* Probe the root fh to retrieve its FSID */ error = nfs_probe_fsinfo(server, ctx->mntfh, fattr); if (error < 0) goto error; if (server->nfs_client->rpc_ops->version == 3) { if (server->namelen == 0 || server->namelen > NFS3_MAXNAMLEN) server->namelen = NFS3_MAXNAMLEN; if (!(ctx->flags & NFS_MOUNT_NORDIRPLUS)) server->caps |= NFS_CAP_READDIRPLUS; } else { if (server->namelen == 0 || server->namelen > NFS2_MAXNAMLEN) server->namelen = NFS2_MAXNAMLEN; } /* Linux 'subtree_check' borkenness mandates this setting */ server->fh_expire_type = NFS_FH_VOL_RENAME; if (!(fattr->valid & NFS_ATTR_FATTR)) { error = ctx->nfs_mod->rpc_ops->getattr(server, ctx->mntfh, fattr, NULL); if (error < 0) { dprintk("nfs_create_server: getattr error = %d\n", -error); goto error; } } memcpy(&server->fsid, &fattr->fsid, sizeof(server->fsid)); dprintk("Server FSID: %llx:%llx\n", (unsigned long long) server->fsid.major, (unsigned long long) server->fsid.minor); nfs_server_insert_lists(server); server->mount_time = jiffies; nfs_free_fattr(fattr); return server; error: nfs_free_fattr(fattr); nfs_free_server(server); return ERR_PTR(error); } EXPORT_SYMBOL_GPL(nfs_create_server); /* * Clone an NFS2, NFS3 or NFS4 server record */ struct nfs_server *nfs_clone_server(struct nfs_server *source, struct nfs_fh *fh, struct nfs_fattr *fattr, rpc_authflavor_t flavor) { struct nfs_server *server; int error; server = nfs_alloc_server(); if (!server) return ERR_PTR(-ENOMEM); server->cred = get_cred(source->cred); /* Copy data from the source */ server->nfs_client = source->nfs_client; server->destroy = source->destroy; refcount_inc(&server->nfs_client->cl_count); nfs_server_copy_userdata(server, source); server->fsid = fattr->fsid; nfs_sysfs_add_server(server); nfs_sysfs_link_rpc_client(server, server->nfs_client->cl_rpcclient, "_state"); error = nfs_init_server_rpcclient(server, source->client->cl_timeout, flavor); if (error < 0) goto out_free_server; /* probe the filesystem info for this server filesystem */ error = nfs_probe_server(server, fh); if (error < 0) goto out_free_server; if (server->namelen == 0 || server->namelen > NFS4_MAXNAMLEN) server->namelen = NFS4_MAXNAMLEN; error = nfs_start_lockd(server); if (error < 0) goto out_free_server; nfs_server_insert_lists(server); server->mount_time = jiffies; return server; out_free_server: nfs_free_server(server); return ERR_PTR(error); } EXPORT_SYMBOL_GPL(nfs_clone_server); void nfs_clients_init(struct net *net) { struct nfs_net *nn = net_generic(net, nfs_net_id); INIT_LIST_HEAD(&nn->nfs_client_list); INIT_LIST_HEAD(&nn->nfs_volume_list); #if IS_ENABLED(CONFIG_NFS_V4) idr_init(&nn->cb_ident_idr); #endif #if IS_ENABLED(CONFIG_NFS_V4_1) INIT_LIST_HEAD(&nn->nfs4_data_server_cache); spin_lock_init(&nn->nfs4_data_server_lock); #endif spin_lock_init(&nn->nfs_client_lock); nn->boot_time = ktime_get_real(); memset(&nn->rpcstats, 0, sizeof(nn->rpcstats)); nn->rpcstats.program = &nfs_program; nfs_netns_sysfs_setup(nn, net); } void 
nfs_clients_exit(struct net *net) { struct nfs_net *nn = net_generic(net, nfs_net_id); nfs_netns_sysfs_destroy(nn); nfs_cleanup_cb_ident_idr(net); WARN_ON_ONCE(!list_empty(&nn->nfs_client_list)); WARN_ON_ONCE(!list_empty(&nn->nfs_volume_list)); #if IS_ENABLED(CONFIG_NFS_V4_1) WARN_ON_ONCE(!list_empty(&nn->nfs4_data_server_cache)); #endif } #ifdef CONFIG_PROC_FS static void *nfs_server_list_start(struct seq_file *p, loff_t *pos); static void *nfs_server_list_next(struct seq_file *p, void *v, loff_t *pos); static void nfs_server_list_stop(struct seq_file *p, void *v); static int nfs_server_list_show(struct seq_file *m, void *v); static const struct seq_operations nfs_server_list_ops = { .start = nfs_server_list_start, .next = nfs_server_list_next, .stop = nfs_server_list_stop, .show = nfs_server_list_show, }; static void *nfs_volume_list_start(struct seq_file *p, loff_t *pos); static void *nfs_volume_list_next(struct seq_file *p, void *v, loff_t *pos); static void nfs_volume_list_stop(struct seq_file *p, void *v); static int nfs_volume_list_show(struct seq_file *m, void *v); static const struct seq_operations nfs_volume_list_ops = { .start = nfs_volume_list_start, .next = nfs_volume_list_next, .stop = nfs_volume_list_stop, .show = nfs_volume_list_show, }; /* * set up the iterator to start reading from the server list and return the first item */ static void *nfs_server_list_start(struct seq_file *m, loff_t *_pos) __acquires(&nn->nfs_client_lock) { struct nfs_net *nn = net_generic(seq_file_net(m), nfs_net_id); /* lock the list against modification */ spin_lock(&nn->nfs_client_lock); return seq_list_start_head(&nn->nfs_client_list, *_pos); } /* * move to next server */ static void *nfs_server_list_next(struct seq_file *p, void *v, loff_t *pos) { struct nfs_net *nn = net_generic(seq_file_net(p), nfs_net_id); return seq_list_next(v, &nn->nfs_client_list, pos); } /* * clean up after reading from the transports list */ static void nfs_server_list_stop(struct seq_file *p, void *v) __releases(&nn->nfs_client_lock) { struct nfs_net *nn = net_generic(seq_file_net(p), nfs_net_id); spin_unlock(&nn->nfs_client_lock); } /* * display a header line followed by a load of call lines */ static int nfs_server_list_show(struct seq_file *m, void *v) { struct nfs_client *clp; struct nfs_net *nn = net_generic(seq_file_net(m), nfs_net_id); /* display header on line 1 */ if (v == &nn->nfs_client_list) { seq_puts(m, "NV SERVER PORT USE HOSTNAME\n"); return 0; } /* display one transport per line on subsequent lines */ clp = list_entry(v, struct nfs_client, cl_share_link); /* Check if the client is initialized */ if (clp->cl_cons_state != NFS_CS_READY) return 0; rcu_read_lock(); seq_printf(m, "v%u %s %s %3d %s\n", clp->rpc_ops->version, rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_ADDR), rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_PORT), refcount_read(&clp->cl_count), clp->cl_hostname); rcu_read_unlock(); return 0; } /* * set up the iterator to start reading from the volume list and return the first item */ static void *nfs_volume_list_start(struct seq_file *m, loff_t *_pos) __acquires(&nn->nfs_client_lock) { struct nfs_net *nn = net_generic(seq_file_net(m), nfs_net_id); /* lock the list against modification */ spin_lock(&nn->nfs_client_lock); return seq_list_start_head(&nn->nfs_volume_list, *_pos); } /* * move to next volume */ static void *nfs_volume_list_next(struct seq_file *p, void *v, loff_t *pos) { struct nfs_net *nn = net_generic(seq_file_net(p), nfs_net_id); return seq_list_next(v, 
&nn->nfs_volume_list, pos); } /* * clean up after reading from the volume list */ static void nfs_volume_list_stop(struct seq_file *p, void *v) __releases(&nn->nfs_client_lock) { struct nfs_net *nn = net_generic(seq_file_net(p), nfs_net_id); spin_unlock(&nn->nfs_client_lock); } /* * display a header line followed by a load of call lines */ static int nfs_volume_list_show(struct seq_file *m, void *v) { struct nfs_server *server; struct nfs_client *clp; char dev[13]; // 8 for 2^24, 1 for ':', 3 for 2^8, 1 for '\0' char fsid[34]; // 2 * 16 for %llx, 1 for ':', 1 for '\0' struct nfs_net *nn = net_generic(seq_file_net(m), nfs_net_id); /* display header on line 1 */ if (v == &nn->nfs_volume_list) { seq_puts(m, "NV SERVER PORT DEV FSID" " FSC\n"); return 0; } /* display one volume per line on subsequent lines */ server = list_entry(v, struct nfs_server, master_link); clp = server->nfs_client; snprintf(dev, sizeof(dev), "%u:%u", MAJOR(server->s_dev), MINOR(server->s_dev)); snprintf(fsid, sizeof(fsid), "%llx:%llx", (unsigned long long) server->fsid.major, (unsigned long long) server->fsid.minor); rcu_read_lock(); seq_printf(m, "v%u %s %s %-12s %-33s %s\n", clp->rpc_ops->version, rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_ADDR), rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_PORT), dev, fsid, nfs_server_fscache_state(server)); rcu_read_unlock(); return 0; } int nfs_fs_proc_net_init(struct net *net) { struct nfs_net *nn = net_generic(net, nfs_net_id); struct proc_dir_entry *p; nn->proc_nfsfs = proc_net_mkdir(net, "nfsfs", net->proc_net); if (!nn->proc_nfsfs) goto error_0; /* a file of servers with which we're dealing */ p = proc_create_net("servers", S_IFREG|S_IRUGO, nn->proc_nfsfs, &nfs_server_list_ops, sizeof(struct seq_net_private)); if (!p) goto error_1; /* a file of volumes that we have mounted */ p = proc_create_net("volumes", S_IFREG|S_IRUGO, nn->proc_nfsfs, &nfs_volume_list_ops, sizeof(struct seq_net_private)); if (!p) goto error_1; return 0; error_1: remove_proc_subtree("nfsfs", net->proc_net); error_0: return -ENOMEM; } void nfs_fs_proc_net_exit(struct net *net) { remove_proc_subtree("nfsfs", net->proc_net); } /* * initialise the /proc/fs/nfsfs/ directory */ int __init nfs_fs_proc_init(void) { if (!proc_mkdir("fs/nfsfs", NULL)) goto error_0; /* a file of servers with which we're dealing */ if (!proc_symlink("fs/nfsfs/servers", NULL, "../../net/nfsfs/servers")) goto error_1; /* a file of volumes that we have mounted */ if (!proc_symlink("fs/nfsfs/volumes", NULL, "../../net/nfsfs/volumes")) goto error_1; return 0; error_1: remove_proc_subtree("fs/nfsfs", NULL); error_0: return -ENOMEM; } /* * clean up the /proc/fs/nfsfs/ directory */ void nfs_fs_proc_exit(void) { remove_proc_subtree("fs/nfsfs", NULL); ida_destroy(&s_sysfs_ids); } #endif /* CONFIG_PROC_FS */
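/*
 * Illustrative sample, not part of the original file, of what the seq_file
 * handlers above emit; the address, refcount and FSID values below are
 * hypothetical.  RPC_DISPLAY_HEX_ADDR / RPC_DISPLAY_HEX_PORT print the peer
 * in hex, so a v4 mount of 192.168.0.1:2049 shows up as c0a80001 / 801:
 *
 *   $ cat /proc/fs/nfsfs/servers
 *   NV SERVER   PORT USE HOSTNAME
 *   v4 c0a80001  801   2 nfs.example.com
 *
 *   $ cat /proc/fs/nfsfs/volumes
 *   NV SERVER   PORT DEV     FSID               FSC
 *   v4 c0a80001  801 0:53    12345678:9abcdef0  no
 *
 * Column widths are approximate here; the exact layout follows the
 * seq_printf() format strings above.
 */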
// SPDX-License-Identifier: GPL-2.0-or-later /* * Asynchronous Compression operations * * Copyright (c) 2016, Intel Corporation * Authors: Weigang Li <weigang.li@intel.com> * Giovanni Cabiddu <giovanni.cabiddu@intel.com> */ #include <crypto/internal/acompress.h> #include <crypto/scatterwalk.h> #include <linux/cryptouser.h> #include <linux/cpumask.h> #include <linux/err.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/percpu.h> #include <linux/scatterlist.h> #include <linux/sched.h> #include <linux/seq_file.h> #include <linux/smp.h> #include <linux/spinlock.h> #include <linux/string.h> #include <linux/workqueue.h> #include <net/netlink.h> #include "compress.h" struct crypto_scomp; enum { ACOMP_WALK_SLEEP = 1 << 0, ACOMP_WALK_SRC_LINEAR = 1 << 1, ACOMP_WALK_DST_LINEAR = 1 << 2, }; static const struct crypto_type crypto_acomp_type; static void acomp_reqchain_done(void *data, int err); static inline struct acomp_alg *__crypto_acomp_alg(struct crypto_alg *alg) { return container_of(alg, struct acomp_alg, calg.base); } static inline struct acomp_alg *crypto_acomp_alg(struct crypto_acomp *tfm) { return __crypto_acomp_alg(crypto_acomp_tfm(tfm)->__crt_alg); } static int __maybe_unused
crypto_acomp_report( struct sk_buff *skb, struct crypto_alg *alg) { struct crypto_report_acomp racomp; memset(&racomp, 0, sizeof(racomp)); strscpy(racomp.type, "acomp", sizeof(racomp.type)); return nla_put(skb, CRYPTOCFGA_REPORT_ACOMP, sizeof(racomp), &racomp); } static void crypto_acomp_show(struct seq_file *m, struct crypto_alg *alg) __maybe_unused; static void crypto_acomp_show(struct seq_file *m, struct crypto_alg *alg) { seq_puts(m, "type : acomp\n"); } static void crypto_acomp_exit_tfm(struct crypto_tfm *tfm) { struct crypto_acomp *acomp = __crypto_acomp_tfm(tfm); struct acomp_alg *alg = crypto_acomp_alg(acomp); if (alg->exit) alg->exit(acomp); if (acomp_is_async(acomp)) crypto_free_acomp(crypto_acomp_fb(acomp)); } static int crypto_acomp_init_tfm(struct crypto_tfm *tfm) { struct crypto_acomp *acomp = __crypto_acomp_tfm(tfm); struct acomp_alg *alg = crypto_acomp_alg(acomp); struct crypto_acomp *fb = NULL; int err; if (tfm->__crt_alg->cra_type != &crypto_acomp_type) return crypto_init_scomp_ops_async(tfm); if (acomp_is_async(acomp)) { fb = crypto_alloc_acomp(crypto_acomp_alg_name(acomp), 0, CRYPTO_ALG_ASYNC); if (IS_ERR(fb)) return PTR_ERR(fb); err = -EINVAL; if (crypto_acomp_reqsize(fb) > MAX_SYNC_COMP_REQSIZE) goto out_free_fb; tfm->fb = crypto_acomp_tfm(fb); } acomp->compress = alg->compress; acomp->decompress = alg->decompress; acomp->reqsize = alg->base.cra_reqsize; acomp->base.exit = crypto_acomp_exit_tfm; if (!alg->init) return 0; err = alg->init(acomp); if (err) goto out_free_fb; return 0; out_free_fb: crypto_free_acomp(fb); return err; } static unsigned int crypto_acomp_extsize(struct crypto_alg *alg) { int extsize = crypto_alg_extsize(alg); if (alg->cra_type != &crypto_acomp_type) extsize += sizeof(struct crypto_scomp *); return extsize; } static const struct crypto_type crypto_acomp_type = { .extsize = crypto_acomp_extsize, .init_tfm = crypto_acomp_init_tfm, #ifdef CONFIG_PROC_FS .show = crypto_acomp_show, #endif #if IS_ENABLED(CONFIG_CRYPTO_USER) .report = crypto_acomp_report, #endif .maskclear = ~CRYPTO_ALG_TYPE_MASK, .maskset = CRYPTO_ALG_TYPE_ACOMPRESS_MASK, .type = CRYPTO_ALG_TYPE_ACOMPRESS, .tfmsize = offsetof(struct crypto_acomp, base), .algsize = offsetof(struct acomp_alg, base), }; struct crypto_acomp *crypto_alloc_acomp(const char *alg_name, u32 type, u32 mask) { return crypto_alloc_tfm(alg_name, &crypto_acomp_type, type, mask); } EXPORT_SYMBOL_GPL(crypto_alloc_acomp); struct crypto_acomp *crypto_alloc_acomp_node(const char *alg_name, u32 type, u32 mask, int node) { return crypto_alloc_tfm_node(alg_name, &crypto_acomp_type, type, mask, node); } EXPORT_SYMBOL_GPL(crypto_alloc_acomp_node); static void acomp_save_req(struct acomp_req *req, crypto_completion_t cplt) { struct acomp_req_chain *state = &req->chain; state->compl = req->base.complete; state->data = req->base.data; req->base.complete = cplt; req->base.data = state; } static void acomp_restore_req(struct acomp_req *req) { struct acomp_req_chain *state = req->base.data; req->base.complete = state->compl; req->base.data = state->data; } static void acomp_reqchain_virt(struct acomp_req *req) { struct acomp_req_chain *state = &req->chain; unsigned int slen = req->slen; unsigned int dlen = req->dlen; if (state->flags & CRYPTO_ACOMP_REQ_SRC_VIRT) acomp_request_set_src_dma(req, state->src, slen); if (state->flags & CRYPTO_ACOMP_REQ_DST_VIRT) acomp_request_set_dst_dma(req, state->dst, dlen); } static void acomp_virt_to_sg(struct acomp_req *req) { struct acomp_req_chain *state = &req->chain; state->flags = 
req->base.flags & (CRYPTO_ACOMP_REQ_SRC_VIRT | CRYPTO_ACOMP_REQ_DST_VIRT); if (acomp_request_src_isvirt(req)) { unsigned int slen = req->slen; const u8 *svirt = req->svirt; state->src = svirt; sg_init_one(&state->ssg, svirt, slen); acomp_request_set_src_sg(req, &state->ssg, slen); } if (acomp_request_dst_isvirt(req)) { unsigned int dlen = req->dlen; u8 *dvirt = req->dvirt; state->dst = dvirt; sg_init_one(&state->dsg, dvirt, dlen); acomp_request_set_dst_sg(req, &state->dsg, dlen); } } static int acomp_do_nondma(struct acomp_req *req, bool comp) { ACOMP_FBREQ_ON_STACK(fbreq, req); int err; if (comp) err = crypto_acomp_compress(fbreq); else err = crypto_acomp_decompress(fbreq); req->dlen = fbreq->dlen; return err; } static int acomp_do_one_req(struct acomp_req *req, bool comp) { if (acomp_request_isnondma(req)) return acomp_do_nondma(req, comp); acomp_virt_to_sg(req); return comp ? crypto_acomp_reqtfm(req)->compress(req) : crypto_acomp_reqtfm(req)->decompress(req); } static int acomp_reqchain_finish(struct acomp_req *req, int err) { acomp_reqchain_virt(req); acomp_restore_req(req); return err; } static void acomp_reqchain_done(void *data, int err) { struct acomp_req *req = data; crypto_completion_t compl; compl = req->chain.compl; data = req->chain.data; if (err == -EINPROGRESS) goto notify; err = acomp_reqchain_finish(req, err); notify: compl(data, err); } static int acomp_do_req_chain(struct acomp_req *req, bool comp) { int err; acomp_save_req(req, acomp_reqchain_done); err = acomp_do_one_req(req, comp); if (err == -EBUSY || err == -EINPROGRESS) return err; return acomp_reqchain_finish(req, err); } int crypto_acomp_compress(struct acomp_req *req) { struct crypto_acomp *tfm = crypto_acomp_reqtfm(req); if (acomp_req_on_stack(req) && acomp_is_async(tfm)) return -EAGAIN; if (crypto_acomp_req_virt(tfm) || acomp_request_issg(req)) return crypto_acomp_reqtfm(req)->compress(req); return acomp_do_req_chain(req, true); } EXPORT_SYMBOL_GPL(crypto_acomp_compress); int crypto_acomp_decompress(struct acomp_req *req) { struct crypto_acomp *tfm = crypto_acomp_reqtfm(req); if (acomp_req_on_stack(req) && acomp_is_async(tfm)) return -EAGAIN; if (crypto_acomp_req_virt(tfm) || acomp_request_issg(req)) return crypto_acomp_reqtfm(req)->decompress(req); return acomp_do_req_chain(req, false); } EXPORT_SYMBOL_GPL(crypto_acomp_decompress); void comp_prepare_alg(struct comp_alg_common *alg) { struct crypto_alg *base = &alg->base; base->cra_flags &= ~CRYPTO_ALG_TYPE_MASK; } int crypto_register_acomp(struct acomp_alg *alg) { struct crypto_alg *base = &alg->calg.base; comp_prepare_alg(&alg->calg); base->cra_type = &crypto_acomp_type; base->cra_flags |= CRYPTO_ALG_TYPE_ACOMPRESS; return crypto_register_alg(base); } EXPORT_SYMBOL_GPL(crypto_register_acomp); void crypto_unregister_acomp(struct acomp_alg *alg) { crypto_unregister_alg(&alg->base); } EXPORT_SYMBOL_GPL(crypto_unregister_acomp); int crypto_register_acomps(struct acomp_alg *algs, int count) { int i, ret; for (i = 0; i < count; i++) { ret = crypto_register_acomp(&algs[i]); if (ret) goto err; } return 0; err: for (--i; i >= 0; --i) crypto_unregister_acomp(&algs[i]); return ret; } EXPORT_SYMBOL_GPL(crypto_register_acomps); void crypto_unregister_acomps(struct acomp_alg *algs, int count) { int i; for (i = count - 1; i >= 0; --i) crypto_unregister_acomp(&algs[i]); } EXPORT_SYMBOL_GPL(crypto_unregister_acomps); static void acomp_stream_workfn(struct work_struct *work) { struct crypto_acomp_streams *s = container_of(work, struct crypto_acomp_streams, stream_work); 
struct crypto_acomp_stream __percpu *streams = s->streams; int cpu; for_each_cpu(cpu, &s->stream_want) { struct crypto_acomp_stream *ps; void *ctx; ps = per_cpu_ptr(streams, cpu); if (ps->ctx) continue; ctx = s->alloc_ctx(); if (IS_ERR(ctx)) break; spin_lock_bh(&ps->lock); ps->ctx = ctx; spin_unlock_bh(&ps->lock); cpumask_clear_cpu(cpu, &s->stream_want); } } void crypto_acomp_free_streams(struct crypto_acomp_streams *s) { struct crypto_acomp_stream __percpu *streams = s->streams; void (*free_ctx)(void *); int i; s->streams = NULL; if (!streams) return; cancel_work_sync(&s->stream_work); free_ctx = s->free_ctx; for_each_possible_cpu(i) { struct crypto_acomp_stream *ps = per_cpu_ptr(streams, i); if (!ps->ctx) continue; free_ctx(ps->ctx); } free_percpu(streams); } EXPORT_SYMBOL_GPL(crypto_acomp_free_streams); int crypto_acomp_alloc_streams(struct crypto_acomp_streams *s) { struct crypto_acomp_stream __percpu *streams; struct crypto_acomp_stream *ps; unsigned int i; void *ctx; if (s->streams) return 0; streams = alloc_percpu(struct crypto_acomp_stream); if (!streams) return -ENOMEM; ctx = s->alloc_ctx(); if (IS_ERR(ctx)) { free_percpu(streams); return PTR_ERR(ctx); } i = cpumask_first(cpu_possible_mask); ps = per_cpu_ptr(streams, i); ps->ctx = ctx; for_each_possible_cpu(i) { ps = per_cpu_ptr(streams, i); spin_lock_init(&ps->lock); } s->streams = streams; INIT_WORK(&s->stream_work, acomp_stream_workfn); return 0; } EXPORT_SYMBOL_GPL(crypto_acomp_alloc_streams); struct crypto_acomp_stream *crypto_acomp_lock_stream_bh( struct crypto_acomp_streams *s) __acquires(stream) { struct crypto_acomp_stream __percpu *streams = s->streams; int cpu = raw_smp_processor_id(); struct crypto_acomp_stream *ps; ps = per_cpu_ptr(streams, cpu); spin_lock_bh(&ps->lock); if (likely(ps->ctx)) return ps; spin_unlock(&ps->lock); cpumask_set_cpu(cpu, &s->stream_want); schedule_work(&s->stream_work); ps = per_cpu_ptr(streams, cpumask_first(cpu_possible_mask)); spin_lock(&ps->lock); return ps; } EXPORT_SYMBOL_GPL(crypto_acomp_lock_stream_bh); void acomp_walk_done_src(struct acomp_walk *walk, int used) { walk->slen -= used; if ((walk->flags & ACOMP_WALK_SRC_LINEAR)) scatterwalk_advance(&walk->in, used); else scatterwalk_done_src(&walk->in, used); if ((walk->flags & ACOMP_WALK_SLEEP)) cond_resched(); } EXPORT_SYMBOL_GPL(acomp_walk_done_src); void acomp_walk_done_dst(struct acomp_walk *walk, int used) { walk->dlen -= used; if ((walk->flags & ACOMP_WALK_DST_LINEAR)) scatterwalk_advance(&walk->out, used); else scatterwalk_done_dst(&walk->out, used); if ((walk->flags & ACOMP_WALK_SLEEP)) cond_resched(); } EXPORT_SYMBOL_GPL(acomp_walk_done_dst); int acomp_walk_next_src(struct acomp_walk *walk) { unsigned int slen = walk->slen; unsigned int max = UINT_MAX; if (!preempt_model_preemptible() && (walk->flags & ACOMP_WALK_SLEEP)) max = PAGE_SIZE; if ((walk->flags & ACOMP_WALK_SRC_LINEAR)) { walk->in.__addr = (void *)(((u8 *)walk->in.sg) + walk->in.offset); return min(slen, max); } return slen ? scatterwalk_next(&walk->in, slen) : 0; } EXPORT_SYMBOL_GPL(acomp_walk_next_src); int acomp_walk_next_dst(struct acomp_walk *walk) { unsigned int dlen = walk->dlen; unsigned int max = UINT_MAX; if (!preempt_model_preemptible() && (walk->flags & ACOMP_WALK_SLEEP)) max = PAGE_SIZE; if ((walk->flags & ACOMP_WALK_DST_LINEAR)) { walk->out.__addr = (void *)(((u8 *)walk->out.sg) + walk->out.offset); return min(dlen, max); } return dlen ? 
scatterwalk_next(&walk->out, dlen) : 0; } EXPORT_SYMBOL_GPL(acomp_walk_next_dst); int acomp_walk_virt(struct acomp_walk *__restrict walk, struct acomp_req *__restrict req, bool atomic) { struct scatterlist *src = req->src; struct scatterlist *dst = req->dst; walk->slen = req->slen; walk->dlen = req->dlen; if (!walk->slen || !walk->dlen) return -EINVAL; walk->flags = 0; if ((req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) && !atomic) walk->flags |= ACOMP_WALK_SLEEP; if ((req->base.flags & CRYPTO_ACOMP_REQ_SRC_VIRT)) walk->flags |= ACOMP_WALK_SRC_LINEAR; if ((req->base.flags & CRYPTO_ACOMP_REQ_DST_VIRT)) walk->flags |= ACOMP_WALK_DST_LINEAR; if ((walk->flags & ACOMP_WALK_SRC_LINEAR)) { walk->in.sg = (void *)req->svirt; walk->in.offset = 0; } else scatterwalk_start(&walk->in, src); if ((walk->flags & ACOMP_WALK_DST_LINEAR)) { walk->out.sg = (void *)req->dvirt; walk->out.offset = 0; } else scatterwalk_start(&walk->out, dst); return 0; } EXPORT_SYMBOL_GPL(acomp_walk_virt); struct acomp_req *acomp_request_clone(struct acomp_req *req, size_t total, gfp_t gfp) { struct acomp_req *nreq; nreq = container_of(crypto_request_clone(&req->base, total, gfp), struct acomp_req, base); if (nreq == req) return req; if (req->src == &req->chain.ssg) nreq->src = &nreq->chain.ssg; if (req->dst == &req->chain.dsg) nreq->dst = &nreq->chain.dsg; return nreq; } EXPORT_SYMBOL_GPL(acomp_request_clone); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Asynchronous compression type"); |
| 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Nano River Technologies viperboard driver * * This is the core driver for the viperboard. There are cell drivers * available for I2C, ADC and both GPIOs. SPI is not yet supported. * The drivers do not support all features the board exposes. See user * manual of the viperboard. * * (C) 2012 by Lemonage GmbH * Author: Lars Poeschel <poeschel@lemonage.de> * All rights reserved. */ #include <linux/kernel.h> #include <linux/errno.h> #include <linux/module.h> #include <linux/slab.h> #include <linux/types.h> #include <linux/mutex.h> #include <linux/mfd/core.h> #include <linux/mfd/viperboard.h> #include <linux/usb.h> static const struct usb_device_id vprbrd_table[] = { { USB_DEVICE(0x2058, 0x1005) }, /* Nano River Technologies */ { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(usb, vprbrd_table); static const struct mfd_cell vprbrd_devs[] = { { .name = "viperboard-gpio", }, { .name = "viperboard-i2c", }, { .name = "viperboard-adc", }, }; static int vprbrd_probe(struct usb_interface *interface, const struct usb_device_id *id) { struct vprbrd *vb; u16 version = 0; int pipe, ret; /* allocate memory for our device state and initialize it */ vb = kzalloc(sizeof(*vb), GFP_KERNEL); if (!vb) return -ENOMEM; mutex_init(&vb->lock); vb->usb_dev = usb_get_dev(interface_to_usbdev(interface)); /* save our data pointer in this interface device */ usb_set_intfdata(interface, vb); dev_set_drvdata(&vb->pdev.dev, vb); /* get version information, major first, minor then */ pipe = usb_rcvctrlpipe(vb->usb_dev, 0); ret = usb_control_msg(vb->usb_dev, pipe, VPRBRD_USB_REQUEST_MAJOR, VPRBRD_USB_TYPE_IN, 0x0000, 0x0000, vb->buf, 1, VPRBRD_USB_TIMEOUT_MS); if (ret == 1) version = vb->buf[0]; ret = usb_control_msg(vb->usb_dev, pipe, VPRBRD_USB_REQUEST_MINOR, VPRBRD_USB_TYPE_IN, 0x0000, 0x0000, vb->buf, 1, VPRBRD_USB_TIMEOUT_MS); if (ret == 1) { version <<= 8; version = version | vb->buf[0]; } dev_info(&interface->dev, "version %x.%02x found at bus %03d address %03d\n", version >> 8, version & 0xff, vb->usb_dev->bus->busnum, vb->usb_dev->devnum); ret = mfd_add_hotplug_devices(&interface->dev, vprbrd_devs, ARRAY_SIZE(vprbrd_devs)); if (ret != 0) { dev_err(&interface->dev, "Failed to add mfd devices to core."); goto error; } return 0; error: if (vb) { usb_put_dev(vb->usb_dev); kfree(vb); } return ret; } static void vprbrd_disconnect(struct usb_interface *interface) { struct vprbrd *vb = usb_get_intfdata(interface); mfd_remove_devices(&interface->dev); usb_set_intfdata(interface, NULL); usb_put_dev(vb->usb_dev); kfree(vb); dev_dbg(&interface->dev, "disconnected\n"); } static struct usb_driver vprbrd_driver = { .name = "viperboard", .probe = vprbrd_probe, .disconnect = vprbrd_disconnect, .id_table = vprbrd_table, }; module_usb_driver(vprbrd_driver); MODULE_DESCRIPTION("Nano River Technologies viperboard mfd core driver"); MODULE_AUTHOR("Lars Poeschel <poeschel@lemonage.de>"); MODULE_LICENSE("GPL"); |
| 197 197 197 182 1 182 195 196 1 3 196 66 18 16 18 3 15 63 57 63 12 12 66 60 60 60 46 60 20 18 60 8 8 1 7 27 27 27 26 7 3 27 3 22 22 22 26 7 12 10 5 2 8 8 8 12 8 4 4 8 8 8 8 3 5 3 8 23 10 10 10 2 8 2 1 2 1 1 2 10 23 10 10 10 16 16 4 12 12 12 12 16 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 | // SPDX-License-Identifier: GPL-2.0 /* * linux/fs/hfs/brec.c * * Copyright (C) 2001 * Brad Boyer (flar@allandria.com) * (C) 2003 Ardis Technologies <roman@ardistech.com> * * Handle individual btree records */ #include "btree.h" static struct hfs_bnode *hfs_bnode_split(struct hfs_find_data *fd); static int hfs_brec_update_parent(struct hfs_find_data *fd); static int hfs_btree_inc_height(struct hfs_btree *tree); /* Get the length and offset of the given record in the given node */ u16 hfs_brec_lenoff(struct hfs_bnode *node, u16 rec, u16 *off) { __be16 retval[2]; u16 dataoff; dataoff = node->tree->node_size - (rec + 2) * 2; hfs_bnode_read(node, retval, dataoff, 4); *off = be16_to_cpu(retval[1]); return be16_to_cpu(retval[0]) - *off; } /* Get the length of the key from a keyed record */ u16 hfs_brec_keylen(struct hfs_bnode *node, u16 rec) { u16 retval, recoff; if (node->type != HFS_NODE_INDEX && node->type != HFS_NODE_LEAF) return 0; if ((node->type == HFS_NODE_INDEX) && !(node->tree->attributes & HFS_TREE_VARIDXKEYS)) { if (node->tree->attributes & HFS_TREE_BIGKEYS) retval = node->tree->max_key_len + 2; else retval = node->tree->max_key_len + 1; } else { recoff = hfs_bnode_read_u16(node, node->tree->node_size - (rec + 1) * 2); if (!recoff) return 0; if (node->tree->attributes 
& HFS_TREE_BIGKEYS) { retval = hfs_bnode_read_u16(node, recoff) + 2; if (retval > node->tree->max_key_len + 2) { pr_err("keylen %d too large\n", retval); retval = 0; } } else { retval = (hfs_bnode_read_u8(node, recoff) | 1) + 1; if (retval > node->tree->max_key_len + 1) { pr_err("keylen %d too large\n", retval); retval = 0; } } } return retval; } int hfs_brec_insert(struct hfs_find_data *fd, void *entry, int entry_len) { struct hfs_btree *tree; struct hfs_bnode *node, *new_node; int size, key_len, rec; int data_off, end_off; int idx_rec_off, data_rec_off, end_rec_off; __be32 cnid; tree = fd->tree; if (!fd->bnode) { if (!tree->root) hfs_btree_inc_height(tree); node = hfs_bnode_find(tree, tree->leaf_head); if (IS_ERR(node)) return PTR_ERR(node); fd->bnode = node; fd->record = -1; } new_node = NULL; key_len = (fd->search_key->key_len | 1) + 1; again: /* new record idx and complete record size */ rec = fd->record + 1; size = key_len + entry_len; node = fd->bnode; hfs_bnode_dump(node); /* get last offset */ end_rec_off = tree->node_size - (node->num_recs + 1) * 2; end_off = hfs_bnode_read_u16(node, end_rec_off); end_rec_off -= 2; hfs_dbg(BNODE_MOD, "insert_rec: %d, %d, %d, %d\n", rec, size, end_off, end_rec_off); if (size > end_rec_off - end_off) { if (new_node) panic("not enough room!\n"); new_node = hfs_bnode_split(fd); if (IS_ERR(new_node)) return PTR_ERR(new_node); goto again; } if (node->type == HFS_NODE_LEAF) { tree->leaf_count++; mark_inode_dirty(tree->inode); } node->num_recs++; /* write new last offset */ hfs_bnode_write_u16(node, offsetof(struct hfs_bnode_desc, num_recs), node->num_recs); hfs_bnode_write_u16(node, end_rec_off, end_off + size); data_off = end_off; data_rec_off = end_rec_off + 2; idx_rec_off = tree->node_size - (rec + 1) * 2; if (idx_rec_off == data_rec_off) goto skip; /* move all following entries */ do { data_off = hfs_bnode_read_u16(node, data_rec_off + 2); hfs_bnode_write_u16(node, data_rec_off, data_off + size); data_rec_off += 2; } while (data_rec_off < idx_rec_off); /* move data away */ hfs_bnode_move(node, data_off + size, data_off, end_off - data_off); skip: hfs_bnode_write(node, fd->search_key, data_off, key_len); hfs_bnode_write(node, entry, data_off + key_len, entry_len); hfs_bnode_dump(node); /* * update parent key if we inserted a key * at the start of the node and it is not the new node */ if (!rec && new_node != node) { hfs_bnode_read_key(node, fd->search_key, data_off + size); hfs_brec_update_parent(fd); } if (new_node) { hfs_bnode_put(fd->bnode); if (!new_node->parent) { hfs_btree_inc_height(tree); new_node->parent = tree->root; } fd->bnode = hfs_bnode_find(tree, new_node->parent); /* create index data entry */ cnid = cpu_to_be32(new_node->this); entry = &cnid; entry_len = sizeof(cnid); /* get index key */ hfs_bnode_read_key(new_node, fd->search_key, 14); __hfs_brec_find(fd->bnode, fd); hfs_bnode_put(new_node); new_node = NULL; if (tree->attributes & HFS_TREE_VARIDXKEYS) key_len = fd->search_key->key_len + 1; else { fd->search_key->key_len = tree->max_key_len; key_len = tree->max_key_len + 1; } goto again; } return 0; } int hfs_brec_remove(struct hfs_find_data *fd) { struct hfs_btree *tree; struct hfs_bnode *node, *parent; int end_off, rec_off, data_off, size; tree = fd->tree; node = fd->bnode; again: rec_off = tree->node_size - (fd->record + 2) * 2; end_off = tree->node_size - (node->num_recs + 1) * 2; if (node->type == HFS_NODE_LEAF) { tree->leaf_count--; mark_inode_dirty(tree->inode); } hfs_bnode_dump(node); hfs_dbg(BNODE_MOD, "remove_rec: %d, 
%d\n", fd->record, fd->keylength + fd->entrylength); if (!--node->num_recs) { hfs_bnode_unlink(node); if (!node->parent) return 0; parent = hfs_bnode_find(tree, node->parent); if (IS_ERR(parent)) return PTR_ERR(parent); hfs_bnode_put(node); node = fd->bnode = parent; __hfs_brec_find(node, fd); goto again; } hfs_bnode_write_u16(node, offsetof(struct hfs_bnode_desc, num_recs), node->num_recs); if (rec_off == end_off) goto skip; size = fd->keylength + fd->entrylength; do { data_off = hfs_bnode_read_u16(node, rec_off); hfs_bnode_write_u16(node, rec_off + 2, data_off - size); rec_off -= 2; } while (rec_off >= end_off); /* fill hole */ hfs_bnode_move(node, fd->keyoffset, fd->keyoffset + size, data_off - fd->keyoffset - size); skip: hfs_bnode_dump(node); if (!fd->record) hfs_brec_update_parent(fd); return 0; } static struct hfs_bnode *hfs_bnode_split(struct hfs_find_data *fd) { struct hfs_btree *tree; struct hfs_bnode *node, *new_node, *next_node; struct hfs_bnode_desc node_desc; int num_recs, new_rec_off, new_off, old_rec_off; int data_start, data_end, size; tree = fd->tree; node = fd->bnode; new_node = hfs_bmap_alloc(tree); if (IS_ERR(new_node)) return new_node; hfs_bnode_get(node); hfs_dbg(BNODE_MOD, "split_nodes: %d - %d - %d\n", node->this, new_node->this, node->next); new_node->next = node->next; new_node->prev = node->this; new_node->parent = node->parent; new_node->type = node->type; new_node->height = node->height; if (node->next) next_node = hfs_bnode_find(tree, node->next); else next_node = NULL; if (IS_ERR(next_node)) { hfs_bnode_put(node); hfs_bnode_put(new_node); return next_node; } size = tree->node_size / 2 - node->num_recs * 2 - 14; old_rec_off = tree->node_size - 4; num_recs = 1; for (;;) { data_start = hfs_bnode_read_u16(node, old_rec_off); if (data_start > size) break; old_rec_off -= 2; if (++num_recs < node->num_recs) continue; /* panic? 
*/ hfs_bnode_put(node); hfs_bnode_put(new_node); if (next_node) hfs_bnode_put(next_node); return ERR_PTR(-ENOSPC); } if (fd->record + 1 < num_recs) { /* new record is in the lower half, * so leave some more space there */ old_rec_off += 2; num_recs--; data_start = hfs_bnode_read_u16(node, old_rec_off); } else { hfs_bnode_put(node); hfs_bnode_get(new_node); fd->bnode = new_node; fd->record -= num_recs; fd->keyoffset -= data_start - 14; fd->entryoffset -= data_start - 14; } new_node->num_recs = node->num_recs - num_recs; node->num_recs = num_recs; new_rec_off = tree->node_size - 2; new_off = 14; size = data_start - new_off; num_recs = new_node->num_recs; data_end = data_start; while (num_recs) { hfs_bnode_write_u16(new_node, new_rec_off, new_off); old_rec_off -= 2; new_rec_off -= 2; data_end = hfs_bnode_read_u16(node, old_rec_off); new_off = data_end - size; num_recs--; } hfs_bnode_write_u16(new_node, new_rec_off, new_off); hfs_bnode_copy(new_node, 14, node, data_start, data_end - data_start); /* update new bnode header */ node_desc.next = cpu_to_be32(new_node->next); node_desc.prev = cpu_to_be32(new_node->prev); node_desc.type = new_node->type; node_desc.height = new_node->height; node_desc.num_recs = cpu_to_be16(new_node->num_recs); node_desc.reserved = 0; hfs_bnode_write(new_node, &node_desc, 0, sizeof(node_desc)); /* update previous bnode header */ node->next = new_node->this; hfs_bnode_read(node, &node_desc, 0, sizeof(node_desc)); node_desc.next = cpu_to_be32(node->next); node_desc.num_recs = cpu_to_be16(node->num_recs); hfs_bnode_write(node, &node_desc, 0, sizeof(node_desc)); /* update next bnode header */ if (next_node) { next_node->prev = new_node->this; hfs_bnode_read(next_node, &node_desc, 0, sizeof(node_desc)); node_desc.prev = cpu_to_be32(next_node->prev); hfs_bnode_write(next_node, &node_desc, 0, sizeof(node_desc)); hfs_bnode_put(next_node); } else if (node->this == tree->leaf_tail) { /* if there is no next node, this might be the new tail */ tree->leaf_tail = new_node->this; mark_inode_dirty(tree->inode); } hfs_bnode_dump(node); hfs_bnode_dump(new_node); hfs_bnode_put(node); return new_node; } static int hfs_brec_update_parent(struct hfs_find_data *fd) { struct hfs_btree *tree; struct hfs_bnode *node, *new_node, *parent; int newkeylen, diff; int rec, rec_off, end_rec_off; int start_off, end_off; tree = fd->tree; node = fd->bnode; new_node = NULL; if (!node->parent) return 0; again: parent = hfs_bnode_find(tree, node->parent); if (IS_ERR(parent)) return PTR_ERR(parent); __hfs_brec_find(parent, fd); if (fd->record < 0) return -ENOENT; hfs_bnode_dump(parent); rec = fd->record; /* size difference between old and new key */ if (tree->attributes & HFS_TREE_VARIDXKEYS) newkeylen = (hfs_bnode_read_u8(node, 14) | 1) + 1; else fd->keylength = newkeylen = tree->max_key_len + 1; hfs_dbg(BNODE_MOD, "update_rec: %d, %d, %d\n", rec, fd->keylength, newkeylen); rec_off = tree->node_size - (rec + 2) * 2; end_rec_off = tree->node_size - (parent->num_recs + 1) * 2; diff = newkeylen - fd->keylength; if (!diff) goto skip; if (diff > 0) { end_off = hfs_bnode_read_u16(parent, end_rec_off); if (end_rec_off - end_off < diff) { printk(KERN_DEBUG "splitting index node...\n"); fd->bnode = parent; new_node = hfs_bnode_split(fd); if (IS_ERR(new_node)) return PTR_ERR(new_node); parent = fd->bnode; rec = fd->record; rec_off = tree->node_size - (rec + 2) * 2; end_rec_off = tree->node_size - (parent->num_recs + 1) * 2; } } end_off = start_off = hfs_bnode_read_u16(parent, rec_off); hfs_bnode_write_u16(parent, 
rec_off, start_off + diff); start_off -= 4; /* move previous cnid too */ while (rec_off > end_rec_off) { rec_off -= 2; end_off = hfs_bnode_read_u16(parent, rec_off); hfs_bnode_write_u16(parent, rec_off, end_off + diff); } hfs_bnode_move(parent, start_off + diff, start_off, end_off - start_off); skip: hfs_bnode_copy(parent, fd->keyoffset, node, 14, newkeylen); if (!(tree->attributes & HFS_TREE_VARIDXKEYS)) hfs_bnode_write_u8(parent, fd->keyoffset, newkeylen - 1); hfs_bnode_dump(parent); hfs_bnode_put(node); node = parent; if (new_node) { __be32 cnid; if (!new_node->parent) { hfs_btree_inc_height(tree); new_node->parent = tree->root; } fd->bnode = hfs_bnode_find(tree, new_node->parent); /* create index key and entry */ hfs_bnode_read_key(new_node, fd->search_key, 14); cnid = cpu_to_be32(new_node->this); __hfs_brec_find(fd->bnode, fd); hfs_brec_insert(fd, &cnid, sizeof(cnid)); hfs_bnode_put(fd->bnode); hfs_bnode_put(new_node); if (!rec) { if (new_node == node) goto out; /* restore search_key */ hfs_bnode_read_key(node, fd->search_key, 14); } new_node = NULL; } if (!rec && node->parent) goto again; out: fd->bnode = node; return 0; } static int hfs_btree_inc_height(struct hfs_btree *tree) { struct hfs_bnode *node, *new_node; struct hfs_bnode_desc node_desc; int key_size, rec; __be32 cnid; node = NULL; if (tree->root) { node = hfs_bnode_find(tree, tree->root); if (IS_ERR(node)) return PTR_ERR(node); } new_node = hfs_bmap_alloc(tree); if (IS_ERR(new_node)) { hfs_bnode_put(node); return PTR_ERR(new_node); } tree->root = new_node->this; if (!tree->depth) { tree->leaf_head = tree->leaf_tail = new_node->this; new_node->type = HFS_NODE_LEAF; new_node->num_recs = 0; } else { new_node->type = HFS_NODE_INDEX; new_node->num_recs = 1; } new_node->parent = 0; new_node->next = 0; new_node->prev = 0; new_node->height = ++tree->depth; node_desc.next = cpu_to_be32(new_node->next); node_desc.prev = cpu_to_be32(new_node->prev); node_desc.type = new_node->type; node_desc.height = new_node->height; node_desc.num_recs = cpu_to_be16(new_node->num_recs); node_desc.reserved = 0; hfs_bnode_write(new_node, &node_desc, 0, sizeof(node_desc)); rec = tree->node_size - 2; hfs_bnode_write_u16(new_node, rec, 14); if (node) { /* insert old root idx into new root */ node->parent = tree->root; if (node->type == HFS_NODE_LEAF || tree->attributes & HFS_TREE_VARIDXKEYS) key_size = hfs_bnode_read_u8(node, 14) + 1; else key_size = tree->max_key_len + 1; hfs_bnode_copy(new_node, 14, node, 14, key_size); if (!(tree->attributes & HFS_TREE_VARIDXKEYS)) { key_size = tree->max_key_len + 1; hfs_bnode_write_u8(new_node, 14, tree->max_key_len); } key_size = (key_size + 1) & -2; cnid = cpu_to_be32(node->this); hfs_bnode_write(new_node, &cnid, 14 + key_size, 4); rec -= 2; hfs_bnode_write_u16(new_node, rec, 14 + key_size + 4); hfs_bnode_put(node); } hfs_bnode_put(new_node); mark_inode_dirty(tree->inode); return 0; } |
| 5 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 
914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 | /* * Copyright (c) 2006-2008 Intel Corporation * Copyright (c) 2007 Dave Airlie <airlied@linux.ie> * * DRM core CRTC related functions * * Permission to use, copy, modify, distribute, and sell this software and its * documentation for any purpose is hereby granted without fee, provided that * the above copyright notice appear in all copies and that both that copyright * notice and this permission notice appear in supporting documentation, and * that the name of the copyright holders not be used in advertising or * publicity pertaining to distribution of the software without specific, * written prior permission. The copyright holders make no representations * about the suitability of this software for any purpose. It is provided "as * is" without express or implied warranty. * * THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, * INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO * EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, * DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER * TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE * OF THIS SOFTWARE. * * Authors: * Keith Packard * Eric Anholt <eric@anholt.net> * Dave Airlie <airlied@linux.ie> * Jesse Barnes <jesse.barnes@intel.com> */ #include <linux/export.h> #include <linux/kernel.h> #include <linux/moduleparam.h> #include <linux/dynamic_debug.h> #include <drm/drm_atomic.h> #include <drm/drm_atomic_helper.h> #include <drm/drm_atomic_uapi.h> #include <drm/drm_bridge.h> #include <drm/drm_crtc.h> #include <drm/drm_crtc_helper.h> #include <drm/drm_drv.h> #include <drm/drm_edid.h> #include <drm/drm_encoder.h> #include <drm/drm_fourcc.h> #include <drm/drm_framebuffer.h> #include <drm/drm_print.h> #include <drm/drm_vblank.h> #include "drm_crtc_helper_internal.h" DECLARE_DYNDBG_CLASSMAP(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0, "DRM_UT_CORE", "DRM_UT_DRIVER", "DRM_UT_KMS", "DRM_UT_PRIME", "DRM_UT_ATOMIC", "DRM_UT_VBL", "DRM_UT_STATE", "DRM_UT_LEASE", "DRM_UT_DP", "DRM_UT_DRMRES"); /** * DOC: overview * * The CRTC modeset helper library provides a default set_config implementation * in drm_crtc_helper_set_config(). Plus a few other convenience functions using * the same callbacks which drivers can use to e.g. restore the modeset * configuration on resume with drm_helper_resume_force_mode(). * * Note that this helper library doesn't track the current power state of CRTCs * and encoders. It can call callbacks like &drm_encoder_helper_funcs.dpms even * though the hardware is already in the desired state. This deficiency has been * fixed in the atomic helpers. 
* * The driver callbacks are mostly compatible with the atomic modeset helpers, * except for the handling of the primary plane: Atomic helpers require that the * primary plane is implemented as a real standalone plane and not directly tied * to the CRTC state. For easier transition this library provides functions to * implement the old semantics required by the CRTC helpers using the new plane * and atomic helper callbacks. * * Drivers are strongly urged to convert to the atomic helpers (by way of first * converting to the plane helpers). New drivers must not use these functions * but need to implement the atomic interface instead, potentially using the * atomic helpers for that. * * These legacy modeset helpers use the same function table structures as * all other modesetting helpers. See the documentation for struct * &drm_crtc_helper_funcs, &struct drm_encoder_helper_funcs and struct * &drm_connector_helper_funcs. */ /** * drm_helper_encoder_in_use - check if a given encoder is in use * @encoder: encoder to check * * Checks whether @encoder is with the current mode setting output configuration * in use by any connector. This doesn't mean that it is actually enabled since * the DPMS state is tracked separately. * * Returns: * True if @encoder is used, false otherwise. */ bool drm_helper_encoder_in_use(struct drm_encoder *encoder) { struct drm_connector *connector; struct drm_connector_list_iter conn_iter; struct drm_device *dev = encoder->dev; drm_WARN_ON(dev, drm_drv_uses_atomic_modeset(dev)); /* * We can expect this mutex to be locked if we are not panicking. * Locking is currently fubar in the panic handler. */ if (!oops_in_progress) { drm_WARN_ON(dev, !mutex_is_locked(&dev->mode_config.mutex)); drm_WARN_ON(dev, !drm_modeset_is_locked(&dev->mode_config.connection_mutex)); } drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) { if (connector->encoder == encoder) { drm_connector_list_iter_end(&conn_iter); return true; } } drm_connector_list_iter_end(&conn_iter); return false; } EXPORT_SYMBOL(drm_helper_encoder_in_use); /** * drm_helper_crtc_in_use - check if a given CRTC is in a mode_config * @crtc: CRTC to check * * Checks whether @crtc is with the current mode setting output configuration * in use by any connector. This doesn't mean that it is actually enabled since * the DPMS state is tracked separately. * * Returns: * True if @crtc is used, false otherwise. */ bool drm_helper_crtc_in_use(struct drm_crtc *crtc) { struct drm_encoder *encoder; struct drm_device *dev = crtc->dev; drm_WARN_ON(dev, drm_drv_uses_atomic_modeset(dev)); /* * We can expect this mutex to be locked if we are not panicking. * Locking is currently fubar in the panic handler. 
*/ if (!oops_in_progress) drm_WARN_ON(dev, !mutex_is_locked(&dev->mode_config.mutex)); drm_for_each_encoder(encoder, dev) if (encoder->crtc == crtc && drm_helper_encoder_in_use(encoder)) return true; return false; } EXPORT_SYMBOL(drm_helper_crtc_in_use); static void drm_encoder_disable(struct drm_encoder *encoder) { const struct drm_encoder_helper_funcs *encoder_funcs = encoder->helper_private; if (!encoder_funcs) return; if (encoder_funcs->disable) (*encoder_funcs->disable)(encoder); else if (encoder_funcs->dpms) (*encoder_funcs->dpms)(encoder, DRM_MODE_DPMS_OFF); } static void __drm_helper_disable_unused_functions(struct drm_device *dev) { struct drm_encoder *encoder; struct drm_crtc *crtc; drm_warn_on_modeset_not_all_locked(dev); drm_for_each_encoder(encoder, dev) { if (!drm_helper_encoder_in_use(encoder)) { drm_encoder_disable(encoder); /* disconnect encoder from any connector */ encoder->crtc = NULL; } } drm_for_each_crtc(crtc, dev) { const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private; crtc->enabled = drm_helper_crtc_in_use(crtc); if (!crtc->enabled) { if (crtc_funcs->disable) (*crtc_funcs->disable)(crtc); else (*crtc_funcs->dpms)(crtc, DRM_MODE_DPMS_OFF); crtc->primary->fb = NULL; } } } /** * drm_helper_disable_unused_functions - disable unused objects * @dev: DRM device * * This function walks through the entire mode setting configuration of @dev. It * will remove any CRTC links of unused encoders and encoder links of * disconnected connectors. Then it will disable all unused encoders and CRTCs * either by calling their disable callback if available or by calling their * dpms callback with DRM_MODE_DPMS_OFF. * * NOTE: * * This function is part of the legacy modeset helper library and will cause * major confusion with atomic drivers. This is because atomic helpers guarantee * to never call ->disable() hooks on a disabled function, or ->enable() hooks * on an enabled functions. drm_helper_disable_unused_functions() on the other * hand throws such guarantees into the wind and calls disable hooks * unconditionally on unused functions. */ void drm_helper_disable_unused_functions(struct drm_device *dev) { drm_WARN_ON(dev, drm_drv_uses_atomic_modeset(dev)); drm_modeset_lock_all(dev); __drm_helper_disable_unused_functions(dev); drm_modeset_unlock_all(dev); } EXPORT_SYMBOL(drm_helper_disable_unused_functions); /* * Check the CRTC we're going to map each output to vs. its current * CRTC. If they don't match, we have to disable the output and the CRTC * since the driver will have to re-route things. */ static void drm_crtc_prepare_encoders(struct drm_device *dev) { const struct drm_encoder_helper_funcs *encoder_funcs; struct drm_encoder *encoder; drm_for_each_encoder(encoder, dev) { encoder_funcs = encoder->helper_private; if (!encoder_funcs) continue; /* Disable unused encoders */ if (encoder->crtc == NULL) drm_encoder_disable(encoder); } } /** * drm_crtc_helper_set_mode - internal helper to set a mode * @crtc: CRTC to program * @mode: mode to use * @x: horizontal offset into the surface * @y: vertical offset into the surface * @old_fb: old framebuffer, for cleanup * * Try to set @mode on @crtc. Give @crtc and its associated connectors a chance * to fixup or reject the mode prior to trying to set it. This is an internal * helper that drivers could e.g. use to update properties that require the * entire output pipe to be disabled and re-enabled in a new configuration. 
For * example for changing whether audio is enabled on a hdmi link or for changing * panel fitter or dither attributes. It is also called by the * drm_crtc_helper_set_config() helper function to drive the mode setting * sequence. * * Returns: * True if the mode was set successfully, false otherwise. */ bool drm_crtc_helper_set_mode(struct drm_crtc *crtc, struct drm_display_mode *mode, int x, int y, struct drm_framebuffer *old_fb) { struct drm_device *dev = crtc->dev; struct drm_display_mode *adjusted_mode, saved_mode, saved_hwmode; const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private; const struct drm_encoder_helper_funcs *encoder_funcs; int saved_x, saved_y; bool saved_enabled; struct drm_encoder *encoder; bool ret = true; drm_WARN_ON(dev, drm_drv_uses_atomic_modeset(dev)); drm_warn_on_modeset_not_all_locked(dev); saved_enabled = crtc->enabled; crtc->enabled = drm_helper_crtc_in_use(crtc); if (!crtc->enabled) return true; adjusted_mode = drm_mode_duplicate(dev, mode); if (!adjusted_mode) { crtc->enabled = saved_enabled; return false; } drm_mode_init(&saved_mode, &crtc->mode); drm_mode_init(&saved_hwmode, &crtc->hwmode); saved_x = crtc->x; saved_y = crtc->y; /* Update crtc values up front so the driver can rely on them for mode * setting. */ drm_mode_copy(&crtc->mode, mode); crtc->x = x; crtc->y = y; /* Pass our mode to the connectors and the CRTC to give them a chance to * adjust it according to limitations or connector properties, and also * a chance to reject the mode entirely. */ drm_for_each_encoder(encoder, dev) { if (encoder->crtc != crtc) continue; encoder_funcs = encoder->helper_private; if (!encoder_funcs) continue; if (encoder_funcs->mode_fixup) { if (!(ret = encoder_funcs->mode_fixup(encoder, mode, adjusted_mode))) { drm_dbg_kms(dev, "[ENCODER:%d:%s] mode fixup failed\n", encoder->base.id, encoder->name); goto done; } } } if (crtc_funcs->mode_fixup) { if (!(ret = crtc_funcs->mode_fixup(crtc, mode, adjusted_mode))) { drm_dbg_kms(dev, "[CRTC:%d:%s] mode fixup failed\n", crtc->base.id, crtc->name); goto done; } } drm_dbg_kms(dev, "[CRTC:%d:%s]\n", crtc->base.id, crtc->name); drm_mode_copy(&crtc->hwmode, adjusted_mode); /* Prepare the encoders and CRTCs before setting the mode. */ drm_for_each_encoder(encoder, dev) { if (encoder->crtc != crtc) continue; encoder_funcs = encoder->helper_private; if (!encoder_funcs) continue; /* Disable the encoders as the first thing we do. */ if (encoder_funcs->prepare) encoder_funcs->prepare(encoder); } drm_crtc_prepare_encoders(dev); crtc_funcs->prepare(crtc); /* Set up the DPLL and any encoders state that needs to adjust or depend * on the DPLL. */ ret = !crtc_funcs->mode_set(crtc, mode, adjusted_mode, x, y, old_fb); if (!ret) goto done; drm_for_each_encoder(encoder, dev) { if (encoder->crtc != crtc) continue; encoder_funcs = encoder->helper_private; if (!encoder_funcs) continue; drm_dbg_kms(dev, "[ENCODER:%d:%s] set [MODE:%s]\n", encoder->base.id, encoder->name, mode->name); if (encoder_funcs->mode_set) encoder_funcs->mode_set(encoder, mode, adjusted_mode); } /* Now enable the clocks, plane, pipe, and connectors that we set up. */ crtc_funcs->commit(crtc); drm_for_each_encoder(encoder, dev) { if (encoder->crtc != crtc) continue; encoder_funcs = encoder->helper_private; if (!encoder_funcs) continue; if (encoder_funcs->commit) encoder_funcs->commit(encoder); } /* Calculate and store various constants which * are later needed by vblank and swap-completion * timestamping. They are derived from true hwmode. 
*/ drm_calc_timestamping_constants(crtc, &crtc->hwmode); /* FIXME: add subpixel order */ done: drm_mode_destroy(dev, adjusted_mode); if (!ret) { crtc->enabled = saved_enabled; drm_mode_copy(&crtc->mode, &saved_mode); drm_mode_copy(&crtc->hwmode, &saved_hwmode); crtc->x = saved_x; crtc->y = saved_y; } return ret; } EXPORT_SYMBOL(drm_crtc_helper_set_mode); /** * drm_crtc_helper_atomic_check() - Helper to check CRTC atomic-state * @crtc: CRTC to check * @state: atomic state object * * Provides a default CRTC-state check handler for CRTCs that only have * one primary plane attached to it. This is often the case for the CRTC * of simple framebuffers. * * RETURNS: * Zero on success, or an errno code otherwise. */ int drm_crtc_helper_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state) { struct drm_crtc_state *new_crtc_state = drm_atomic_get_new_crtc_state(state, crtc); if (!new_crtc_state->enable) return 0; return drm_atomic_helper_check_crtc_primary_plane(new_crtc_state); } EXPORT_SYMBOL(drm_crtc_helper_atomic_check); static void drm_crtc_helper_disable(struct drm_crtc *crtc) { struct drm_device *dev = crtc->dev; struct drm_connector *connector; struct drm_encoder *encoder; /* Decouple all encoders and their attached connectors from this crtc */ drm_for_each_encoder(encoder, dev) { struct drm_connector_list_iter conn_iter; if (encoder->crtc != crtc) continue; drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) { if (connector->encoder != encoder) continue; connector->encoder = NULL; /* * drm_helper_disable_unused_functions() ought to be * doing this, but since we've decoupled the encoder * from the connector above, the required connection * between them is henceforth no longer available. */ connector->dpms = DRM_MODE_DPMS_OFF; /* we keep a reference while the encoder is bound */ drm_connector_put(connector); } drm_connector_list_iter_end(&conn_iter); } __drm_helper_disable_unused_functions(dev); } /* * For connectors that support multiple encoders, either the * .atomic_best_encoder() or .best_encoder() operation must be implemented. */ struct drm_encoder * drm_connector_get_single_encoder(struct drm_connector *connector) { struct drm_encoder *encoder; drm_WARN_ON(connector->dev, hweight32(connector->possible_encoders) > 1); drm_connector_for_each_possible_encoder(connector, encoder) return encoder; return NULL; } /** * drm_crtc_helper_set_config - set a new config from userspace * @set: mode set configuration * @ctx: lock acquire context, not used here * * The drm_crtc_helper_set_config() helper function implements the of * &drm_crtc_funcs.set_config callback for drivers using the legacy CRTC * helpers. * * It first tries to locate the best encoder for each connector by calling the * connector @drm_connector_helper_funcs.best_encoder helper operation. * * After locating the appropriate encoders, the helper function will call the * mode_fixup encoder and CRTC helper operations to adjust the requested mode, * or reject it completely in which case an error will be returned to the * application. If the new configuration after mode adjustment is identical to * the current configuration the helper function will return without performing * any other operation. * * If the adjusted mode is identical to the current mode but changes to the * frame buffer need to be applied, the drm_crtc_helper_set_config() function * will call the CRTC &drm_crtc_helper_funcs.mode_set_base helper operation. 
* * If the adjusted mode differs from the current mode, or if the * ->mode_set_base() helper operation is not provided, the helper function * performs a full mode set sequence by calling the ->prepare(), ->mode_set() * and ->commit() CRTC and encoder helper operations, in that order. * Alternatively it can also use the dpms and disable helper operations. For * details see &struct drm_crtc_helper_funcs and struct * &drm_encoder_helper_funcs. * * This function is deprecated. New drivers must implement atomic modeset * support, for which this function is unsuitable. Instead drivers should use * drm_atomic_helper_set_config(). * * Returns: * Returns 0 on success, negative errno numbers on failure. */ int drm_crtc_helper_set_config(struct drm_mode_set *set, struct drm_modeset_acquire_ctx *ctx) { struct drm_device *dev; struct drm_crtc **save_encoder_crtcs, *new_crtc; struct drm_encoder **save_connector_encoders, *new_encoder, *encoder; bool mode_changed = false; /* if true do a full mode set */ bool fb_changed = false; /* if true and !mode_changed just do a flip */ struct drm_connector *connector; struct drm_connector_list_iter conn_iter; int count = 0, ro, fail = 0; const struct drm_crtc_helper_funcs *crtc_funcs; struct drm_mode_set save_set; int ret; int i; BUG_ON(!set); BUG_ON(!set->crtc); BUG_ON(!set->crtc->helper_private); /* Enforce sane interface api - has been abused by the fb helper. */ BUG_ON(!set->mode && set->fb); BUG_ON(set->fb && set->num_connectors == 0); crtc_funcs = set->crtc->helper_private; dev = set->crtc->dev; drm_dbg_kms(dev, "\n"); drm_WARN_ON(dev, drm_drv_uses_atomic_modeset(dev)); if (!set->mode) set->fb = NULL; if (set->fb) { drm_dbg_kms(dev, "[CRTC:%d:%s] [FB:%d] #connectors=%d (x y) (%i %i)\n", set->crtc->base.id, set->crtc->name, set->fb->base.id, (int)set->num_connectors, set->x, set->y); } else { drm_dbg_kms(dev, "[CRTC:%d:%s] [NOFB]\n", set->crtc->base.id, set->crtc->name); drm_crtc_helper_disable(set->crtc); return 0; } drm_warn_on_modeset_not_all_locked(dev); /* * Allocate space for the backup of all (non-pointer) encoder and * connector data. */ save_encoder_crtcs = kcalloc(dev->mode_config.num_encoder, sizeof(struct drm_crtc *), GFP_KERNEL); if (!save_encoder_crtcs) return -ENOMEM; save_connector_encoders = kcalloc(dev->mode_config.num_connector, sizeof(struct drm_encoder *), GFP_KERNEL); if (!save_connector_encoders) { kfree(save_encoder_crtcs); return -ENOMEM; } /* * Copy data. Note that driver private data is not affected. * Should anything bad happen only the expected state is * restored, not the drivers personal bookkeeping. 
*/ count = 0; drm_for_each_encoder(encoder, dev) { save_encoder_crtcs[count++] = encoder->crtc; } count = 0; drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) save_connector_encoders[count++] = connector->encoder; drm_connector_list_iter_end(&conn_iter); save_set.crtc = set->crtc; save_set.mode = &set->crtc->mode; save_set.x = set->crtc->x; save_set.y = set->crtc->y; save_set.fb = set->crtc->primary->fb; /* We should be able to check here if the fb has the same properties * and then just flip_or_move it */ if (set->crtc->primary->fb != set->fb) { /* If we have no fb then treat it as a full mode set */ if (set->crtc->primary->fb == NULL) { drm_dbg_kms(dev, "[CRTC:%d:%s] no fb, full mode set\n", set->crtc->base.id, set->crtc->name); mode_changed = true; } else if (set->fb->format != set->crtc->primary->fb->format) { mode_changed = true; } else fb_changed = true; } if (set->x != set->crtc->x || set->y != set->crtc->y) fb_changed = true; if (!drm_mode_equal(set->mode, &set->crtc->mode)) { drm_dbg_kms(dev, "[CRTC:%d:%s] modes are different, full mode set:\n", set->crtc->base.id, set->crtc->name); drm_dbg_kms(dev, DRM_MODE_FMT "\n", DRM_MODE_ARG(&set->crtc->mode)); drm_dbg_kms(dev, DRM_MODE_FMT "\n", DRM_MODE_ARG(set->mode)); mode_changed = true; } /* take a reference on all unbound connectors in set, reuse the * already taken reference for bound connectors */ for (ro = 0; ro < set->num_connectors; ro++) { if (set->connectors[ro]->encoder) continue; drm_connector_get(set->connectors[ro]); } /* a) traverse passed in connector list and get encoders for them */ count = 0; drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) { const struct drm_connector_helper_funcs *connector_funcs = connector->helper_private; new_encoder = connector->encoder; for (ro = 0; ro < set->num_connectors; ro++) { if (set->connectors[ro] == connector) { if (connector_funcs->best_encoder) new_encoder = connector_funcs->best_encoder(connector); else new_encoder = drm_connector_get_single_encoder(connector); /* if we can't get an encoder for a connector we are setting now - then fail */ if (new_encoder == NULL) /* don't break so fail path works correct */ fail = 1; if (connector->dpms != DRM_MODE_DPMS_ON) { drm_dbg_kms(dev, "[CONNECTOR:%d:%s] DPMS not on, full mode switch\n", connector->base.id, connector->name); mode_changed = true; } break; } } if (new_encoder != connector->encoder) { drm_dbg_kms(dev, "[CONNECTOR:%d:%s] encoder changed, full mode switch\n", connector->base.id, connector->name); mode_changed = true; /* If the encoder is reused for another connector, then * the appropriate crtc will be set later. 
*/ if (connector->encoder) connector->encoder->crtc = NULL; connector->encoder = new_encoder; } } drm_connector_list_iter_end(&conn_iter); if (fail) { ret = -EINVAL; goto fail; } count = 0; drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) { if (!connector->encoder) continue; if (connector->encoder->crtc == set->crtc) new_crtc = NULL; else new_crtc = connector->encoder->crtc; for (ro = 0; ro < set->num_connectors; ro++) { if (set->connectors[ro] == connector) new_crtc = set->crtc; } /* Make sure the new CRTC will work with the encoder */ if (new_crtc && !drm_encoder_crtc_ok(connector->encoder, new_crtc)) { ret = -EINVAL; drm_connector_list_iter_end(&conn_iter); goto fail; } if (new_crtc != connector->encoder->crtc) { drm_dbg_kms(dev, "[CONNECTOR:%d:%s] CRTC changed, full mode switch\n", connector->base.id, connector->name); mode_changed = true; connector->encoder->crtc = new_crtc; } if (new_crtc) { drm_dbg_kms(dev, "[CONNECTOR:%d:%s] to [CRTC:%d:%s]\n", connector->base.id, connector->name, new_crtc->base.id, new_crtc->name); } else { drm_dbg_kms(dev, "[CONNECTOR:%d:%s] to [NOCRTC]\n", connector->base.id, connector->name); } } drm_connector_list_iter_end(&conn_iter); /* mode_set_base is not a required function */ if (fb_changed && !crtc_funcs->mode_set_base) mode_changed = true; if (mode_changed) { if (drm_helper_crtc_in_use(set->crtc)) { drm_dbg_kms(dev, "[CRTC:%d:%s] attempting to set mode from userspace: " DRM_MODE_FMT "\n", set->crtc->base.id, set->crtc->name, DRM_MODE_ARG(set->mode)); set->crtc->primary->fb = set->fb; if (!drm_crtc_helper_set_mode(set->crtc, set->mode, set->x, set->y, save_set.fb)) { drm_err(dev, "[CRTC:%d:%s] failed to set mode\n", set->crtc->base.id, set->crtc->name); set->crtc->primary->fb = save_set.fb; ret = -EINVAL; goto fail; } drm_dbg_kms(dev, "[CRTC:%d:%s] Setting connector DPMS state to on\n", set->crtc->base.id, set->crtc->name); for (i = 0; i < set->num_connectors; i++) { drm_dbg_kms(dev, "\t[CONNECTOR:%d:%s] set DPMS on\n", set->connectors[i]->base.id, set->connectors[i]->name); set->connectors[i]->funcs->dpms(set->connectors[i], DRM_MODE_DPMS_ON); } } __drm_helper_disable_unused_functions(dev); } else if (fb_changed) { set->crtc->x = set->x; set->crtc->y = set->y; set->crtc->primary->fb = set->fb; ret = crtc_funcs->mode_set_base(set->crtc, set->x, set->y, save_set.fb); if (ret != 0) { set->crtc->x = save_set.x; set->crtc->y = save_set.y; set->crtc->primary->fb = save_set.fb; goto fail; } } kfree(save_connector_encoders); kfree(save_encoder_crtcs); return 0; fail: /* Restore all previous data. 
*/ count = 0; drm_for_each_encoder(encoder, dev) { encoder->crtc = save_encoder_crtcs[count++]; } count = 0; drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) connector->encoder = save_connector_encoders[count++]; drm_connector_list_iter_end(&conn_iter); /* after fail drop reference on all unbound connectors in set, let * bound connectors keep their reference */ for (ro = 0; ro < set->num_connectors; ro++) { if (set->connectors[ro]->encoder) continue; drm_connector_put(set->connectors[ro]); } /* Try to restore the config */ if (mode_changed && !drm_crtc_helper_set_mode(save_set.crtc, save_set.mode, save_set.x, save_set.y, save_set.fb)) drm_err(dev, "failed to restore config after modeset failure\n"); kfree(save_connector_encoders); kfree(save_encoder_crtcs); return ret; } EXPORT_SYMBOL(drm_crtc_helper_set_config); static int drm_helper_choose_encoder_dpms(struct drm_encoder *encoder) { int dpms = DRM_MODE_DPMS_OFF; struct drm_connector *connector; struct drm_connector_list_iter conn_iter; struct drm_device *dev = encoder->dev; drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) if (connector->encoder == encoder) if (connector->dpms < dpms) dpms = connector->dpms; drm_connector_list_iter_end(&conn_iter); return dpms; } /* Helper which handles bridge ordering around encoder dpms */ static void drm_helper_encoder_dpms(struct drm_encoder *encoder, int mode) { const struct drm_encoder_helper_funcs *encoder_funcs; encoder_funcs = encoder->helper_private; if (!encoder_funcs) return; if (encoder_funcs->dpms) encoder_funcs->dpms(encoder, mode); } static int drm_helper_choose_crtc_dpms(struct drm_crtc *crtc) { int dpms = DRM_MODE_DPMS_OFF; struct drm_connector *connector; struct drm_connector_list_iter conn_iter; struct drm_device *dev = crtc->dev; drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) if (connector->encoder && connector->encoder->crtc == crtc) if (connector->dpms < dpms) dpms = connector->dpms; drm_connector_list_iter_end(&conn_iter); return dpms; } /** * drm_helper_connector_dpms() - connector dpms helper implementation * @connector: affected connector * @mode: DPMS mode * * The drm_helper_connector_dpms() helper function implements the * &drm_connector_funcs.dpms callback for drivers using the legacy CRTC * helpers. * * This is the main helper function provided by the CRTC helper framework for * implementing the DPMS connector attribute. It computes the new desired DPMS * state for all encoders and CRTCs in the output mesh and calls the * &drm_crtc_helper_funcs.dpms and &drm_encoder_helper_funcs.dpms callbacks * provided by the driver. * * This function is deprecated. New drivers must implement atomic modeset * support, where DPMS is handled in the DRM core. * * Returns: * Always returns 0. */ int drm_helper_connector_dpms(struct drm_connector *connector, int mode) { struct drm_encoder *encoder = connector->encoder; struct drm_crtc *crtc = encoder ? 
encoder->crtc : NULL; int old_dpms, encoder_dpms = DRM_MODE_DPMS_OFF; drm_WARN_ON(connector->dev, drm_drv_uses_atomic_modeset(connector->dev)); if (mode == connector->dpms) return 0; old_dpms = connector->dpms; connector->dpms = mode; if (encoder) encoder_dpms = drm_helper_choose_encoder_dpms(encoder); /* from off to on, do crtc then encoder */ if (mode < old_dpms) { if (crtc) { const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private; if (crtc_funcs->dpms) (*crtc_funcs->dpms) (crtc, drm_helper_choose_crtc_dpms(crtc)); } if (encoder) drm_helper_encoder_dpms(encoder, encoder_dpms); } /* from on to off, do encoder then crtc */ if (mode > old_dpms) { if (encoder) drm_helper_encoder_dpms(encoder, encoder_dpms); if (crtc) { const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private; if (crtc_funcs->dpms) (*crtc_funcs->dpms) (crtc, drm_helper_choose_crtc_dpms(crtc)); } } return 0; } EXPORT_SYMBOL(drm_helper_connector_dpms); /** * drm_helper_resume_force_mode - force-restore mode setting configuration * @dev: drm_device which should be restored * * Drivers which use the mode setting helpers can use this function to * force-restore the mode setting configuration e.g. on resume or when something * else might have trampled over the hw state (like some overzealous old BIOSen * tended to do). * * This helper doesn't provide a error return value since restoring the old * config should never fail due to resource allocation issues since the driver * has successfully set the restored configuration already. Hence this should * boil down to the equivalent of a few dpms on calls, which also don't provide * an error code. * * Drivers where simply restoring an old configuration again might fail (e.g. * due to slight differences in allocating shared resources when the * configuration is restored in a different order than when userspace set it up) * need to use their own restore logic. * * This function is deprecated. New drivers should implement atomic mode- * setting and use the atomic suspend/resume helpers. * * See also: * drm_atomic_helper_suspend(), drm_atomic_helper_resume() */ void drm_helper_resume_force_mode(struct drm_device *dev) { struct drm_crtc *crtc; struct drm_encoder *encoder; const struct drm_crtc_helper_funcs *crtc_funcs; int encoder_dpms; bool ret; drm_WARN_ON(dev, drm_drv_uses_atomic_modeset(dev)); drm_modeset_lock_all(dev); drm_for_each_crtc(crtc, dev) { if (!crtc->enabled) continue; ret = drm_crtc_helper_set_mode(crtc, &crtc->mode, crtc->x, crtc->y, crtc->primary->fb); /* Restoring the old config should never fail! */ if (ret == false) drm_err(dev, "failed to set mode on crtc %p\n", crtc); /* Turn off outputs that were already powered off */ if (drm_helper_choose_crtc_dpms(crtc)) { drm_for_each_encoder(encoder, dev) { if(encoder->crtc != crtc) continue; encoder_dpms = drm_helper_choose_encoder_dpms( encoder); drm_helper_encoder_dpms(encoder, encoder_dpms); } crtc_funcs = crtc->helper_private; if (crtc_funcs->dpms) (*crtc_funcs->dpms) (crtc, drm_helper_choose_crtc_dpms(crtc)); } } /* disable the unused connectors while restoring the modesetting */ __drm_helper_disable_unused_functions(dev); drm_modeset_unlock_all(dev); } EXPORT_SYMBOL(drm_helper_resume_force_mode); /** * drm_helper_force_disable_all - Forcibly turn off all enabled CRTCs * @dev: DRM device whose CRTCs to turn off * * Drivers may want to call this on unload to ensure that all displays are * unlit and the GPU is in a consistent, low power state. Takes modeset locks. 
* * Note: This should only be used by non-atomic legacy drivers. For an atomic * version look at drm_atomic_helper_shutdown(). * * Returns: * Zero on success, error code on failure. */ int drm_helper_force_disable_all(struct drm_device *dev) { struct drm_crtc *crtc; int ret = 0; drm_modeset_lock_all(dev); drm_for_each_crtc(crtc, dev) if (crtc->enabled) { struct drm_mode_set set = { .crtc = crtc, }; ret = drm_mode_set_config_internal(&set); if (ret) goto out; } out: drm_modeset_unlock_all(dev); return ret; } EXPORT_SYMBOL(drm_helper_force_disable_all);
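For context, a legacy (non-atomic) KMS driver consumes the helpers above roughly as sketched below. This is an illustrative editorial fragment, not part of drm_crtc_helper.c; the foo_* names are hypothetical, while drm_helper_connector_dpms(), drm_helper_probe_single_connector_modes(), drm_connector_cleanup() and drm_helper_resume_force_mode() are the real helper entry points.

#include <drm/drm_connector.h>
#include <drm/drm_crtc_helper.h>
#include <drm/drm_probe_helper.h>

/* Hypothetical legacy driver glue: route the connector DPMS property
 * through the helper and force-restore the configuration on resume. */
static const struct drm_connector_funcs foo_connector_funcs = {
	.dpms		= drm_helper_connector_dpms,
	.fill_modes	= drm_helper_probe_single_connector_modes,
	.destroy	= drm_connector_cleanup,
};

static int foo_pm_resume(struct drm_device *dev)
{
	/* Re-apply the last mode setting configuration after suspend. */
	drm_helper_resume_force_mode(dev);
	return 0;
}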
// SPDX-License-Identifier: GPL-2.0 /* * FPU register's regset abstraction, for ptrace, core dumps, etc. */ #include <linux/sched/task_stack.h> #include <linux/vmalloc.h> #include <asm/fpu/api.h> #include <asm/fpu/signal.h> #include <asm/fpu/regset.h> #include <asm/prctl.h> #include "context.h" #include "internal.h" #include "legacy.h" #include "xstate.h" /* * The xstateregs_active() routine is the same as the regset_fpregs_active() routine, * as the "regset->n" for the xstate regset will be updated based on the feature * capabilities supported by the xsave. */ int regset_fpregs_active(struct task_struct *target, const struct user_regset *regset) { return regset->n; } int regset_xregset_fpregs_active(struct task_struct *target, const struct user_regset *regset) { if (boot_cpu_has(X86_FEATURE_FXSR)) return regset->n; else return 0; } /* * The regset get() functions are invoked from: * * - coredump to dump the current task's fpstate. If the current task * owns the FPU then the memory state has to be synchronized and the * FPU register state preserved. Otherwise fpstate is already in sync. * * - ptrace to dump fpstate of a stopped task, in which case the registers * have already been saved to fpstate on context switch. */ static void sync_fpstate(struct fpu *fpu) { if (fpu == x86_task_fpu(current)) fpu_sync_fpstate(fpu); } /* * Invalidate cached FPU registers before modifying the stopped target * task's fpstate. * * This forces the target task on resume to restore the FPU registers from * modified fpstate. Otherwise the task might skip the restore and operate * with the cached FPU registers which discards the modifications.
*/ static void fpu_force_restore(struct fpu *fpu) { /* * Only stopped child tasks can be used to modify the FPU * state in the fpstate buffer: */ WARN_ON_FPU(fpu == x86_task_fpu(current)); __fpu_invalidate_fpregs_state(fpu); } int xfpregs_get(struct task_struct *target, const struct user_regset *regset, struct membuf to) { struct fpu *fpu = x86_task_fpu(target); if (!cpu_feature_enabled(X86_FEATURE_FXSR)) return -ENODEV; sync_fpstate(fpu); if (!use_xsave()) { return membuf_write(&to, &fpu->fpstate->regs.fxsave, sizeof(fpu->fpstate->regs.fxsave)); } copy_xstate_to_uabi_buf(to, target, XSTATE_COPY_FX); return 0; } int xfpregs_set(struct task_struct *target, const struct user_regset *regset, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { struct fpu *fpu = x86_task_fpu(target); struct fxregs_state newstate; int ret; if (!cpu_feature_enabled(X86_FEATURE_FXSR)) return -ENODEV; /* No funny business with partial or oversized writes is permitted. */ if (pos != 0 || count != sizeof(newstate)) return -EINVAL; ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &newstate, 0, -1); if (ret) return ret; /* Do not allow an invalid MXCSR value. */ if (newstate.mxcsr & ~mxcsr_feature_mask) return -EINVAL; fpu_force_restore(fpu); /* Copy the state */ memcpy(&fpu->fpstate->regs.fxsave, &newstate, sizeof(newstate)); /* Clear xmm8..15 for 32-bit callers */ BUILD_BUG_ON(sizeof(fpu->__fpstate.regs.fxsave.xmm_space) != 16 * 16); if (in_ia32_syscall()) memset(&fpu->fpstate->regs.fxsave.xmm_space[8*4], 0, 8 * 16); /* Mark FP and SSE as in use when XSAVE is enabled */ if (use_xsave()) fpu->fpstate->regs.xsave.header.xfeatures |= XFEATURE_MASK_FPSSE; return 0; } int xstateregs_get(struct task_struct *target, const struct user_regset *regset, struct membuf to) { if (!cpu_feature_enabled(X86_FEATURE_XSAVE)) return -ENODEV; sync_fpstate(x86_task_fpu(target)); copy_xstate_to_uabi_buf(to, target, XSTATE_COPY_XSAVE); return 0; } int xstateregs_set(struct task_struct *target, const struct user_regset *regset, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { struct fpu *fpu = x86_task_fpu(target); struct xregs_state *tmpbuf = NULL; int ret; if (!cpu_feature_enabled(X86_FEATURE_XSAVE)) return -ENODEV; /* * A whole standard-format XSAVE buffer is needed: */ if (pos != 0 || count != fpu_user_cfg.max_size) return -EFAULT; if (!kbuf) { tmpbuf = vmalloc(count); if (!tmpbuf) return -ENOMEM; if (copy_from_user(tmpbuf, ubuf, count)) { ret = -EFAULT; goto out; } } fpu_force_restore(fpu); ret = copy_uabi_from_kernel_to_xstate(fpu->fpstate, kbuf ?: tmpbuf, &target->thread.pkru); out: vfree(tmpbuf); return ret; } #ifdef CONFIG_X86_USER_SHADOW_STACK int ssp_active(struct task_struct *target, const struct user_regset *regset) { if (target->thread.features & ARCH_SHSTK_SHSTK) return regset->n; return 0; } int ssp_get(struct task_struct *target, const struct user_regset *regset, struct membuf to) { struct fpu *fpu = x86_task_fpu(target); struct cet_user_state *cetregs; if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) || !ssp_active(target, regset)) return -ENODEV; sync_fpstate(fpu); cetregs = get_xsave_addr(&fpu->fpstate->regs.xsave, XFEATURE_CET_USER); if (WARN_ON(!cetregs)) { /* * This shouldn't ever be NULL because shadow stack was * verified to be enabled above. This means * MSR_IA32_U_CET.CET_SHSTK_EN should be 1 and so * XFEATURE_CET_USER should not be in the init state. 
*/ return -ENODEV; } return membuf_write(&to, (unsigned long *)&cetregs->user_ssp, sizeof(cetregs->user_ssp)); } int ssp_set(struct task_struct *target, const struct user_regset *regset, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { struct fpu *fpu = x86_task_fpu(target); struct xregs_state *xsave = &fpu->fpstate->regs.xsave; struct cet_user_state *cetregs; unsigned long user_ssp; int r; if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) || !ssp_active(target, regset)) return -ENODEV; if (pos != 0 || count != sizeof(user_ssp)) return -EINVAL; r = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &user_ssp, 0, -1); if (r) return r; /* * Some kernel instructions (IRET, etc) can cause exceptions in the case * of disallowed CET register values. Just prevent invalid values. */ if (user_ssp >= TASK_SIZE_MAX || !IS_ALIGNED(user_ssp, 8)) return -EINVAL; fpu_force_restore(fpu); cetregs = get_xsave_addr(xsave, XFEATURE_CET_USER); if (WARN_ON(!cetregs)) { /* * This shouldn't ever be NULL because shadow stack was * verified to be enabled above. This means * MSR_IA32_U_CET.CET_SHSTK_EN should be 1 and so * XFEATURE_CET_USER should not be in the init state. */ return -ENODEV; } cetregs->user_ssp = user_ssp; return 0; } #endif /* CONFIG_X86_USER_SHADOW_STACK */ #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION /* * FPU tag word conversions. */ static inline unsigned short twd_i387_to_fxsr(unsigned short twd) { unsigned int tmp; /* to avoid 16 bit prefixes in the code */ /* Transform each pair of bits into 01 (valid) or 00 (empty) */ tmp = ~twd; tmp = (tmp | (tmp>>1)) & 0x5555; /* 0V0V0V0V0V0V0V0V */ /* and move the valid bits to the lower byte. */ tmp = (tmp | (tmp >> 1)) & 0x3333; /* 00VV00VV00VV00VV */ tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* 0000VVVV0000VVVV */ tmp = (tmp | (tmp >> 4)) & 0x00ff; /* 00000000VVVVVVVV */ return tmp; } #define FPREG_ADDR(f, n) ((void *)&(f)->st_space + (n) * 16) #define FP_EXP_TAG_VALID 0 #define FP_EXP_TAG_ZERO 1 #define FP_EXP_TAG_SPECIAL 2 #define FP_EXP_TAG_EMPTY 3 static inline u32 twd_fxsr_to_i387(struct fxregs_state *fxsave) { struct _fpxreg *st; u32 tos = (fxsave->swd >> 11) & 7; u32 twd = (unsigned long) fxsave->twd; u32 tag; u32 ret = 0xffff0000u; int i; for (i = 0; i < 8; i++, twd >>= 1) { if (twd & 0x1) { st = FPREG_ADDR(fxsave, (i - tos) & 7); switch (st->exponent & 0x7fff) { case 0x7fff: tag = FP_EXP_TAG_SPECIAL; break; case 0x0000: if (!st->significand[0] && !st->significand[1] && !st->significand[2] && !st->significand[3]) tag = FP_EXP_TAG_ZERO; else tag = FP_EXP_TAG_SPECIAL; break; default: if (st->significand[3] & 0x8000) tag = FP_EXP_TAG_VALID; else tag = FP_EXP_TAG_SPECIAL; break; } } else { tag = FP_EXP_TAG_EMPTY; } ret |= tag << (2 * i); } return ret; } /* * FXSR floating point environment conversions. */ static void __convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk, struct fxregs_state *fxsave) { struct _fpreg *to = (struct _fpreg *) &env->st_space[0]; struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0]; int i; env->cwd = fxsave->cwd | 0xffff0000u; env->swd = fxsave->swd | 0xffff0000u; env->twd = twd_fxsr_to_i387(fxsave); #ifdef CONFIG_X86_64 env->fip = fxsave->rip; env->foo = fxsave->rdp; /* * should be actually ds/cs at fpu exception time, but * that information is not available in 64bit mode. 
*/ env->fcs = task_pt_regs(tsk)->cs; if (tsk == current) { savesegment(ds, env->fos); } else { env->fos = tsk->thread.ds; } env->fos |= 0xffff0000; #else env->fip = fxsave->fip; env->fcs = (u16) fxsave->fcs | ((u32) fxsave->fop << 16); env->foo = fxsave->foo; env->fos = fxsave->fos; #endif for (i = 0; i < 8; ++i) memcpy(&to[i], &from[i], sizeof(to[0])); } void convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk) { __convert_from_fxsr(env, tsk, &x86_task_fpu(tsk)->fpstate->regs.fxsave); } void convert_to_fxsr(struct fxregs_state *fxsave, const struct user_i387_ia32_struct *env) { struct _fpreg *from = (struct _fpreg *) &env->st_space[0]; struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0]; int i; fxsave->cwd = env->cwd; fxsave->swd = env->swd; fxsave->twd = twd_i387_to_fxsr(env->twd); fxsave->fop = (u16) ((u32) env->fcs >> 16); #ifdef CONFIG_X86_64 fxsave->rip = env->fip; fxsave->rdp = env->foo; /* cs and ds ignored */ #else fxsave->fip = env->fip; fxsave->fcs = (env->fcs & 0xffff); fxsave->foo = env->foo; fxsave->fos = env->fos; #endif for (i = 0; i < 8; ++i) memcpy(&to[i], &from[i], sizeof(from[0])); } int fpregs_get(struct task_struct *target, const struct user_regset *regset, struct membuf to) { struct fpu *fpu = x86_task_fpu(target); struct user_i387_ia32_struct env; struct fxregs_state fxsave, *fx; sync_fpstate(fpu); if (!cpu_feature_enabled(X86_FEATURE_FPU)) return fpregs_soft_get(target, regset, to); if (!cpu_feature_enabled(X86_FEATURE_FXSR)) { return membuf_write(&to, &fpu->fpstate->regs.fsave, sizeof(struct fregs_state)); } if (use_xsave()) { struct membuf mb = { .p = &fxsave, .left = sizeof(fxsave) }; /* Handle init state optimized xstate correctly */ copy_xstate_to_uabi_buf(mb, target, XSTATE_COPY_FP); fx = &fxsave; } else { fx = &fpu->fpstate->regs.fxsave; } __convert_from_fxsr(&env, target, fx); return membuf_write(&to, &env, sizeof(env)); } int fpregs_set(struct task_struct *target, const struct user_regset *regset, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { struct fpu *fpu = x86_task_fpu(target); struct user_i387_ia32_struct env; int ret; /* No funny business with partial or oversized writes is permitted. */ if (pos != 0 || count != sizeof(struct user_i387_ia32_struct)) return -EINVAL; if (!cpu_feature_enabled(X86_FEATURE_FPU)) return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf); ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1); if (ret) return ret; fpu_force_restore(fpu); if (cpu_feature_enabled(X86_FEATURE_FXSR)) convert_to_fxsr(&fpu->fpstate->regs.fxsave, &env); else memcpy(&fpu->fpstate->regs.fsave, &env, sizeof(env)); /* * Update the header bit in the xsave header, indicating the * presence of FP. */ if (cpu_feature_enabled(X86_FEATURE_XSAVE)) fpu->fpstate->regs.xsave.header.xfeatures |= XFEATURE_MASK_FP; return 0; } #endif /* CONFIG_X86_32 || CONFIG_IA32_EMULATION */ |
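To make the call path above concrete, the following user-space fragment is a minimal, hypothetical tracer sketch (error handling omitted); on x86, PTRACE_GETREGSET with NT_PRFPREG is serviced by xfpregs_get(), and NT_X86_XSTATE by xstateregs_get().

#include <elf.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/user.h>

/* Read the stopped tracee's FXSAVE image; this lands in xfpregs_get(). */
static long read_tracee_fpregs(pid_t pid, struct user_fpregs_struct *fx)
{
	struct iovec iov = {
		.iov_base = fx,
		.iov_len  = sizeof(*fx),
	};

	return ptrace(PTRACE_GETREGSET, pid, (void *)NT_PRFPREG, &iov);
}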
// SPDX-License-Identifier: GPL-2.0-only /* * vivid-vbi-out.c - vbi output support functions. * * Copyright 2014 Cisco Systems, Inc. and/or its affiliates. All rights reserved. */ #include <linux/errno.h> #include <linux/kernel.h> #include <linux/videodev2.h> #include <media/v4l2-common.h> #include "vivid-core.h" #include "vivid-kthread-out.h" #include "vivid-vbi-out.h" #include "vivid-vbi-cap.h" static int vbi_out_queue_setup(struct vb2_queue *vq, unsigned *nbuffers, unsigned *nplanes, unsigned sizes[], struct device *alloc_devs[]) { struct vivid_dev *dev = vb2_get_drv_priv(vq); bool is_60hz = dev->std_out & V4L2_STD_525_60; unsigned size = vq->type == V4L2_BUF_TYPE_SLICED_VBI_OUTPUT ? 36 * sizeof(struct v4l2_sliced_vbi_data) : 1440 * 2 * (is_60hz ? 12 : 18); if (!vivid_is_svid_out(dev)) return -EINVAL; if (*nplanes) return sizes[0] < size ? -EINVAL : 0; sizes[0] = size; *nplanes = 1; return 0; } static int vbi_out_buf_prepare(struct vb2_buffer *vb) { struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue); bool is_60hz = dev->std_out & V4L2_STD_525_60; unsigned size = vb->vb2_queue->type == V4L2_BUF_TYPE_SLICED_VBI_OUTPUT ? 36 * sizeof(struct v4l2_sliced_vbi_data) : 1440 * 2 * (is_60hz ? 12 : 18); dprintk(dev, 1, "%s\n", __func__); if (dev->buf_prepare_error) { /* * Error injection: test what happens if buf_prepare() returns * an error.
*/ dev->buf_prepare_error = false; return -EINVAL; } if (vb2_plane_size(vb, 0) < size) { dprintk(dev, 1, "%s data will not fit into plane (%lu < %u)\n", __func__, vb2_plane_size(vb, 0), size); return -EINVAL; } vb2_set_plane_payload(vb, 0, size); return 0; } static void vbi_out_buf_queue(struct vb2_buffer *vb) { struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue); struct vivid_buffer *buf = container_of(vbuf, struct vivid_buffer, vb); dprintk(dev, 1, "%s\n", __func__); spin_lock(&dev->slock); list_add_tail(&buf->list, &dev->vbi_out_active); spin_unlock(&dev->slock); } static int vbi_out_start_streaming(struct vb2_queue *vq, unsigned count) { struct vivid_dev *dev = vb2_get_drv_priv(vq); int err; dprintk(dev, 1, "%s\n", __func__); dev->vbi_out_seq_count = 0; if (dev->start_streaming_error) { dev->start_streaming_error = false; err = -EINVAL; } else { err = vivid_start_generating_vid_out(dev, &dev->vbi_out_streaming); } if (err) { struct vivid_buffer *buf, *tmp; list_for_each_entry_safe(buf, tmp, &dev->vbi_out_active, list) { list_del(&buf->list); vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_QUEUED); } } return err; } /* abort streaming and wait for last buffer */ static void vbi_out_stop_streaming(struct vb2_queue *vq) { struct vivid_dev *dev = vb2_get_drv_priv(vq); dprintk(dev, 1, "%s\n", __func__); vivid_stop_generating_vid_out(dev, &dev->vbi_out_streaming); dev->vbi_out_have_wss = false; dev->vbi_out_have_cc[0] = false; dev->vbi_out_have_cc[1] = false; } static void vbi_out_buf_request_complete(struct vb2_buffer *vb) { struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue); v4l2_ctrl_request_complete(vb->req_obj.req, &dev->ctrl_hdl_vbi_out); } const struct vb2_ops vivid_vbi_out_qops = { .queue_setup = vbi_out_queue_setup, .buf_prepare = vbi_out_buf_prepare, .buf_queue = vbi_out_buf_queue, .start_streaming = vbi_out_start_streaming, .stop_streaming = vbi_out_stop_streaming, .buf_request_complete = vbi_out_buf_request_complete, }; int vidioc_g_fmt_vbi_out(struct file *file, void *priv, struct v4l2_format *f) { struct vivid_dev *dev = video_drvdata(file); struct v4l2_vbi_format *vbi = &f->fmt.vbi; bool is_60hz = dev->std_out & V4L2_STD_525_60; if (!vivid_is_svid_out(dev) || !dev->has_raw_vbi_out) return -EINVAL; vbi->sampling_rate = 25000000; vbi->offset = 24; vbi->samples_per_line = 1440; vbi->sample_format = V4L2_PIX_FMT_GREY; vbi->start[0] = is_60hz ? V4L2_VBI_ITU_525_F1_START + 9 : V4L2_VBI_ITU_625_F1_START + 5; vbi->start[1] = is_60hz ? V4L2_VBI_ITU_525_F2_START + 9 : V4L2_VBI_ITU_625_F2_START + 5; vbi->count[0] = vbi->count[1] = is_60hz ? 12 : 18; vbi->flags = dev->vbi_cap_interlaced ? 
V4L2_VBI_INTERLACED : 0; vbi->reserved[0] = 0; vbi->reserved[1] = 0; return 0; } int vidioc_s_fmt_vbi_out(struct file *file, void *priv, struct v4l2_format *f) { struct vivid_dev *dev = video_drvdata(file); int ret = vidioc_g_fmt_vbi_out(file, priv, f); if (ret) return ret; if (vb2_is_busy(&dev->vb_vbi_out_q)) return -EBUSY; dev->stream_sliced_vbi_out = false; dev->vbi_out_dev.queue->type = V4L2_BUF_TYPE_VBI_OUTPUT; return 0; } int vidioc_g_fmt_sliced_vbi_out(struct file *file, void *fh, struct v4l2_format *fmt) { struct vivid_dev *dev = video_drvdata(file); struct v4l2_sliced_vbi_format *vbi = &fmt->fmt.sliced; if (!vivid_is_svid_out(dev) || !dev->has_sliced_vbi_out) return -EINVAL; vivid_fill_service_lines(vbi, dev->service_set_out); return 0; } int vidioc_try_fmt_sliced_vbi_out(struct file *file, void *fh, struct v4l2_format *fmt) { struct vivid_dev *dev = video_drvdata(file); struct v4l2_sliced_vbi_format *vbi = &fmt->fmt.sliced; bool is_60hz = dev->std_out & V4L2_STD_525_60; u32 service_set = vbi->service_set; if (!vivid_is_svid_out(dev) || !dev->has_sliced_vbi_out) return -EINVAL; service_set &= is_60hz ? V4L2_SLICED_CAPTION_525 : V4L2_SLICED_WSS_625 | V4L2_SLICED_TELETEXT_B; vivid_fill_service_lines(vbi, service_set); return 0; } int vidioc_s_fmt_sliced_vbi_out(struct file *file, void *fh, struct v4l2_format *fmt) { struct vivid_dev *dev = video_drvdata(file); struct v4l2_sliced_vbi_format *vbi = &fmt->fmt.sliced; int ret = vidioc_try_fmt_sliced_vbi_out(file, fh, fmt); if (ret) return ret; if (vb2_is_busy(&dev->vb_vbi_out_q)) return -EBUSY; dev->service_set_out = vbi->service_set; dev->stream_sliced_vbi_out = true; dev->vbi_out_dev.queue->type = V4L2_BUF_TYPE_SLICED_VBI_OUTPUT; return 0; } void vivid_sliced_vbi_out_process(struct vivid_dev *dev, struct vivid_buffer *buf) { struct v4l2_sliced_vbi_data *vbi = vb2_plane_vaddr(&buf->vb.vb2_buf, 0); unsigned elems = vb2_get_plane_payload(&buf->vb.vb2_buf, 0) / sizeof(*vbi); dev->vbi_out_have_cc[0] = false; dev->vbi_out_have_cc[1] = false; dev->vbi_out_have_wss = false; while (elems--) { switch (vbi->id) { case V4L2_SLICED_CAPTION_525: if ((dev->std_out & V4L2_STD_525_60) && vbi->line == 21) { dev->vbi_out_have_cc[!!vbi->field] = true; dev->vbi_out_cc[!!vbi->field][0] = vbi->data[0]; dev->vbi_out_cc[!!vbi->field][1] = vbi->data[1]; } break; case V4L2_SLICED_WSS_625: if ((dev->std_out & V4L2_STD_625_50) && vbi->field == 0 && vbi->line == 23) { dev->vbi_out_have_wss = true; dev->vbi_out_wss[0] = vbi->data[0]; dev->vbi_out_wss[1] = vbi->data[1]; } break; } vbi++; } } |
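As a rough illustration of how the output format callbacks above are exercised, a user-space application would select sliced WSS output roughly as follows. This is a hypothetical sketch against the standard V4L2 ioctl interface, not vivid-specific code.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* Ask the driver for sliced WSS on the VBI output; on a 50 Hz standard
 * this ends up in vidioc_s_fmt_sliced_vbi_out(), which clamps
 * service_set to what the current output standard supports. */
static int select_sliced_wss_out(int fd)
{
	struct v4l2_format fmt;

	memset(&fmt, 0, sizeof(fmt));
	fmt.type = V4L2_BUF_TYPE_SLICED_VBI_OUTPUT;
	fmt.fmt.sliced.service_set = V4L2_SLICED_WSS_625;

	return ioctl(fd, VIDIOC_S_FMT, &fmt);
}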
// SPDX-License-Identifier: GPL-2.0-only /* * (C) 1999-2001 Paul `Rusty' Russell * (C) 2002-2006 Netfilter Core Team <coreteam@netfilter.org> * (C) 2011 Patrick McHardy <kaber@trash.net> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/module.h> #include <linux/skbuff.h> #include <linux/netfilter.h> #include <linux/netfilter/x_tables.h> #include <net/netfilter/nf_nat.h> static int xt_nat_checkentry_v0(const struct xt_tgchk_param *par) { const struct nf_nat_ipv4_multi_range_compat *mr = par->targinfo; if (mr->rangesize != 1) { pr_info_ratelimited("multiple ranges no longer supported\n"); return -EINVAL; } return nf_ct_netns_get(par->net, par->family); } static int xt_nat_checkentry(const struct xt_tgchk_param *par) { return nf_ct_netns_get(par->net, par->family); } static void xt_nat_destroy(const struct xt_tgdtor_param *par) { nf_ct_netns_put(par->net, par->family); } static void xt_nat_convert_range(struct nf_nat_range2 *dst, const struct nf_nat_ipv4_range *src) { memset(&dst->min_addr, 0, sizeof(dst->min_addr)); memset(&dst->max_addr, 0, sizeof(dst->max_addr)); memset(&dst->base_proto, 0, sizeof(dst->base_proto)); dst->flags = src->flags; dst->min_addr.ip = src->min_ip; dst->max_addr.ip = src->max_ip; dst->min_proto = src->min; dst->max_proto = src->max; } static unsigned int xt_snat_target_v0(struct sk_buff *skb, const struct xt_action_param *par) { const struct nf_nat_ipv4_multi_range_compat *mr = par->targinfo; struct nf_nat_range2 range; enum ip_conntrack_info ctinfo; struct nf_conn *ct; ct = nf_ct_get(skb, &ctinfo); WARN_ON(!(ct != NULL && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED || ctinfo == IP_CT_RELATED_REPLY))); xt_nat_convert_range(&range, &mr->range[0]); return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_SRC); } static unsigned int xt_dnat_target_v0(struct sk_buff *skb, const struct xt_action_param *par) { const struct nf_nat_ipv4_multi_range_compat *mr = par->targinfo; struct nf_nat_range2 range; enum ip_conntrack_info ctinfo; struct nf_conn *ct; ct = nf_ct_get(skb, &ctinfo); WARN_ON(!(ct != NULL && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED))); xt_nat_convert_range(&range, &mr->range[0]); return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_DST); } static unsigned int xt_snat_target_v1(struct sk_buff *skb, const struct xt_action_param *par) { const struct nf_nat_range *range_v1 = par->targinfo; struct nf_nat_range2 range; enum ip_conntrack_info ctinfo; struct nf_conn *ct; ct = nf_ct_get(skb, &ctinfo); WARN_ON(!(ct != NULL && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED || ctinfo == IP_CT_RELATED_REPLY))); memcpy(&range, range_v1, sizeof(*range_v1));
memset(&range.base_proto, 0, sizeof(range.base_proto)); return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_SRC); } static unsigned int xt_dnat_target_v1(struct sk_buff *skb, const struct xt_action_param *par) { const struct nf_nat_range *range_v1 = par->targinfo; struct nf_nat_range2 range; enum ip_conntrack_info ctinfo; struct nf_conn *ct; ct = nf_ct_get(skb, &ctinfo); WARN_ON(!(ct != NULL && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED))); memcpy(&range, range_v1, sizeof(*range_v1)); memset(&range.base_proto, 0, sizeof(range.base_proto)); return nf_nat_setup_info(ct, &range, NF_NAT_MANIP_DST); } static unsigned int xt_snat_target_v2(struct sk_buff *skb, const struct xt_action_param *par) { const struct nf_nat_range2 *range = par->targinfo; enum ip_conntrack_info ctinfo; struct nf_conn *ct; ct = nf_ct_get(skb, &ctinfo); WARN_ON(!(ct != NULL && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED || ctinfo == IP_CT_RELATED_REPLY))); return nf_nat_setup_info(ct, range, NF_NAT_MANIP_SRC); } static unsigned int xt_dnat_target_v2(struct sk_buff *skb, const struct xt_action_param *par) { const struct nf_nat_range2 *range = par->targinfo; enum ip_conntrack_info ctinfo; struct nf_conn *ct; ct = nf_ct_get(skb, &ctinfo); WARN_ON(!(ct != NULL && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED))); return nf_nat_setup_info(ct, range, NF_NAT_MANIP_DST); } static struct xt_target xt_nat_target_reg[] __read_mostly = { { .name = "SNAT", .revision = 0, .checkentry = xt_nat_checkentry_v0, .destroy = xt_nat_destroy, .target = xt_snat_target_v0, .targetsize = sizeof(struct nf_nat_ipv4_multi_range_compat), .family = NFPROTO_IPV4, .table = "nat", .hooks = (1 << NF_INET_POST_ROUTING) | (1 << NF_INET_LOCAL_IN), .me = THIS_MODULE, }, { .name = "DNAT", .revision = 0, .checkentry = xt_nat_checkentry_v0, .destroy = xt_nat_destroy, .target = xt_dnat_target_v0, .targetsize = sizeof(struct nf_nat_ipv4_multi_range_compat), .family = NFPROTO_IPV4, .table = "nat", .hooks = (1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_OUT), .me = THIS_MODULE, }, { .name = "SNAT", .revision = 1, .checkentry = xt_nat_checkentry, .destroy = xt_nat_destroy, .target = xt_snat_target_v1, .targetsize = sizeof(struct nf_nat_range), .table = "nat", .hooks = (1 << NF_INET_POST_ROUTING) | (1 << NF_INET_LOCAL_IN), .me = THIS_MODULE, }, { .name = "DNAT", .revision = 1, .checkentry = xt_nat_checkentry, .destroy = xt_nat_destroy, .target = xt_dnat_target_v1, .targetsize = sizeof(struct nf_nat_range), .table = "nat", .hooks = (1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_OUT), .me = THIS_MODULE, }, { .name = "SNAT", .revision = 2, .checkentry = xt_nat_checkentry, .destroy = xt_nat_destroy, .target = xt_snat_target_v2, .targetsize = sizeof(struct nf_nat_range2), .table = "nat", .hooks = (1 << NF_INET_POST_ROUTING) | (1 << NF_INET_LOCAL_IN), .me = THIS_MODULE, }, { .name = "DNAT", .revision = 2, .checkentry = xt_nat_checkentry, .destroy = xt_nat_destroy, .target = xt_dnat_target_v2, .targetsize = sizeof(struct nf_nat_range2), .table = "nat", .hooks = (1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_OUT), .me = THIS_MODULE, }, }; static int __init xt_nat_init(void) { return xt_register_targets(xt_nat_target_reg, ARRAY_SIZE(xt_nat_target_reg)); } static void __exit xt_nat_exit(void) { xt_unregister_targets(xt_nat_target_reg, ARRAY_SIZE(xt_nat_target_reg)); } module_init(xt_nat_init); module_exit(xt_nat_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>"); MODULE_ALIAS("ipt_SNAT"); MODULE_ALIAS("ipt_DNAT"); 
MODULE_ALIAS("ip6t_SNAT"); MODULE_ALIAS("ip6t_DNAT"); MODULE_DESCRIPTION("SNAT and DNAT targets support");
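For orientation, the data the revision-0 targets receive looks roughly like the fragment below. This is a hypothetical illustration of the uapi structures that an iptables rule such as "-j SNAT --to-source 192.0.2.1" would populate; it is not code from this module.

#include <string.h>
#include <arpa/inet.h>
#include <linux/netfilter/nf_nat.h>

/* Fill a single-range revision-0 SNAT targinfo mapping to 192.0.2.1.
 * xt_nat_checkentry_v0() above rejects rangesize != 1, and
 * xt_nat_convert_range() widens this into struct nf_nat_range2. */
static void fill_snat_targinfo_v0(struct nf_nat_ipv4_multi_range_compat *mr)
{
	memset(mr, 0, sizeof(*mr));
	mr->rangesize = 1;
	mr->range[0].flags = NF_NAT_RANGE_MAP_IPS;
	mr->range[0].min_ip = inet_addr("192.0.2.1");
	mr->range[0].max_ip = mr->range[0].min_ip;
}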
// SPDX-License-Identifier: GPL-2.0-only /* * Copyright (c) 2013 Nicira, Inc.
*/ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/capability.h> #include <linux/module.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/slab.h> #include <linux/uaccess.h> #include <linux/skbuff.h> #include <linux/netdevice.h> #include <linux/in.h> #include <linux/tcp.h> #include <linux/udp.h> #include <linux/if_arp.h> #include <linux/init.h> #include <linux/in6.h> #include <linux/inetdevice.h> #include <linux/igmp.h> #include <linux/netfilter_ipv4.h> #include <linux/etherdevice.h> #include <linux/if_ether.h> #include <linux/if_vlan.h> #include <linux/rculist.h> #include <linux/err.h> #include <net/sock.h> #include <net/ip.h> #include <net/icmp.h> #include <net/protocol.h> #include <net/ip_tunnels.h> #include <net/arp.h> #include <net/checksum.h> #include <net/dsfield.h> #include <net/inet_ecn.h> #include <net/xfrm.h> #include <net/net_namespace.h> #include <net/netns/generic.h> #include <net/netdev_lock.h> #include <net/rtnetlink.h> #include <net/udp.h> #include <net/dst_metadata.h> #include <net/inet_dscp.h> #if IS_ENABLED(CONFIG_IPV6) #include <net/ipv6.h> #include <net/ip6_fib.h> #include <net/ip6_route.h> #endif static unsigned int ip_tunnel_hash(__be32 key, __be32 remote) { return hash_32((__force u32)key ^ (__force u32)remote, IP_TNL_HASH_BITS); } static bool ip_tunnel_key_match(const struct ip_tunnel_parm_kern *p, const unsigned long *flags, __be32 key) { if (!test_bit(IP_TUNNEL_KEY_BIT, flags)) return !test_bit(IP_TUNNEL_KEY_BIT, p->i_flags); return test_bit(IP_TUNNEL_KEY_BIT, p->i_flags) && p->i_key == key; } /* Fallback tunnel: no source, no destination, no key, no options Tunnel hash table: We require exact key match i.e. if a key is present in packet it will match only tunnel with the same key; if it is not present, it will match only keyless tunnel. All keysless packets, if not matched configured keyless tunnels will match fallback tunnel. Given src, dst and key, find appropriate for input tunnel. 
*/ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, int link, const unsigned long *flags, __be32 remote, __be32 local, __be32 key) { struct ip_tunnel *t, *cand = NULL; struct hlist_head *head; struct net_device *ndev; unsigned int hash; hash = ip_tunnel_hash(key, remote); head = &itn->tunnels[hash]; hlist_for_each_entry_rcu(t, head, hash_node) { if (local != t->parms.iph.saddr || remote != t->parms.iph.daddr || !(t->dev->flags & IFF_UP)) continue; if (!ip_tunnel_key_match(&t->parms, flags, key)) continue; if (READ_ONCE(t->parms.link) == link) return t; cand = t; } hlist_for_each_entry_rcu(t, head, hash_node) { if (remote != t->parms.iph.daddr || t->parms.iph.saddr != 0 || !(t->dev->flags & IFF_UP)) continue; if (!ip_tunnel_key_match(&t->parms, flags, key)) continue; if (READ_ONCE(t->parms.link) == link) return t; if (!cand) cand = t; } hash = ip_tunnel_hash(key, 0); head = &itn->tunnels[hash]; hlist_for_each_entry_rcu(t, head, hash_node) { if ((local != t->parms.iph.saddr || t->parms.iph.daddr != 0) && (local != t->parms.iph.daddr || !ipv4_is_multicast(local))) continue; if (!(t->dev->flags & IFF_UP)) continue; if (!ip_tunnel_key_match(&t->parms, flags, key)) continue; if (READ_ONCE(t->parms.link) == link) return t; if (!cand) cand = t; } hlist_for_each_entry_rcu(t, head, hash_node) { if ((!test_bit(IP_TUNNEL_NO_KEY_BIT, flags) && t->parms.i_key != key) || t->parms.iph.saddr != 0 || t->parms.iph.daddr != 0 || !(t->dev->flags & IFF_UP)) continue; if (READ_ONCE(t->parms.link) == link) return t; if (!cand) cand = t; } if (cand) return cand; t = rcu_dereference(itn->collect_md_tun); if (t && t->dev->flags & IFF_UP) return t; ndev = READ_ONCE(itn->fb_tunnel_dev); if (ndev && ndev->flags & IFF_UP) return netdev_priv(ndev); return NULL; } EXPORT_SYMBOL_GPL(ip_tunnel_lookup); static struct hlist_head *ip_bucket(struct ip_tunnel_net *itn, struct ip_tunnel_parm_kern *parms) { unsigned int h; __be32 remote; __be32 i_key = parms->i_key; if (parms->iph.daddr && !ipv4_is_multicast(parms->iph.daddr)) remote = parms->iph.daddr; else remote = 0; if (!test_bit(IP_TUNNEL_KEY_BIT, parms->i_flags) && test_bit(IP_TUNNEL_VTI_BIT, parms->i_flags)) i_key = 0; h = ip_tunnel_hash(i_key, remote); return &itn->tunnels[h]; } static void ip_tunnel_add(struct ip_tunnel_net *itn, struct ip_tunnel *t) { struct hlist_head *head = ip_bucket(itn, &t->parms); if (t->collect_md) rcu_assign_pointer(itn->collect_md_tun, t); hlist_add_head_rcu(&t->hash_node, head); } static void ip_tunnel_del(struct ip_tunnel_net *itn, struct ip_tunnel *t) { if (t->collect_md) rcu_assign_pointer(itn->collect_md_tun, NULL); hlist_del_init_rcu(&t->hash_node); } static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn, struct ip_tunnel_parm_kern *parms, int type) { __be32 remote = parms->iph.daddr; __be32 local = parms->iph.saddr; IP_TUNNEL_DECLARE_FLAGS(flags); __be32 key = parms->i_key; int link = parms->link; struct ip_tunnel *t = NULL; struct hlist_head *head = ip_bucket(itn, parms); ip_tunnel_flags_copy(flags, parms->i_flags); hlist_for_each_entry_rcu(t, head, hash_node, lockdep_rtnl_is_held()) { if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr && link == READ_ONCE(t->parms.link) && type == t->dev->type && ip_tunnel_key_match(&t->parms, flags, key)) break; } return t; } static struct net_device *__ip_tunnel_create(struct net *net, const struct rtnl_link_ops *ops, struct ip_tunnel_parm_kern *parms) { int err; struct ip_tunnel *tunnel; struct net_device *dev; char name[IFNAMSIZ]; err = -E2BIG; if 
(parms->name[0]) { if (!dev_valid_name(parms->name)) goto failed; strscpy(name, parms->name); } else { if (strlen(ops->kind) > (IFNAMSIZ - 3)) goto failed; strscpy(name, ops->kind); strcat(name, "%d"); } ASSERT_RTNL(); dev = alloc_netdev(ops->priv_size, name, NET_NAME_UNKNOWN, ops->setup); if (!dev) { err = -ENOMEM; goto failed; } dev_net_set(dev, net); dev->rtnl_link_ops = ops; tunnel = netdev_priv(dev); tunnel->parms = *parms; tunnel->net = net; err = register_netdevice(dev); if (err) goto failed_free; return dev; failed_free: free_netdev(dev); failed: return ERR_PTR(err); } static int ip_tunnel_bind_dev(struct net_device *dev) { struct net_device *tdev = NULL; struct ip_tunnel *tunnel = netdev_priv(dev); const struct iphdr *iph; int hlen = LL_MAX_HEADER; int mtu = ETH_DATA_LEN; int t_hlen = tunnel->hlen + sizeof(struct iphdr); iph = &tunnel->parms.iph; /* Guess output device to choose reasonable mtu and needed_headroom */ if (iph->daddr) { struct flowi4 fl4; struct rtable *rt; ip_tunnel_init_flow(&fl4, iph->protocol, iph->daddr, iph->saddr, tunnel->parms.o_key, iph->tos & INET_DSCP_MASK, tunnel->net, tunnel->parms.link, tunnel->fwmark, 0, 0); rt = ip_route_output_key(tunnel->net, &fl4); if (!IS_ERR(rt)) { tdev = rt->dst.dev; ip_rt_put(rt); } if (dev->type != ARPHRD_ETHER) dev->flags |= IFF_POINTOPOINT; dst_cache_reset(&tunnel->dst_cache); } if (!tdev && tunnel->parms.link) tdev = __dev_get_by_index(tunnel->net, tunnel->parms.link); if (tdev) { hlen = tdev->hard_header_len + tdev->needed_headroom; mtu = min(tdev->mtu, IP_MAX_MTU); } dev->needed_headroom = t_hlen + hlen; mtu -= t_hlen + (dev->type == ARPHRD_ETHER ? dev->hard_header_len : 0); if (mtu < IPV4_MIN_MTU) mtu = IPV4_MIN_MTU; return mtu; } static struct ip_tunnel *ip_tunnel_create(struct net *net, struct ip_tunnel_net *itn, struct ip_tunnel_parm_kern *parms) { struct ip_tunnel *nt; struct net_device *dev; int t_hlen; int mtu; int err; dev = __ip_tunnel_create(net, itn->rtnl_link_ops, parms); if (IS_ERR(dev)) return ERR_CAST(dev); mtu = ip_tunnel_bind_dev(dev); err = dev_set_mtu(dev, mtu); if (err) goto err_dev_set_mtu; nt = netdev_priv(dev); t_hlen = nt->hlen + sizeof(struct iphdr); dev->min_mtu = ETH_MIN_MTU; dev->max_mtu = IP_MAX_MTU - t_hlen; if (dev->type == ARPHRD_ETHER) dev->max_mtu -= dev->hard_header_len; ip_tunnel_add(itn, nt); return nt; err_dev_set_mtu: unregister_netdevice(dev); return ERR_PTR(err); } void ip_tunnel_md_udp_encap(struct sk_buff *skb, struct ip_tunnel_info *info) { const struct iphdr *iph = ip_hdr(skb); const struct udphdr *udph; if (iph->protocol != IPPROTO_UDP) return; udph = (struct udphdr *)((__u8 *)iph + (iph->ihl << 2)); info->encap.sport = udph->source; info->encap.dport = udph->dest; } EXPORT_SYMBOL(ip_tunnel_md_udp_encap); int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, const struct tnl_ptk_info *tpi, struct metadata_dst *tun_dst, bool log_ecn_error) { const struct iphdr *iph = ip_hdr(skb); int nh, err; #ifdef CONFIG_NET_IPGRE_BROADCAST if (ipv4_is_multicast(iph->daddr)) { DEV_STATS_INC(tunnel->dev, multicast); skb->pkt_type = PACKET_BROADCAST; } #endif if (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.i_flags) != test_bit(IP_TUNNEL_CSUM_BIT, tpi->flags)) { DEV_STATS_INC(tunnel->dev, rx_crc_errors); DEV_STATS_INC(tunnel->dev, rx_errors); goto drop; } if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.i_flags)) { if (!test_bit(IP_TUNNEL_SEQ_BIT, tpi->flags) || (tunnel->i_seqno && (s32)(ntohl(tpi->seq) - tunnel->i_seqno) < 0)) { DEV_STATS_INC(tunnel->dev, rx_fifo_errors); 
DEV_STATS_INC(tunnel->dev, rx_errors); goto drop; } tunnel->i_seqno = ntohl(tpi->seq) + 1; } /* Save offset of outer header relative to skb->head, * because we are going to reset the network header to the inner header * and might change skb->head. */ nh = skb_network_header(skb) - skb->head; skb_set_network_header(skb, (tunnel->dev->type == ARPHRD_ETHER) ? ETH_HLEN : 0); if (!pskb_inet_may_pull(skb)) { DEV_STATS_INC(tunnel->dev, rx_length_errors); DEV_STATS_INC(tunnel->dev, rx_errors); goto drop; } iph = (struct iphdr *)(skb->head + nh); err = IP_ECN_decapsulate(iph, skb); if (unlikely(err)) { if (log_ecn_error) net_info_ratelimited("non-ECT from %pI4 with TOS=%#x\n", &iph->saddr, iph->tos); if (err > 1) { DEV_STATS_INC(tunnel->dev, rx_frame_errors); DEV_STATS_INC(tunnel->dev, rx_errors); goto drop; } } dev_sw_netstats_rx_add(tunnel->dev, skb->len); skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(tunnel->dev))); if (tunnel->dev->type == ARPHRD_ETHER) { skb->protocol = eth_type_trans(skb, tunnel->dev); skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN); } else { skb->dev = tunnel->dev; } if (tun_dst) skb_dst_set(skb, (struct dst_entry *)tun_dst); gro_cells_receive(&tunnel->gro_cells, skb); return 0; drop: if (tun_dst) dst_release((struct dst_entry *)tun_dst); kfree_skb(skb); return 0; } EXPORT_SYMBOL_GPL(ip_tunnel_rcv); int ip_tunnel_encap_add_ops(const struct ip_tunnel_encap_ops *ops, unsigned int num) { if (num >= MAX_IPTUN_ENCAP_OPS) return -ERANGE; return !cmpxchg((const struct ip_tunnel_encap_ops **) &iptun_encaps[num], NULL, ops) ? 0 : -1; } EXPORT_SYMBOL(ip_tunnel_encap_add_ops); int ip_tunnel_encap_del_ops(const struct ip_tunnel_encap_ops *ops, unsigned int num) { int ret; if (num >= MAX_IPTUN_ENCAP_OPS) return -ERANGE; ret = (cmpxchg((const struct ip_tunnel_encap_ops **) &iptun_encaps[num], ops, NULL) == ops) ? 0 : -1; synchronize_net(); return ret; } EXPORT_SYMBOL(ip_tunnel_encap_del_ops); int ip_tunnel_encap_setup(struct ip_tunnel *t, struct ip_tunnel_encap *ipencap) { int hlen; memset(&t->encap, 0, sizeof(t->encap)); hlen = ip_encap_hlen(ipencap); if (hlen < 0) return hlen; t->encap.type = ipencap->type; t->encap.sport = ipencap->sport; t->encap.dport = ipencap->dport; t->encap.flags = ipencap->flags; t->encap_hlen = hlen; t->hlen = t->encap_hlen + t->tun_hlen; return 0; } EXPORT_SYMBOL_GPL(ip_tunnel_encap_setup); static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb, struct rtable *rt, __be16 df, const struct iphdr *inner_iph, int tunnel_hlen, __be32 dst, bool md) { struct ip_tunnel *tunnel = netdev_priv(dev); int pkt_size; int mtu; tunnel_hlen = md ? tunnel_hlen : tunnel->hlen; pkt_size = skb->len - tunnel_hlen; pkt_size -= dev->type == ARPHRD_ETHER ? dev->hard_header_len : 0; if (df) { mtu = dst_mtu(&rt->dst) - (sizeof(struct iphdr) + tunnel_hlen); mtu -= dev->type == ARPHRD_ETHER ? dev->hard_header_len : 0; } else { mtu = skb_valid_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu; } if (skb_valid_dst(skb)) skb_dst_update_pmtu_no_confirm(skb, mtu); if (skb->protocol == htons(ETH_P_IP)) { if (!skb_is_gso(skb) && (inner_iph->frag_off & htons(IP_DF)) && mtu < pkt_size) { icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); return -E2BIG; } } #if IS_ENABLED(CONFIG_IPV6) else if (skb->protocol == htons(ETH_P_IPV6)) { struct rt6_info *rt6; __be32 daddr; rt6 = skb_valid_dst(skb) ? dst_rt6_info(skb_dst(skb)) : NULL; daddr = md ? 
dst : tunnel->parms.iph.daddr; if (rt6 && mtu < dst_mtu(skb_dst(skb)) && mtu >= IPV6_MIN_MTU) { if ((daddr && !ipv4_is_multicast(daddr)) || rt6->rt6i_dst.plen == 128) { rt6->rt6i_flags |= RTF_MODIFIED; dst_metric_set(skb_dst(skb), RTAX_MTU, mtu); } } if (!skb_is_gso(skb) && mtu >= IPV6_MIN_MTU && mtu < pkt_size) { icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); return -E2BIG; } } #endif return 0; } static void ip_tunnel_adj_headroom(struct net_device *dev, unsigned int headroom) { /* we must cap headroom to some upperlimit, else pskb_expand_head * will overflow header offsets in skb_headers_offset_update(). */ static const unsigned int max_allowed = 512; if (headroom > max_allowed) headroom = max_allowed; if (headroom > READ_ONCE(dev->needed_headroom)) WRITE_ONCE(dev->needed_headroom, headroom); } void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, u8 proto, int tunnel_hlen) { struct ip_tunnel *tunnel = netdev_priv(dev); u32 headroom = sizeof(struct iphdr); struct ip_tunnel_info *tun_info; const struct ip_tunnel_key *key; const struct iphdr *inner_iph; struct rtable *rt = NULL; struct flowi4 fl4; __be16 df = 0; u8 tos, ttl; bool use_cache; tun_info = skb_tunnel_info(skb); if (unlikely(!tun_info || !(tun_info->mode & IP_TUNNEL_INFO_TX) || ip_tunnel_info_af(tun_info) != AF_INET)) goto tx_error; key = &tun_info->key; memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); inner_iph = (const struct iphdr *)skb_inner_network_header(skb); tos = key->tos; if (tos == 1) { if (skb->protocol == htons(ETH_P_IP)) tos = inner_iph->tos; else if (skb->protocol == htons(ETH_P_IPV6)) tos = ipv6_get_dsfield((const struct ipv6hdr *)inner_iph); } ip_tunnel_init_flow(&fl4, proto, key->u.ipv4.dst, key->u.ipv4.src, tunnel_id_to_key32(key->tun_id), tos & INET_DSCP_MASK, tunnel->net, 0, skb->mark, skb_get_hash(skb), key->flow_flags); if (!tunnel_hlen) tunnel_hlen = ip_encap_hlen(&tun_info->encap); if (ip_tunnel_encap(skb, &tun_info->encap, &proto, &fl4) < 0) goto tx_error; use_cache = ip_tunnel_dst_cache_usable(skb, tun_info); if (use_cache) rt = dst_cache_get_ip4(&tun_info->dst_cache, &fl4.saddr); if (!rt) { rt = ip_route_output_key(tunnel->net, &fl4); if (IS_ERR(rt)) { DEV_STATS_INC(dev, tx_carrier_errors); goto tx_error; } if (use_cache) dst_cache_set_ip4(&tun_info->dst_cache, &rt->dst, fl4.saddr); } if (rt->dst.dev == dev) { ip_rt_put(rt); DEV_STATS_INC(dev, collisions); goto tx_error; } if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags)) df = htons(IP_DF); if (tnl_update_pmtu(dev, skb, rt, df, inner_iph, tunnel_hlen, key->u.ipv4.dst, true)) { ip_rt_put(rt); goto tx_error; } tos = ip_tunnel_ecn_encap(tos, inner_iph, skb); ttl = key->ttl; if (ttl == 0) { if (skb->protocol == htons(ETH_P_IP)) ttl = inner_iph->ttl; else if (skb->protocol == htons(ETH_P_IPV6)) ttl = ((const struct ipv6hdr *)inner_iph)->hop_limit; else ttl = ip4_dst_hoplimit(&rt->dst); } headroom += LL_RESERVED_SPACE(rt->dst.dev) + rt->dst.header_len; if (skb_cow_head(skb, headroom)) { ip_rt_put(rt); goto tx_dropped; } ip_tunnel_adj_headroom(dev, headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, proto, tos, ttl, df, !net_eq(tunnel->net, dev_net(dev))); return; tx_error: DEV_STATS_INC(dev, tx_errors); goto kfree; tx_dropped: DEV_STATS_INC(dev, tx_dropped); kfree: kfree_skb(skb); } EXPORT_SYMBOL_GPL(ip_md_tunnel_xmit); void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, const struct iphdr *tnl_params, u8 protocol) { struct ip_tunnel *tunnel = netdev_priv(dev); struct ip_tunnel_info *tun_info = NULL; 
const struct iphdr *inner_iph; unsigned int max_headroom; /* The extra header space needed */ struct rtable *rt = NULL; /* Route to the other host */ __be16 payload_protocol; bool use_cache = false; struct flowi4 fl4; bool md = false; bool connected; u8 tos, ttl; __be32 dst; __be16 df; inner_iph = (const struct iphdr *)skb_inner_network_header(skb); connected = (tunnel->parms.iph.daddr != 0); payload_protocol = skb_protocol(skb, true); memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); dst = tnl_params->daddr; if (dst == 0) { /* NBMA tunnel */ if (!skb_dst(skb)) { DEV_STATS_INC(dev, tx_fifo_errors); goto tx_error; } tun_info = skb_tunnel_info(skb); if (tun_info && (tun_info->mode & IP_TUNNEL_INFO_TX) && ip_tunnel_info_af(tun_info) == AF_INET && tun_info->key.u.ipv4.dst) { dst = tun_info->key.u.ipv4.dst; md = true; connected = true; } else if (payload_protocol == htons(ETH_P_IP)) { rt = skb_rtable(skb); dst = rt_nexthop(rt, inner_iph->daddr); } #if IS_ENABLED(CONFIG_IPV6) else if (payload_protocol == htons(ETH_P_IPV6)) { const struct in6_addr *addr6; struct neighbour *neigh; bool do_tx_error_icmp; int addr_type; neigh = dst_neigh_lookup(skb_dst(skb), &ipv6_hdr(skb)->daddr); if (!neigh) goto tx_error; addr6 = (const struct in6_addr *)&neigh->primary_key; addr_type = ipv6_addr_type(addr6); if (addr_type == IPV6_ADDR_ANY) { addr6 = &ipv6_hdr(skb)->daddr; addr_type = ipv6_addr_type(addr6); } if ((addr_type & IPV6_ADDR_COMPATv4) == 0) do_tx_error_icmp = true; else { do_tx_error_icmp = false; dst = addr6->s6_addr32[3]; } neigh_release(neigh); if (do_tx_error_icmp) goto tx_error_icmp; } #endif else goto tx_error; if (!md) connected = false; } tos = tnl_params->tos; if (tos & 0x1) { tos &= ~0x1; if (payload_protocol == htons(ETH_P_IP)) { tos = inner_iph->tos; connected = false; } else if (payload_protocol == htons(ETH_P_IPV6)) { tos = ipv6_get_dsfield((const struct ipv6hdr *)inner_iph); connected = false; } } ip_tunnel_init_flow(&fl4, protocol, dst, tnl_params->saddr, tunnel->parms.o_key, tos & INET_DSCP_MASK, tunnel->net, READ_ONCE(tunnel->parms.link), tunnel->fwmark, skb_get_hash(skb), 0); if (ip_tunnel_encap(skb, &tunnel->encap, &protocol, &fl4) < 0) goto tx_error; if (connected && md) { use_cache = ip_tunnel_dst_cache_usable(skb, tun_info); if (use_cache) rt = dst_cache_get_ip4(&tun_info->dst_cache, &fl4.saddr); } else { rt = connected ? 
dst_cache_get_ip4(&tunnel->dst_cache, &fl4.saddr) : NULL; } if (!rt) { rt = ip_route_output_key(tunnel->net, &fl4); if (IS_ERR(rt)) { DEV_STATS_INC(dev, tx_carrier_errors); goto tx_error; } if (use_cache) dst_cache_set_ip4(&tun_info->dst_cache, &rt->dst, fl4.saddr); else if (!md && connected) dst_cache_set_ip4(&tunnel->dst_cache, &rt->dst, fl4.saddr); } if (rt->dst.dev == dev) { ip_rt_put(rt); DEV_STATS_INC(dev, collisions); goto tx_error; } df = tnl_params->frag_off; if (payload_protocol == htons(ETH_P_IP) && !tunnel->ignore_df) df |= (inner_iph->frag_off & htons(IP_DF)); if (tnl_update_pmtu(dev, skb, rt, df, inner_iph, 0, 0, false)) { ip_rt_put(rt); goto tx_error; } if (tunnel->err_count > 0) { if (time_before(jiffies, tunnel->err_time + IPTUNNEL_ERR_TIMEO)) { tunnel->err_count--; dst_link_failure(skb); } else tunnel->err_count = 0; } tos = ip_tunnel_ecn_encap(tos, inner_iph, skb); ttl = tnl_params->ttl; if (ttl == 0) { if (payload_protocol == htons(ETH_P_IP)) ttl = inner_iph->ttl; #if IS_ENABLED(CONFIG_IPV6) else if (payload_protocol == htons(ETH_P_IPV6)) ttl = ((const struct ipv6hdr *)inner_iph)->hop_limit; #endif else ttl = ip4_dst_hoplimit(&rt->dst); } max_headroom = LL_RESERVED_SPACE(rt->dst.dev) + sizeof(struct iphdr) + rt->dst.header_len + ip_encap_hlen(&tunnel->encap); if (skb_cow_head(skb, max_headroom)) { ip_rt_put(rt); DEV_STATS_INC(dev, tx_dropped); kfree_skb(skb); return; } ip_tunnel_adj_headroom(dev, max_headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl, df, !net_eq(tunnel->net, dev_net(dev))); return; #if IS_ENABLED(CONFIG_IPV6) tx_error_icmp: dst_link_failure(skb); #endif tx_error: DEV_STATS_INC(dev, tx_errors); kfree_skb(skb); } EXPORT_SYMBOL_GPL(ip_tunnel_xmit); static void ip_tunnel_update(struct ip_tunnel_net *itn, struct ip_tunnel *t, struct net_device *dev, struct ip_tunnel_parm_kern *p, bool set_mtu, __u32 fwmark) { ip_tunnel_del(itn, t); t->parms.iph.saddr = p->iph.saddr; t->parms.iph.daddr = p->iph.daddr; t->parms.i_key = p->i_key; t->parms.o_key = p->o_key; if (dev->type != ARPHRD_ETHER) { __dev_addr_set(dev, &p->iph.saddr, 4); memcpy(dev->broadcast, &p->iph.daddr, 4); } ip_tunnel_add(itn, t); t->parms.iph.ttl = p->iph.ttl; t->parms.iph.tos = p->iph.tos; t->parms.iph.frag_off = p->iph.frag_off; if (t->parms.link != p->link || t->fwmark != fwmark) { int mtu; WRITE_ONCE(t->parms.link, p->link); t->fwmark = fwmark; mtu = ip_tunnel_bind_dev(dev); if (set_mtu) WRITE_ONCE(dev->mtu, mtu); } dst_cache_reset(&t->dst_cache); netdev_state_change(dev); } int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd) { int err = 0; struct ip_tunnel *t = netdev_priv(dev); struct net *net = t->net; struct ip_tunnel_net *itn = net_generic(net, t->ip_tnl_net_id); switch (cmd) { case SIOCGETTUNNEL: if (dev == itn->fb_tunnel_dev) { t = ip_tunnel_find(itn, p, itn->fb_tunnel_dev->type); if (!t) t = netdev_priv(dev); } memcpy(p, &t->parms, sizeof(*p)); break; case SIOCADDTUNNEL: case SIOCCHGTUNNEL: err = -EPERM; if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) goto done; if (p->iph.ttl) p->iph.frag_off |= htons(IP_DF); if (!test_bit(IP_TUNNEL_VTI_BIT, p->i_flags)) { if (!test_bit(IP_TUNNEL_KEY_BIT, p->i_flags)) p->i_key = 0; if (!test_bit(IP_TUNNEL_KEY_BIT, p->o_flags)) p->o_key = 0; } t = ip_tunnel_find(itn, p, itn->type); if (cmd == SIOCADDTUNNEL) { if (!t) { t = ip_tunnel_create(net, itn, p); err = PTR_ERR_OR_ZERO(t); break; } err = -EEXIST; break; } if (dev != itn->fb_tunnel_dev && cmd == SIOCCHGTUNNEL) { if (t) { if (t->dev != dev) 
{ err = -EEXIST; break; } } else { unsigned int nflags = 0; if (ipv4_is_multicast(p->iph.daddr)) nflags = IFF_BROADCAST; else if (p->iph.daddr) nflags = IFF_POINTOPOINT; if ((dev->flags^nflags)&(IFF_POINTOPOINT|IFF_BROADCAST)) { err = -EINVAL; break; } t = netdev_priv(dev); } } if (t) { err = 0; ip_tunnel_update(itn, t, dev, p, true, 0); } else { err = -ENOENT; } break; case SIOCDELTUNNEL: err = -EPERM; if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) goto done; if (dev == itn->fb_tunnel_dev) { err = -ENOENT; t = ip_tunnel_find(itn, p, itn->fb_tunnel_dev->type); if (!t) goto done; err = -EPERM; if (t == netdev_priv(itn->fb_tunnel_dev)) goto done; dev = t->dev; } unregister_netdevice(dev); err = 0; break; default: err = -EINVAL; } done: return err; } EXPORT_SYMBOL_GPL(ip_tunnel_ctl); bool ip_tunnel_parm_from_user(struct ip_tunnel_parm_kern *kp, const void __user *data) { struct ip_tunnel_parm p; if (copy_from_user(&p, data, sizeof(p))) return false; strscpy(kp->name, p.name); kp->link = p.link; ip_tunnel_flags_from_be16(kp->i_flags, p.i_flags); ip_tunnel_flags_from_be16(kp->o_flags, p.o_flags); kp->i_key = p.i_key; kp->o_key = p.o_key; memcpy(&kp->iph, &p.iph, min(sizeof(kp->iph), sizeof(p.iph))); return true; } EXPORT_SYMBOL_GPL(ip_tunnel_parm_from_user); bool ip_tunnel_parm_to_user(void __user *data, struct ip_tunnel_parm_kern *kp) { struct ip_tunnel_parm p; if (!ip_tunnel_flags_is_be16_compat(kp->i_flags) || !ip_tunnel_flags_is_be16_compat(kp->o_flags)) return false; memset(&p, 0, sizeof(p)); strscpy(p.name, kp->name); p.link = kp->link; p.i_flags = ip_tunnel_flags_to_be16(kp->i_flags); p.o_flags = ip_tunnel_flags_to_be16(kp->o_flags); p.i_key = kp->i_key; p.o_key = kp->o_key; memcpy(&p.iph, &kp->iph, min(sizeof(p.iph), sizeof(kp->iph))); return !copy_to_user(data, &p, sizeof(p)); } EXPORT_SYMBOL_GPL(ip_tunnel_parm_to_user); int ip_tunnel_siocdevprivate(struct net_device *dev, struct ifreq *ifr, void __user *data, int cmd) { struct ip_tunnel_parm_kern p; int err; if (!ip_tunnel_parm_from_user(&p, data)) return -EFAULT; err = dev->netdev_ops->ndo_tunnel_ctl(dev, &p, cmd); if (!err && !ip_tunnel_parm_to_user(data, &p)) return -EFAULT; return err; } EXPORT_SYMBOL_GPL(ip_tunnel_siocdevprivate); int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict) { struct ip_tunnel *tunnel = netdev_priv(dev); int t_hlen = tunnel->hlen + sizeof(struct iphdr); int max_mtu = IP_MAX_MTU - t_hlen; if (dev->type == ARPHRD_ETHER) max_mtu -= dev->hard_header_len; if (new_mtu < ETH_MIN_MTU) return -EINVAL; if (new_mtu > max_mtu) { if (strict) return -EINVAL; new_mtu = max_mtu; } WRITE_ONCE(dev->mtu, new_mtu); return 0; } EXPORT_SYMBOL_GPL(__ip_tunnel_change_mtu); int ip_tunnel_change_mtu(struct net_device *dev, int new_mtu) { return __ip_tunnel_change_mtu(dev, new_mtu, true); } EXPORT_SYMBOL_GPL(ip_tunnel_change_mtu); static void ip_tunnel_dev_free(struct net_device *dev) { struct ip_tunnel *tunnel = netdev_priv(dev); gro_cells_destroy(&tunnel->gro_cells); dst_cache_destroy(&tunnel->dst_cache); } void ip_tunnel_dellink(struct net_device *dev, struct list_head *head) { struct ip_tunnel *tunnel = netdev_priv(dev); struct ip_tunnel_net *itn; itn = net_generic(tunnel->net, tunnel->ip_tnl_net_id); if (itn->fb_tunnel_dev != dev) { ip_tunnel_del(itn, netdev_priv(dev)); unregister_netdevice_queue(dev, head); } } EXPORT_SYMBOL_GPL(ip_tunnel_dellink); struct net *ip_tunnel_get_link_net(const struct net_device *dev) { struct ip_tunnel *tunnel = netdev_priv(dev); return READ_ONCE(tunnel->net); } 
EXPORT_SYMBOL(ip_tunnel_get_link_net); int ip_tunnel_get_iflink(const struct net_device *dev) { const struct ip_tunnel *tunnel = netdev_priv(dev); return READ_ONCE(tunnel->parms.link); } EXPORT_SYMBOL(ip_tunnel_get_iflink); int ip_tunnel_init_net(struct net *net, unsigned int ip_tnl_net_id, struct rtnl_link_ops *ops, char *devname) { struct ip_tunnel_net *itn = net_generic(net, ip_tnl_net_id); struct ip_tunnel_parm_kern parms; unsigned int i; itn->rtnl_link_ops = ops; for (i = 0; i < IP_TNL_HASH_SIZE; i++) INIT_HLIST_HEAD(&itn->tunnels[i]); if (!ops || !net_has_fallback_tunnels(net)) { struct ip_tunnel_net *it_init_net; it_init_net = net_generic(&init_net, ip_tnl_net_id); itn->type = it_init_net->type; itn->fb_tunnel_dev = NULL; return 0; } memset(&parms, 0, sizeof(parms)); if (devname) strscpy(parms.name, devname, IFNAMSIZ); rtnl_lock(); itn->fb_tunnel_dev = __ip_tunnel_create(net, ops, &parms); /* FB netdevice is special: we have one, and only one per netns. * Allowing to move it to another netns is clearly unsafe. */ if (!IS_ERR(itn->fb_tunnel_dev)) { itn->fb_tunnel_dev->netns_immutable = true; itn->fb_tunnel_dev->mtu = ip_tunnel_bind_dev(itn->fb_tunnel_dev); ip_tunnel_add(itn, netdev_priv(itn->fb_tunnel_dev)); itn->type = itn->fb_tunnel_dev->type; } rtnl_unlock(); return PTR_ERR_OR_ZERO(itn->fb_tunnel_dev); } EXPORT_SYMBOL_GPL(ip_tunnel_init_net); void ip_tunnel_delete_net(struct net *net, unsigned int id, struct rtnl_link_ops *ops, struct list_head *head) { struct ip_tunnel_net *itn = net_generic(net, id); struct net_device *dev, *aux; int h; ASSERT_RTNL_NET(net); for_each_netdev_safe(net, dev, aux) if (dev->rtnl_link_ops == ops) unregister_netdevice_queue(dev, head); for (h = 0; h < IP_TNL_HASH_SIZE; h++) { struct ip_tunnel *t; struct hlist_node *n; struct hlist_head *thead = &itn->tunnels[h]; hlist_for_each_entry_safe(t, n, thead, hash_node) /* If dev is in the same netns, it has already * been added to the list by the previous loop. 
*/ if (!net_eq(dev_net(t->dev), net)) unregister_netdevice_queue(t->dev, head); } } EXPORT_SYMBOL_GPL(ip_tunnel_delete_net); int ip_tunnel_newlink(struct net *net, struct net_device *dev, struct nlattr *tb[], struct ip_tunnel_parm_kern *p, __u32 fwmark) { struct ip_tunnel *nt; struct ip_tunnel_net *itn; int mtu; int err; nt = netdev_priv(dev); itn = net_generic(net, nt->ip_tnl_net_id); if (nt->collect_md) { if (rtnl_dereference(itn->collect_md_tun)) return -EEXIST; } else { if (ip_tunnel_find(itn, p, dev->type)) return -EEXIST; } nt->net = net; nt->parms = *p; nt->fwmark = fwmark; err = register_netdevice(dev); if (err) goto err_register_netdevice; if (dev->type == ARPHRD_ETHER && !tb[IFLA_ADDRESS]) eth_hw_addr_random(dev); mtu = ip_tunnel_bind_dev(dev); if (tb[IFLA_MTU]) { unsigned int max = IP_MAX_MTU - (nt->hlen + sizeof(struct iphdr)); if (dev->type == ARPHRD_ETHER) max -= dev->hard_header_len; mtu = clamp(dev->mtu, (unsigned int)ETH_MIN_MTU, max); } err = dev_set_mtu(dev, mtu); if (err) goto err_dev_set_mtu; ip_tunnel_add(itn, nt); return 0; err_dev_set_mtu: unregister_netdevice(dev); err_register_netdevice: return err; } EXPORT_SYMBOL_GPL(ip_tunnel_newlink); int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[], struct ip_tunnel_parm_kern *p, __u32 fwmark) { struct ip_tunnel *t; struct ip_tunnel *tunnel = netdev_priv(dev); struct net *net = tunnel->net; struct ip_tunnel_net *itn = net_generic(net, tunnel->ip_tnl_net_id); if (dev == itn->fb_tunnel_dev) return -EINVAL; t = ip_tunnel_find(itn, p, dev->type); if (t) { if (t->dev != dev) return -EEXIST; } else { t = tunnel; if (dev->type != ARPHRD_ETHER) { unsigned int nflags = 0; if (ipv4_is_multicast(p->iph.daddr)) nflags = IFF_BROADCAST; else if (p->iph.daddr) nflags = IFF_POINTOPOINT; if ((dev->flags ^ nflags) & (IFF_POINTOPOINT | IFF_BROADCAST)) return -EINVAL; } } ip_tunnel_update(itn, t, dev, p, !tb[IFLA_MTU], fwmark); return 0; } EXPORT_SYMBOL_GPL(ip_tunnel_changelink); int ip_tunnel_init(struct net_device *dev) { struct ip_tunnel *tunnel = netdev_priv(dev); struct iphdr *iph = &tunnel->parms.iph; int err; dev->needs_free_netdev = true; dev->priv_destructor = ip_tunnel_dev_free; dev->pcpu_stat_type = NETDEV_PCPU_STAT_TSTATS; err = dst_cache_init(&tunnel->dst_cache, GFP_KERNEL); if (err) return err; err = gro_cells_init(&tunnel->gro_cells, dev); if (err) { dst_cache_destroy(&tunnel->dst_cache); return err; } tunnel->dev = dev; strscpy(tunnel->parms.name, dev->name); iph->version = 4; iph->ihl = 5; if (tunnel->collect_md) netif_keep_dst(dev); netdev_lockdep_set_classes(dev); return 0; } EXPORT_SYMBOL_GPL(ip_tunnel_init); void ip_tunnel_uninit(struct net_device *dev) { struct ip_tunnel *tunnel = netdev_priv(dev); struct net *net = tunnel->net; struct ip_tunnel_net *itn; itn = net_generic(net, tunnel->ip_tnl_net_id); ip_tunnel_del(itn, netdev_priv(dev)); if (itn->fb_tunnel_dev == dev) WRITE_ONCE(itn->fb_tunnel_dev, NULL); dst_cache_reset(&tunnel->dst_cache); } EXPORT_SYMBOL_GPL(ip_tunnel_uninit); /* Do least required initialization, rest of init is done in tunnel_init call */ void ip_tunnel_setup(struct net_device *dev, unsigned int net_id) { struct ip_tunnel *tunnel = netdev_priv(dev); tunnel->ip_tnl_net_id = net_id; } EXPORT_SYMBOL_GPL(ip_tunnel_setup); MODULE_DESCRIPTION("IPv4 tunnel implementation library"); MODULE_LICENSE("GPL"); |
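/*
 * Usage sketch (illustration only, not part of ip_tunnel.c above): the
 * SIOC{GET,ADD,CHG,DEL}TUNNEL requests dispatched by ip_tunnel_ctl() reach
 * the kernel through ip_tunnel_siocdevprivate(), with a struct
 * ip_tunnel_parm hanging off ifr_data, which is the calling convention
 * iproute2 uses. The interface name "mytun" below is a placeholder.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>
#include <linux/if_tunnel.h>

static int tunnel_ioctl(const char *dev, unsigned long cmd,
			struct ip_tunnel_parm *p)
{
	struct ifreq ifr;
	int fd, err;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);
	ifr.ifr_data = (void *)p;	/* consumed by ip_tunnel_parm_from_user() */

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	err = ioctl(fd, cmd, &ifr);
	close(fd);
	return err;
}

int main(void)
{
	struct ip_tunnel_parm p;

	memset(&p, 0, sizeof(p));
	strncpy(p.name, "mytun", IFNAMSIZ - 1);	/* hypothetical tunnel device */

	if (tunnel_ioctl("mytun", SIOCGETTUNNEL, &p) == 0)
		printf("%s: ttl=%u tos=%u key=%u\n", p.name,
		       p.iph.ttl, p.iph.tos, ntohl(p.i_key));
	else
		perror("SIOCGETTUNNEL");
	return 0;
}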
848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 | // SPDX-License-Identifier: GPL-2.0-only #include <linux/mm.h> #include <linux/slab.h> #include <linux/string.h> #include <linux/compiler.h> #include <linux/export.h> #include <linux/err.h> #include <linux/sched.h> #include <linux/sched/mm.h> #include <linux/sched/signal.h> #include <linux/sched/task_stack.h> #include <linux/security.h> #include <linux/swap.h> #include <linux/swapops.h> #include <linux/sysctl.h> #include <linux/mman.h> #include <linux/hugetlb.h> #include <linux/vmalloc.h> #include <linux/userfaultfd_k.h> #include <linux/elf.h> #include <linux/elf-randomize.h> #include <linux/personality.h> #include <linux/random.h> #include <linux/processor.h> #include <linux/sizes.h> #include <linux/compat.h> #include <linux/fsnotify.h> #include <linux/uaccess.h> #include <kunit/visibility.h> #include "internal.h" #include "swap.h" /** * kfree_const - conditionally free memory * @x: pointer to the memory * * Function calls kfree only if @x is not in .rodata section. */ void kfree_const(const void *x) { if (!is_kernel_rodata((unsigned long)x)) kfree(x); } EXPORT_SYMBOL(kfree_const); /** * __kmemdup_nul - Create a NUL-terminated string from @s, which might be unterminated. * @s: The data to copy * @len: The size of the data, not including the NUL terminator * @gfp: the GFP mask used in the kmalloc() call when allocating memory * * Return: newly allocated copy of @s with NUL-termination or %NULL in * case of error */ static __always_inline char *__kmemdup_nul(const char *s, size_t len, gfp_t gfp) { char *buf; /* '+1' for the NUL terminator */ buf = kmalloc_track_caller(len + 1, gfp); if (!buf) return NULL; memcpy(buf, s, len); /* Ensure the buf is always NUL-terminated, regardless of @s. */ buf[len] = '\0'; return buf; } /** * kstrdup - allocate space for and copy an existing string * @s: the string to duplicate * @gfp: the GFP mask used in the kmalloc() call when allocating memory * * Return: newly allocated copy of @s or %NULL in case of error */ noinline char *kstrdup(const char *s, gfp_t gfp) { return s ? 
__kmemdup_nul(s, strlen(s), gfp) : NULL; } EXPORT_SYMBOL(kstrdup); /** * kstrdup_const - conditionally duplicate an existing const string * @s: the string to duplicate * @gfp: the GFP mask used in the kmalloc() call when allocating memory * * Note: Strings allocated by kstrdup_const should be freed by kfree_const and * must not be passed to krealloc(). * * Return: source string if it is in .rodata section otherwise * fallback to kstrdup. */ const char *kstrdup_const(const char *s, gfp_t gfp) { if (is_kernel_rodata((unsigned long)s)) return s; return kstrdup(s, gfp); } EXPORT_SYMBOL(kstrdup_const); /** * kstrndup - allocate space for and copy an existing string * @s: the string to duplicate * @max: read at most @max chars from @s * @gfp: the GFP mask used in the kmalloc() call when allocating memory * * Note: Use kmemdup_nul() instead if the size is known exactly. * * Return: newly allocated copy of @s or %NULL in case of error */ char *kstrndup(const char *s, size_t max, gfp_t gfp) { return s ? __kmemdup_nul(s, strnlen(s, max), gfp) : NULL; } EXPORT_SYMBOL(kstrndup); /** * kmemdup - duplicate region of memory * * @src: memory region to duplicate * @len: memory region length * @gfp: GFP mask to use * * Return: newly allocated copy of @src or %NULL in case of error, * result is physically contiguous. Use kfree() to free. */ void *kmemdup_noprof(const void *src, size_t len, gfp_t gfp) { void *p; p = kmalloc_node_track_caller_noprof(len, gfp, NUMA_NO_NODE, _RET_IP_); if (p) memcpy(p, src, len); return p; } EXPORT_SYMBOL(kmemdup_noprof); /** * kmemdup_array - duplicate a given array. * * @src: array to duplicate. * @count: number of elements to duplicate from array. * @element_size: size of each element of array. * @gfp: GFP mask to use. * * Return: duplicated array of @src or %NULL in case of error, * result is physically contiguous. Use kfree() to free. */ void *kmemdup_array(const void *src, size_t count, size_t element_size, gfp_t gfp) { return kmemdup(src, size_mul(element_size, count), gfp); } EXPORT_SYMBOL(kmemdup_array); /** * kvmemdup - duplicate region of memory * * @src: memory region to duplicate * @len: memory region length * @gfp: GFP mask to use * * Return: newly allocated copy of @src or %NULL in case of error, * result may be not physically contiguous. Use kvfree() to free. */ void *kvmemdup(const void *src, size_t len, gfp_t gfp) { void *p; p = kvmalloc(len, gfp); if (p) memcpy(p, src, len); return p; } EXPORT_SYMBOL(kvmemdup); /** * kmemdup_nul - Create a NUL-terminated string from unterminated data * @s: The data to stringify * @len: The size of the data * @gfp: the GFP mask used in the kmalloc() call when allocating memory * * Return: newly allocated copy of @s with NUL-termination or %NULL in * case of error */ char *kmemdup_nul(const char *s, size_t len, gfp_t gfp) { return s ? __kmemdup_nul(s, len, gfp) : NULL; } EXPORT_SYMBOL(kmemdup_nul); static kmem_buckets *user_buckets __ro_after_init; static int __init init_user_buckets(void) { user_buckets = kmem_buckets_create("memdup_user", 0, 0, INT_MAX, NULL); return 0; } subsys_initcall(init_user_buckets); /** * memdup_user - duplicate memory region from user space * * @src: source address in user space * @len: number of bytes to copy * * Return: an ERR_PTR() on failure. Result is physically * contiguous, to be freed by kfree(). 
*/ void *memdup_user(const void __user *src, size_t len) { void *p; p = kmem_buckets_alloc_track_caller(user_buckets, len, GFP_USER | __GFP_NOWARN); if (!p) return ERR_PTR(-ENOMEM); if (copy_from_user(p, src, len)) { kfree(p); return ERR_PTR(-EFAULT); } return p; } EXPORT_SYMBOL(memdup_user); /** * vmemdup_user - duplicate memory region from user space * * @src: source address in user space * @len: number of bytes to copy * * Return: an ERR_PTR() on failure. Result may be not * physically contiguous. Use kvfree() to free. */ void *vmemdup_user(const void __user *src, size_t len) { void *p; p = kmem_buckets_valloc(user_buckets, len, GFP_USER); if (!p) return ERR_PTR(-ENOMEM); if (copy_from_user(p, src, len)) { kvfree(p); return ERR_PTR(-EFAULT); } return p; } EXPORT_SYMBOL(vmemdup_user); /** * strndup_user - duplicate an existing string from user space * @s: The string to duplicate * @n: Maximum number of bytes to copy, including the trailing NUL. * * Return: newly allocated copy of @s or an ERR_PTR() in case of error */ char *strndup_user(const char __user *s, long n) { char *p; long length; length = strnlen_user(s, n); if (!length) return ERR_PTR(-EFAULT); if (length > n) return ERR_PTR(-EINVAL); p = memdup_user(s, length); if (IS_ERR(p)) return p; p[length - 1] = '\0'; return p; } EXPORT_SYMBOL(strndup_user); /** * memdup_user_nul - duplicate memory region from user space and NUL-terminate * * @src: source address in user space * @len: number of bytes to copy * * Return: an ERR_PTR() on failure. */ void *memdup_user_nul(const void __user *src, size_t len) { char *p; p = kmem_buckets_alloc_track_caller(user_buckets, len + 1, GFP_USER | __GFP_NOWARN); if (!p) return ERR_PTR(-ENOMEM); if (copy_from_user(p, src, len)) { kfree(p); return ERR_PTR(-EFAULT); } p[len] = '\0'; return p; } EXPORT_SYMBOL(memdup_user_nul); /* Check if the vma is being used as a stack by this task */ int vma_is_stack_for_current(struct vm_area_struct *vma) { struct task_struct * __maybe_unused t = current; return (vma->vm_start <= KSTK_ESP(t) && vma->vm_end >= KSTK_ESP(t)); } /* * Change backing file, only valid to use during initial VMA setup. */ void vma_set_file(struct vm_area_struct *vma, struct file *file) { /* Changing an anonymous vma with this is illegal */ get_file(file); swap(vma->vm_file, file); fput(file); } EXPORT_SYMBOL(vma_set_file); #ifndef STACK_RND_MASK #define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12)) /* 8MB of VA */ #endif unsigned long randomize_stack_top(unsigned long stack_top) { unsigned long random_variable = 0; if (current->flags & PF_RANDOMIZE) { random_variable = get_random_long(); random_variable &= STACK_RND_MASK; random_variable <<= PAGE_SHIFT; } #ifdef CONFIG_STACK_GROWSUP return PAGE_ALIGN(stack_top) + random_variable; #else return PAGE_ALIGN(stack_top) - random_variable; #endif } /** * randomize_page - Generate a random, page aligned address * @start: The smallest acceptable address the caller will take. * @range: The size of the area, starting at @start, within which the * random address must fall. * * If @start + @range would overflow, @range is capped. * * NOTE: Historical use of randomize_range, which this replaces, presumed that * @start was already page aligned. We now align it regardless. * * Return: A page aligned address within [start, start + range). On error, * @start is returned. 
*/ unsigned long randomize_page(unsigned long start, unsigned long range) { if (!PAGE_ALIGNED(start)) { range -= PAGE_ALIGN(start) - start; start = PAGE_ALIGN(start); } if (start > ULONG_MAX - range) range = ULONG_MAX - start; range >>= PAGE_SHIFT; if (range == 0) return start; return start + (get_random_long() % range << PAGE_SHIFT); } #ifdef CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT unsigned long __weak arch_randomize_brk(struct mm_struct *mm) { /* Is the current task 32bit ? */ if (!IS_ENABLED(CONFIG_64BIT) || is_compat_task()) return randomize_page(mm->brk, SZ_32M); return randomize_page(mm->brk, SZ_1G); } unsigned long arch_mmap_rnd(void) { unsigned long rnd; #ifdef CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS if (is_compat_task()) rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); else #endif /* CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS */ rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); return rnd << PAGE_SHIFT; } static int mmap_is_legacy(struct rlimit *rlim_stack) { if (current->personality & ADDR_COMPAT_LAYOUT) return 1; /* On parisc the stack always grows up - so a unlimited stack should * not be an indicator to use the legacy memory layout. */ if (rlim_stack->rlim_cur == RLIM_INFINITY && !IS_ENABLED(CONFIG_STACK_GROWSUP)) return 1; return sysctl_legacy_va_layout; } /* * Leave enough space between the mmap area and the stack to honour ulimit in * the face of randomisation. */ #define MIN_GAP (SZ_128M) #define MAX_GAP (STACK_TOP / 6 * 5) static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack) { #ifdef CONFIG_STACK_GROWSUP /* * For an upwards growing stack the calculation is much simpler. * Memory for the maximum stack size is reserved at the top of the * task. mmap_base starts directly below the stack and grows * downwards. */ return PAGE_ALIGN_DOWN(mmap_upper_limit(rlim_stack) - rnd); #else unsigned long gap = rlim_stack->rlim_cur; unsigned long pad = stack_guard_gap; /* Account for stack randomization if necessary */ if (current->flags & PF_RANDOMIZE) pad += (STACK_RND_MASK << PAGE_SHIFT); /* Values close to RLIM_INFINITY can overflow. */ if (gap + pad > gap) gap += pad; if (gap < MIN_GAP && MIN_GAP < MAX_GAP) gap = MIN_GAP; else if (gap > MAX_GAP) gap = MAX_GAP; return PAGE_ALIGN(STACK_TOP - gap - rnd); #endif } void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack) { unsigned long random_factor = 0UL; if (current->flags & PF_RANDOMIZE) random_factor = arch_mmap_rnd(); if (mmap_is_legacy(rlim_stack)) { mm->mmap_base = TASK_UNMAPPED_BASE + random_factor; clear_bit(MMF_TOPDOWN, &mm->flags); } else { mm->mmap_base = mmap_base(random_factor, rlim_stack); set_bit(MMF_TOPDOWN, &mm->flags); } } #elif defined(CONFIG_MMU) && !defined(HAVE_ARCH_PICK_MMAP_LAYOUT) void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack) { mm->mmap_base = TASK_UNMAPPED_BASE; clear_bit(MMF_TOPDOWN, &mm->flags); } #endif #ifdef CONFIG_MMU EXPORT_SYMBOL_IF_KUNIT(arch_pick_mmap_layout); #endif /** * __account_locked_vm - account locked pages to an mm's locked_vm * @mm: mm to account against * @pages: number of pages to account * @inc: %true if @pages should be considered positive, %false if not * @task: task used to check RLIMIT_MEMLOCK * @bypass_rlim: %true if checking RLIMIT_MEMLOCK should be skipped * * Assumes @task and @mm are valid (i.e. at least one reference on each), and * that mmap_lock is held as writer. * * Return: * * 0 on success * * -ENOMEM if RLIMIT_MEMLOCK would be exceeded. 
*/ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc, struct task_struct *task, bool bypass_rlim) { unsigned long locked_vm, limit; int ret = 0; mmap_assert_write_locked(mm); locked_vm = mm->locked_vm; if (inc) { if (!bypass_rlim) { limit = task_rlimit(task, RLIMIT_MEMLOCK) >> PAGE_SHIFT; if (locked_vm + pages > limit) ret = -ENOMEM; } if (!ret) mm->locked_vm = locked_vm + pages; } else { WARN_ON_ONCE(pages > locked_vm); mm->locked_vm = locked_vm - pages; } pr_debug("%s: [%d] caller %ps %c%lu %lu/%lu%s\n", __func__, task->pid, (void *)_RET_IP_, (inc) ? '+' : '-', pages << PAGE_SHIFT, locked_vm << PAGE_SHIFT, task_rlimit(task, RLIMIT_MEMLOCK), ret ? " - exceeded" : ""); return ret; } EXPORT_SYMBOL_GPL(__account_locked_vm); /** * account_locked_vm - account locked pages to an mm's locked_vm * @mm: mm to account against, may be NULL * @pages: number of pages to account * @inc: %true if @pages should be considered positive, %false if not * * Assumes a non-NULL @mm is valid (i.e. at least one reference on it). * * Return: * * 0 on success, or if mm is NULL * * -ENOMEM if RLIMIT_MEMLOCK would be exceeded. */ int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc) { int ret; if (pages == 0 || !mm) return 0; mmap_write_lock(mm); ret = __account_locked_vm(mm, pages, inc, current, capable(CAP_IPC_LOCK)); mmap_write_unlock(mm); return ret; } EXPORT_SYMBOL_GPL(account_locked_vm); unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flag, unsigned long pgoff) { unsigned long ret; struct mm_struct *mm = current->mm; unsigned long populate; LIST_HEAD(uf); ret = security_mmap_file(file, prot, flag); if (!ret) ret = fsnotify_mmap_perm(file, prot, pgoff >> PAGE_SHIFT, len); if (!ret) { if (mmap_write_lock_killable(mm)) return -EINTR; ret = do_mmap(file, addr, len, prot, flag, 0, pgoff, &populate, &uf); mmap_write_unlock(mm); userfaultfd_unmap_complete(mm, &uf); if (populate) mm_populate(ret, populate); } return ret; } /* * Perform a userland memory mapping into the current process address space. See * the comment for do_mmap() for more details on this operation in general. * * This differs from do_mmap() in that: * * a. An offset parameter is provided rather than pgoff, which is both checked * for overflow and page alignment. * b. mmap locking is performed on the caller's behalf. * c. Userfaultfd unmap events and memory population are handled. * * This means that this function performs essentially the same work as if * userland were invoking mmap (2). * * Returns either an error, or the address at which the requested mapping has * been performed. */ unsigned long vm_mmap(struct file *file, unsigned long addr, unsigned long len, unsigned long prot, unsigned long flag, unsigned long offset) { if (unlikely(offset + PAGE_ALIGN(len) < offset)) return -EINVAL; if (unlikely(offset_in_page(offset))) return -EINVAL; return vm_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT); } EXPORT_SYMBOL(vm_mmap); /** * __vmalloc_array - allocate memory for a virtually contiguous array. * @n: number of elements. * @size: element size. * @flags: the type of memory to allocate (see kmalloc). */ void *__vmalloc_array_noprof(size_t n, size_t size, gfp_t flags) { size_t bytes; if (unlikely(check_mul_overflow(n, size, &bytes))) return NULL; return __vmalloc_noprof(bytes, flags); } EXPORT_SYMBOL(__vmalloc_array_noprof); /** * vmalloc_array - allocate memory for a virtually contiguous array. 
* @n: number of elements. * @size: element size. */ void *vmalloc_array_noprof(size_t n, size_t size) { return __vmalloc_array_noprof(n, size, GFP_KERNEL); } EXPORT_SYMBOL(vmalloc_array_noprof); /** * __vcalloc - allocate and zero memory for a virtually contiguous array. * @n: number of elements. * @size: element size. * @flags: the type of memory to allocate (see kmalloc). */ void *__vcalloc_noprof(size_t n, size_t size, gfp_t flags) { return __vmalloc_array_noprof(n, size, flags | __GFP_ZERO); } EXPORT_SYMBOL(__vcalloc_noprof); /** * vcalloc - allocate and zero memory for a virtually contiguous array. * @n: number of elements. * @size: element size. */ void *vcalloc_noprof(size_t n, size_t size) { return __vmalloc_array_noprof(n, size, GFP_KERNEL | __GFP_ZERO); } EXPORT_SYMBOL(vcalloc_noprof); struct anon_vma *folio_anon_vma(const struct folio *folio) { unsigned long mapping = (unsigned long)folio->mapping; if ((mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON) return NULL; return (void *)(mapping - PAGE_MAPPING_ANON); } /** * folio_mapping - Find the mapping where this folio is stored. * @folio: The folio. * * For folios which are in the page cache, return the mapping that this * page belongs to. Folios in the swap cache return the swap mapping * this page is stored in (which is different from the mapping for the * swap file or swap device where the data is stored). * * You can call this for folios which aren't in the swap cache or page * cache and it will return NULL. */ struct address_space *folio_mapping(struct folio *folio) { struct address_space *mapping; /* This happens if someone calls flush_dcache_page on slab page */ if (unlikely(folio_test_slab(folio))) return NULL; if (unlikely(folio_test_swapcache(folio))) return swap_address_space(folio->swap); mapping = folio->mapping; if ((unsigned long)mapping & PAGE_MAPPING_FLAGS) return NULL; return mapping; } EXPORT_SYMBOL(folio_mapping); /** * folio_copy - Copy the contents of one folio to another. * @dst: Folio to copy to. * @src: Folio to copy from. * * The bytes in the folio represented by @src are copied to @dst. * Assumes the caller has validated that @dst is at least as large as @src. * Can be called in atomic context for order-0 folios, but if the folio is * larger, it may sleep. 
*/ void folio_copy(struct folio *dst, struct folio *src) { long i = 0; long nr = folio_nr_pages(src); for (;;) { copy_highpage(folio_page(dst, i), folio_page(src, i)); if (++i == nr) break; cond_resched(); } } EXPORT_SYMBOL(folio_copy); int folio_mc_copy(struct folio *dst, struct folio *src) { long nr = folio_nr_pages(src); long i = 0; for (;;) { if (copy_mc_highpage(folio_page(dst, i), folio_page(src, i))) return -EHWPOISON; if (++i == nr) break; cond_resched(); } return 0; } EXPORT_SYMBOL(folio_mc_copy); int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS; static int sysctl_overcommit_ratio __read_mostly = 50; static unsigned long sysctl_overcommit_kbytes __read_mostly; int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT; unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */ unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */ #ifdef CONFIG_SYSCTL static int overcommit_ratio_handler(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { int ret; ret = proc_dointvec(table, write, buffer, lenp, ppos); if (ret == 0 && write) sysctl_overcommit_kbytes = 0; return ret; } static void sync_overcommit_as(struct work_struct *dummy) { percpu_counter_sync(&vm_committed_as); } static int overcommit_policy_handler(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct ctl_table t; int new_policy = -1; int ret; /* * The deviation of sync_overcommit_as could be big with loose policy * like OVERCOMMIT_ALWAYS/OVERCOMMIT_GUESS. When changing policy to * strict OVERCOMMIT_NEVER, we need to reduce the deviation to comply * with the strict "NEVER", and to avoid possible race condition (even * though user usually won't too frequently do the switching to policy * OVERCOMMIT_NEVER), the switch is done in the following order: * 1. changing the batch * 2. sync percpu count on each CPU * 3. 
switch the policy */ if (write) { t = *table; t.data = &new_policy; ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos); if (ret || new_policy == -1) return ret; mm_compute_batch(new_policy); if (new_policy == OVERCOMMIT_NEVER) schedule_on_each_cpu(sync_overcommit_as); sysctl_overcommit_memory = new_policy; } else { ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); } return ret; } static int overcommit_kbytes_handler(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { int ret; ret = proc_doulongvec_minmax(table, write, buffer, lenp, ppos); if (ret == 0 && write) sysctl_overcommit_ratio = 0; return ret; } static const struct ctl_table util_sysctl_table[] = { { .procname = "overcommit_memory", .data = &sysctl_overcommit_memory, .maxlen = sizeof(sysctl_overcommit_memory), .mode = 0644, .proc_handler = overcommit_policy_handler, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO, }, { .procname = "overcommit_ratio", .data = &sysctl_overcommit_ratio, .maxlen = sizeof(sysctl_overcommit_ratio), .mode = 0644, .proc_handler = overcommit_ratio_handler, }, { .procname = "overcommit_kbytes", .data = &sysctl_overcommit_kbytes, .maxlen = sizeof(sysctl_overcommit_kbytes), .mode = 0644, .proc_handler = overcommit_kbytes_handler, }, { .procname = "user_reserve_kbytes", .data = &sysctl_user_reserve_kbytes, .maxlen = sizeof(sysctl_user_reserve_kbytes), .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, { .procname = "admin_reserve_kbytes", .data = &sysctl_admin_reserve_kbytes, .maxlen = sizeof(sysctl_admin_reserve_kbytes), .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, }; static int __init init_vm_util_sysctls(void) { register_sysctl_init("vm", util_sysctl_table); return 0; } subsys_initcall(init_vm_util_sysctls); #endif /* CONFIG_SYSCTL */ /* * Committed memory limit enforced when OVERCOMMIT_NEVER policy is used */ unsigned long vm_commit_limit(void) { unsigned long allowed; if (sysctl_overcommit_kbytes) allowed = sysctl_overcommit_kbytes >> (PAGE_SHIFT - 10); else allowed = ((totalram_pages() - hugetlb_total_pages()) * sysctl_overcommit_ratio / 100); allowed += total_swap_pages; return allowed; } /* * Make sure vm_committed_as in one cacheline and not cacheline shared with * other variables. It can be updated by several CPUs frequently. */ struct percpu_counter vm_committed_as ____cacheline_aligned_in_smp; /* * The global memory commitment made in the system can be a metric * that can be used to drive ballooning decisions when Linux is hosted * as a guest. On Hyper-V, the host implements a policy engine for dynamically * balancing memory across competing virtual machines that are hosted. * Several metrics drive this policy engine including the guest reported * memory commitment. * * The time cost of this is very low for small platforms, and for big * platform like a 2S/36C/72T Skylake server, in worst case where * vm_committed_as's spinlock is under severe contention, the time cost * could be about 30~40 microseconds. */ unsigned long vm_memory_committed(void) { return percpu_counter_sum_positive(&vm_committed_as); } EXPORT_SYMBOL_GPL(vm_memory_committed); /* * Check that a process has enough memory to allocate a new virtual * mapping. 0 means there is enough memory for the allocation to * succeed and -ENOMEM implies there is not. * * We currently support three overcommit policies, which are set via the * vm.overcommit_memory sysctl. See Documentation/mm/overcommit-accounting.rst * * Strict overcommit modes added 2002 Feb 26 by Alan Cox. 
* Additional code 2002 Jul 20 by Robert Love. * * cap_sys_admin is 1 if the process has admin privileges, 0 otherwise. * * Note this is a helper function intended to be used by LSMs which * wish to use this logic. */ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin) { long allowed; unsigned long bytes_failed; vm_acct_memory(pages); /* * Sometimes we want to use more memory than we have */ if (sysctl_overcommit_memory == OVERCOMMIT_ALWAYS) return 0; if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) { if (pages > totalram_pages() + total_swap_pages) goto error; return 0; } allowed = vm_commit_limit(); /* * Reserve some for root */ if (!cap_sys_admin) allowed -= sysctl_admin_reserve_kbytes >> (PAGE_SHIFT - 10); /* * Don't let a single process grow so big a user can't recover */ if (mm) { long reserve = sysctl_user_reserve_kbytes >> (PAGE_SHIFT - 10); allowed -= min_t(long, mm->total_vm / 32, reserve); } if (percpu_counter_read_positive(&vm_committed_as) < allowed) return 0; error: bytes_failed = pages << PAGE_SHIFT; pr_warn_ratelimited("%s: pid: %d, comm: %s, bytes: %lu not enough memory for the allocation\n", __func__, current->pid, current->comm, bytes_failed); vm_unacct_memory(pages); return -ENOMEM; } /** * get_cmdline() - copy the cmdline value to a buffer. * @task: the task whose cmdline value to copy. * @buffer: the buffer to copy to. * @buflen: the length of the buffer. Larger cmdline values are truncated * to this length. * * Return: the size of the cmdline field copied. Note that the copy does * not guarantee an ending NULL byte. */ int get_cmdline(struct task_struct *task, char *buffer, int buflen) { int res = 0; unsigned int len; struct mm_struct *mm = get_task_mm(task); unsigned long arg_start, arg_end, env_start, env_end; if (!mm) goto out; if (!mm->arg_end) goto out_mm; /* Shh! No looking before we're done */ spin_lock(&mm->arg_lock); arg_start = mm->arg_start; arg_end = mm->arg_end; env_start = mm->env_start; env_end = mm->env_end; spin_unlock(&mm->arg_lock); len = arg_end - arg_start; if (len > buflen) len = buflen; res = access_process_vm(task, arg_start, buffer, len, FOLL_FORCE); /* * If the nul at the end of args has been overwritten, then * assume application is using setproctitle(3). */ if (res > 0 && buffer[res-1] != '\0' && len < buflen) { len = strnlen(buffer, res); if (len < res) { res = len; } else { len = env_end - env_start; if (len > buflen - res) len = buflen - res; res += access_process_vm(task, env_start, buffer+res, len, FOLL_FORCE); res = strnlen(buffer, res); } } out_mm: mmput(mm); out: return res; } int __weak memcmp_pages(struct page *page1, struct page *page2) { char *addr1, *addr2; int ret; addr1 = kmap_local_page(page1); addr2 = kmap_local_page(page2); ret = memcmp(addr1, addr2, PAGE_SIZE); kunmap_local(addr2); kunmap_local(addr1); return ret; } #ifdef CONFIG_PRINTK /** * mem_dump_obj - Print available provenance information * @object: object for which to find provenance information. * * This function uses pr_cont(), so that the caller is expected to have * printed out whatever preamble is appropriate. The provenance information * depends on the type of object and on how much debugging is enabled. * For example, for a slab-cache object, the slab name is printed, and, * if available, the return address and stack trace from the allocation * and last free path of that object. 
*/ void mem_dump_obj(void *object) { const char *type; if (kmem_dump_obj(object)) return; if (vmalloc_dump_obj(object)) return; if (is_vmalloc_addr(object)) type = "vmalloc memory"; else if (virt_addr_valid(object)) type = "non-slab/vmalloc memory"; else if (object == NULL) type = "NULL pointer"; else if (object == ZERO_SIZE_PTR) type = "zero-size pointer"; else type = "non-paged memory"; pr_cont(" %s\n", type); } EXPORT_SYMBOL_GPL(mem_dump_obj); #endif /* * A driver might set a page logically offline -- PageOffline() -- and * turn the page inaccessible in the hypervisor; after that, access to page * content can be fatal. * * Some special PFN walkers -- i.e., /proc/kcore -- read content of random * pages after checking PageOffline(); however, these PFN walkers can race * with drivers that set PageOffline(). * * page_offline_freeze()/page_offline_thaw() allows for a subsystem to * synchronize with such drivers, achieving that a page cannot be set * PageOffline() while frozen. * * page_offline_begin()/page_offline_end() is used by drivers that care about * such races when setting a page PageOffline(). */ static DECLARE_RWSEM(page_offline_rwsem); void page_offline_freeze(void) { down_read(&page_offline_rwsem); } void page_offline_thaw(void) { up_read(&page_offline_rwsem); } void page_offline_begin(void) { down_write(&page_offline_rwsem); } EXPORT_SYMBOL(page_offline_begin); void page_offline_end(void) { up_write(&page_offline_rwsem); } EXPORT_SYMBOL(page_offline_end); #ifndef flush_dcache_folio void flush_dcache_folio(struct folio *folio) { long i, nr = folio_nr_pages(folio); for (i = 0; i < nr; i++) flush_dcache_page(folio_page(folio, i)); } EXPORT_SYMBOL(flush_dcache_folio); #endif /** * compat_vma_mmap_prepare() - Apply the file's .mmap_prepare() hook to an * existing VMA * @file: The file which possesss an f_op->mmap_prepare() hook * @vma: The VMA to apply the .mmap_prepare() hook to. * * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain * 'wrapper' file systems invoke a nested mmap hook of an underlying file. * * Until all filesystems are converted to use .mmap_prepare(), we must be * conservative and continue to invoke these 'wrapper' filesystems using the * deprecated .mmap() hook. * * However we have a problem if the underlying file system possesses an * .mmap_prepare() hook, as we are in a different context when we invoke the * .mmap() hook, already having a VMA to deal with. * * compat_vma_mmap_prepare() is a compatibility function that takes VMA state, * establishes a struct vm_area_desc descriptor, passes to the underlying * .mmap_prepare() hook and applies any changes performed by it. * * Once the conversion of filesystems is complete this function will no longer * be required and will be removed. * * Returns: 0 on success or error. */ int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma) { struct vm_area_desc desc; int err; err = file->f_op->mmap_prepare(vma_to_desc(vma, &desc)); if (err) return err; set_vma_from_desc(vma, &desc); return 0; } EXPORT_SYMBOL(compat_vma_mmap_prepare); |
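/*
 * Usage sketch (illustration only, not part of mm/util.c above): a typical
 * in-kernel caller of memdup_user_nul() from a write() handler. The names
 * example_write and EXAMPLE_MAX_LEN are hypothetical; only the
 * memdup_user_nul()/kfree() and IS_ERR()/PTR_ERR() pattern mirrors how the
 * helpers defined above are meant to be used.
 */
#include <linux/err.h>
#include <linux/fs.h>
#include <linux/printk.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/uaccess.h>

#define EXAMPLE_MAX_LEN 128	/* hypothetical cap on the command length */

static ssize_t example_write(struct file *file, const char __user *ubuf,
			     size_t count, loff_t *ppos)
{
	char *cmd;

	if (count > EXAMPLE_MAX_LEN)
		return -EINVAL;

	/* Copies @count bytes from userspace and appends a NUL terminator,
	 * so the result is safe to hand to the str*() helpers. */
	cmd = memdup_user_nul(ubuf, count);
	if (IS_ERR(cmd))
		return PTR_ERR(cmd);

	pr_debug("example: got command %s\n", strim(cmd));

	kfree(cmd);
	return count;
}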
| 2 2 1 1 1 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 | // SPDX-License-Identifier: GPL-2.0 /***************************************************************************** * USBLCD Kernel Driver * * Version 1.05 * * (C) 2005 Georges Toth <g.toth@e-biz.lu> * * * * This file is licensed under the GPL. See COPYING in the package. * * Based on usb-skeleton.c 2.0 by Greg Kroah-Hartman (greg@kroah.com) * * * * * * 28.02.05 Complete rewrite of the original usblcd.c driver, * * based on usb_skeleton.c. 
* * This new driver allows more than one USB-LCD to be connected * * and controlled, at once * *****************************************************************************/ #include <linux/module.h> #include <linux/kernel.h> #include <linux/slab.h> #include <linux/errno.h> #include <linux/mutex.h> #include <linux/rwsem.h> #include <linux/uaccess.h> #include <linux/usb.h> #define DRIVER_VERSION "USBLCD Driver Version 1.05" #define USBLCD_MINOR 144 #define IOCTL_GET_HARD_VERSION 1 #define IOCTL_GET_DRV_VERSION 2 static const struct usb_device_id id_table[] = { { .idVendor = 0x10D2, .match_flags = USB_DEVICE_ID_MATCH_VENDOR, }, { }, }; MODULE_DEVICE_TABLE(usb, id_table); struct usb_lcd { struct usb_device *udev; /* init: probe_lcd */ struct usb_interface *interface; /* the interface for this device */ unsigned char *bulk_in_buffer; /* the buffer to receive data */ size_t bulk_in_size; /* the size of the receive buffer */ __u8 bulk_in_endpointAddr; /* the address of the bulk in endpoint */ __u8 bulk_out_endpointAddr; /* the address of the bulk out endpoint */ struct kref kref; struct semaphore limit_sem; /* to stop writes at full throttle from using up all RAM */ struct usb_anchor submitted; /* URBs to wait for before suspend */ struct rw_semaphore io_rwsem; unsigned long disconnected:1; }; #define to_lcd_dev(d) container_of(d, struct usb_lcd, kref) #define USB_LCD_CONCURRENT_WRITES 5 static struct usb_driver lcd_driver; static void lcd_delete(struct kref *kref) { struct usb_lcd *dev = to_lcd_dev(kref); usb_put_dev(dev->udev); kfree(dev->bulk_in_buffer); kfree(dev); } static int lcd_open(struct inode *inode, struct file *file) { struct usb_lcd *dev; struct usb_interface *interface; int subminor, r; subminor = iminor(inode); interface = usb_find_interface(&lcd_driver, subminor); if (!interface) { pr_err("USBLCD: %s - error, can't find device for minor %d\n", __func__, subminor); return -ENODEV; } dev = usb_get_intfdata(interface); /* increment our usage count for the device */ kref_get(&dev->kref); /* grab a power reference */ r = usb_autopm_get_interface(interface); if (r < 0) { kref_put(&dev->kref, lcd_delete); return r; } /* save our object in the file's private structure */ file->private_data = dev; return 0; } static int lcd_release(struct inode *inode, struct file *file) { struct usb_lcd *dev; dev = file->private_data; if (dev == NULL) return -ENODEV; /* decrement the count on our device */ usb_autopm_put_interface(dev->interface); kref_put(&dev->kref, lcd_delete); return 0; } static ssize_t lcd_read(struct file *file, char __user * buffer, size_t count, loff_t *ppos) { struct usb_lcd *dev; int retval = 0; int bytes_read; dev = file->private_data; down_read(&dev->io_rwsem); if (dev->disconnected) { retval = -ENODEV; goto out_up_io; } /* do a blocking bulk read to get data from the device */ retval = usb_bulk_msg(dev->udev, usb_rcvbulkpipe(dev->udev, dev->bulk_in_endpointAddr), dev->bulk_in_buffer, min(dev->bulk_in_size, count), &bytes_read, 10000); /* if the read was successful, copy the data to userspace */ if (!retval) { if (copy_to_user(buffer, dev->bulk_in_buffer, bytes_read)) retval = -EFAULT; else retval = bytes_read; } out_up_io: up_read(&dev->io_rwsem); return retval; } static long lcd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct usb_lcd *dev; u16 bcdDevice; char buf[30]; dev = file->private_data; if (dev == NULL) return -ENODEV; switch (cmd) { case IOCTL_GET_HARD_VERSION: bcdDevice = le16_to_cpu((dev->udev)->descriptor.bcdDevice); sprintf(buf, 
"%1d%1d.%1d%1d", (bcdDevice & 0xF000)>>12, (bcdDevice & 0xF00)>>8, (bcdDevice & 0xF0)>>4, (bcdDevice & 0xF)); if (copy_to_user((void __user *)arg, buf, strlen(buf)) != 0) return -EFAULT; break; case IOCTL_GET_DRV_VERSION: sprintf(buf, DRIVER_VERSION); if (copy_to_user((void __user *)arg, buf, strlen(buf)) != 0) return -EFAULT; break; default: return -ENOTTY; } return 0; } static void lcd_write_bulk_callback(struct urb *urb) { struct usb_lcd *dev; int status = urb->status; dev = urb->context; /* sync/async unlink faults aren't errors */ if (status && !(status == -ENOENT || status == -ECONNRESET || status == -ESHUTDOWN)) { dev_dbg(&dev->interface->dev, "nonzero write bulk status received: %d\n", status); } /* free up our allocated buffer */ usb_free_coherent(urb->dev, urb->transfer_buffer_length, urb->transfer_buffer, urb->transfer_dma); up(&dev->limit_sem); } static ssize_t lcd_write(struct file *file, const char __user * user_buffer, size_t count, loff_t *ppos) { struct usb_lcd *dev; int retval = 0, r; struct urb *urb = NULL; char *buf = NULL; dev = file->private_data; /* verify that we actually have some data to write */ if (count == 0) goto exit; r = down_interruptible(&dev->limit_sem); if (r < 0) return -EINTR; down_read(&dev->io_rwsem); if (dev->disconnected) { retval = -ENODEV; goto err_up_io; } /* create a urb, and a buffer for it, and copy the data to the urb */ urb = usb_alloc_urb(0, GFP_KERNEL); if (!urb) { retval = -ENOMEM; goto err_up_io; } buf = usb_alloc_coherent(dev->udev, count, GFP_KERNEL, &urb->transfer_dma); if (!buf) { retval = -ENOMEM; goto error; } if (copy_from_user(buf, user_buffer, count)) { retval = -EFAULT; goto error; } /* initialize the urb properly */ usb_fill_bulk_urb(urb, dev->udev, usb_sndbulkpipe(dev->udev, dev->bulk_out_endpointAddr), buf, count, lcd_write_bulk_callback, dev); urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP; usb_anchor_urb(urb, &dev->submitted); /* send the data out the bulk port */ retval = usb_submit_urb(urb, GFP_KERNEL); if (retval) { dev_err(&dev->udev->dev, "%s - failed submitting write urb, error %d\n", __func__, retval); goto error_unanchor; } /* release our reference to this urb, the USB core will eventually free it entirely */ usb_free_urb(urb); up_read(&dev->io_rwsem); exit: return count; error_unanchor: usb_unanchor_urb(urb); error: usb_free_coherent(dev->udev, count, buf, urb->transfer_dma); usb_free_urb(urb); err_up_io: up_read(&dev->io_rwsem); up(&dev->limit_sem); return retval; } static const struct file_operations lcd_fops = { .owner = THIS_MODULE, .read = lcd_read, .write = lcd_write, .open = lcd_open, .unlocked_ioctl = lcd_ioctl, .release = lcd_release, .llseek = noop_llseek, }; /* * usb class driver info in order to get a minor number from the usb core, * and to have the device registered with the driver core */ static struct usb_class_driver lcd_class = { .name = "lcd%d", .fops = &lcd_fops, .minor_base = USBLCD_MINOR, }; static int lcd_probe(struct usb_interface *interface, const struct usb_device_id *id) { struct usb_lcd *dev = NULL; struct usb_endpoint_descriptor *bulk_in, *bulk_out; int i; int retval; /* allocate memory for our device state and initialize it */ dev = kzalloc(sizeof(*dev), GFP_KERNEL); if (!dev) return -ENOMEM; kref_init(&dev->kref); sema_init(&dev->limit_sem, USB_LCD_CONCURRENT_WRITES); init_rwsem(&dev->io_rwsem); init_usb_anchor(&dev->submitted); dev->udev = usb_get_dev(interface_to_usbdev(interface)); dev->interface = interface; if (le16_to_cpu(dev->udev->descriptor.idProduct) != 0x0001) { 
dev_warn(&interface->dev, "USBLCD model not supported.\n"); retval = -ENODEV; goto error; } /* set up the endpoint information */ /* use only the first bulk-in and bulk-out endpoints */ retval = usb_find_common_endpoints(interface->cur_altsetting, &bulk_in, &bulk_out, NULL, NULL); if (retval) { dev_err(&interface->dev, "Could not find both bulk-in and bulk-out endpoints\n"); goto error; } dev->bulk_in_size = usb_endpoint_maxp(bulk_in); dev->bulk_in_endpointAddr = bulk_in->bEndpointAddress; dev->bulk_in_buffer = kmalloc(dev->bulk_in_size, GFP_KERNEL); if (!dev->bulk_in_buffer) { retval = -ENOMEM; goto error; } dev->bulk_out_endpointAddr = bulk_out->bEndpointAddress; /* save our data pointer in this interface device */ usb_set_intfdata(interface, dev); /* we can register the device now, as it is ready */ retval = usb_register_dev(interface, &lcd_class); if (retval) { /* something prevented us from registering this driver */ dev_err(&interface->dev, "Not able to get a minor for this device.\n"); goto error; } i = le16_to_cpu(dev->udev->descriptor.bcdDevice); dev_info(&interface->dev, "USBLCD Version %1d%1d.%1d%1d found " "at address %d\n", (i & 0xF000)>>12, (i & 0xF00)>>8, (i & 0xF0)>>4, (i & 0xF), dev->udev->devnum); /* let the user know what node this device is now attached to */ dev_info(&interface->dev, "USB LCD device now attached to USBLCD-%d\n", interface->minor); return 0; error: kref_put(&dev->kref, lcd_delete); return retval; } static void lcd_draw_down(struct usb_lcd *dev) { int time; time = usb_wait_anchor_empty_timeout(&dev->submitted, 1000); if (!time) usb_kill_anchored_urbs(&dev->submitted); } static int lcd_suspend(struct usb_interface *intf, pm_message_t message) { struct usb_lcd *dev = usb_get_intfdata(intf); if (!dev) return 0; lcd_draw_down(dev); return 0; } static int lcd_resume(struct usb_interface *intf) { return 0; } static void lcd_disconnect(struct usb_interface *interface) { struct usb_lcd *dev = usb_get_intfdata(interface); int minor = interface->minor; /* give back our minor */ usb_deregister_dev(interface, &lcd_class); down_write(&dev->io_rwsem); dev->disconnected = 1; up_write(&dev->io_rwsem); usb_kill_anchored_urbs(&dev->submitted); /* decrement our usage count */ kref_put(&dev->kref, lcd_delete); dev_info(&interface->dev, "USB LCD #%d now disconnected\n", minor); } static struct usb_driver lcd_driver = { .name = "usblcd", .probe = lcd_probe, .disconnect = lcd_disconnect, .suspend = lcd_suspend, .resume = lcd_resume, .id_table = id_table, .supports_autosuspend = 1, }; module_usb_driver(lcd_driver); MODULE_AUTHOR("Georges Toth <g.toth@e-biz.lu>"); MODULE_DESCRIPTION(DRIVER_VERSION); MODULE_LICENSE("GPL"); |
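/*
 * Usage sketch (illustration only): talking to the usblcd character device
 * from userspace. The device node path below is an assumption; with the
 * usual udev rules the minor registered via usb_register_dev() shows up as
 * /dev/usb/lcd0. The ioctl numbers simply repeat the driver's own
 * IOCTL_GET_* defines, since they are plain integers rather than _IO() codes.
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>

#define IOCTL_GET_HARD_VERSION 1	/* same values as in the driver above */
#define IOCTL_GET_DRV_VERSION  2

int main(void)
{
	char version[64] = "";
	int fd = open("/dev/usb/lcd0", O_RDWR);	/* assumed device node */

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* The driver copies back strlen() bytes without a terminator, so the
	 * buffer is zero-initialised above. */
	if (ioctl(fd, IOCTL_GET_HARD_VERSION, version) == 0)
		printf("firmware %s\n", version);

	/* Anything write()n to the node goes out as a bulk-out transfer. */
	if (write(fd, "hello", 5) != 5)
		perror("write");

	close(fd);
	return 0;
}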
| 6 6 9 7 7 6 7 7 6 6 6 6 5 6 6 6 6 4 4 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 | // SPDX-License-Identifier: GPL-2.0-only /* * LED support for the input layer * * Copyright 2010-2015 Samuel Thibault <samuel.thibault@ens-lyon.org> */ #include <linux/kernel.h> #include <linux/slab.h> #include <linux/module.h> #include <linux/init.h> #include <linux/leds.h> #include <linux/input.h> #if IS_ENABLED(CONFIG_VT) #define VT_TRIGGER(_name) .trigger = _name #else #define VT_TRIGGER(_name) .trigger = NULL #endif #if IS_ENABLED(CONFIG_SND_CTL_LED) #define AUDIO_TRIGGER(_name) .trigger = _name #else #define AUDIO_TRIGGER(_name) .trigger = NULL #endif static const struct { const char *name; const char *trigger; } input_led_info[LED_CNT] = { [LED_NUML] = { "numlock", VT_TRIGGER("kbd-numlock") }, [LED_CAPSL] = { "capslock", VT_TRIGGER("kbd-capslock") }, [LED_SCROLLL] = { "scrolllock", VT_TRIGGER("kbd-scrolllock") }, [LED_COMPOSE] = { "compose" }, [LED_KANA] = { "kana", VT_TRIGGER("kbd-kanalock") }, [LED_SLEEP] = { "sleep" } , [LED_SUSPEND] = { "suspend" }, [LED_MUTE] = { "mute", AUDIO_TRIGGER("audio-mute") }, [LED_MISC] = { "misc" }, [LED_MAIL] = { "mail" }, [LED_CHARGING] = { "charging" }, }; struct input_led { struct led_classdev cdev; struct input_handle *handle; unsigned int code; /* One of LED_* constants */ }; struct input_leds { struct input_handle handle; unsigned int num_leds; struct input_led leds[] __counted_by(num_leds); }; static enum led_brightness input_leds_brightness_get(struct led_classdev *cdev) { struct input_led *led = container_of(cdev, struct input_led, cdev); struct input_dev *input = led->handle->dev; return test_bit(led->code, input->led) ? 
cdev->max_brightness : 0; } static void input_leds_brightness_set(struct led_classdev *cdev, enum led_brightness brightness) { struct input_led *led = container_of(cdev, struct input_led, cdev); input_inject_event(led->handle, EV_LED, led->code, !!brightness); } static void input_leds_event(struct input_handle *handle, unsigned int type, unsigned int code, int value) { } static int input_leds_get_count(struct input_dev *dev) { unsigned int led_code; int count = 0; for_each_set_bit(led_code, dev->ledbit, LED_CNT) if (input_led_info[led_code].name) count++; return count; } static int input_leds_connect(struct input_handler *handler, struct input_dev *dev, const struct input_device_id *id) { struct input_leds *leds; struct input_led *led; unsigned int num_leds; unsigned int led_code; int led_no; int error; num_leds = input_leds_get_count(dev); if (!num_leds) return -ENXIO; leds = kzalloc(struct_size(leds, leds, num_leds), GFP_KERNEL); if (!leds) return -ENOMEM; leds->num_leds = num_leds; leds->handle.dev = dev; leds->handle.handler = handler; leds->handle.name = "leds"; leds->handle.private = leds; error = input_register_handle(&leds->handle); if (error) goto err_free_mem; error = input_open_device(&leds->handle); if (error) goto err_unregister_handle; led_no = 0; for_each_set_bit(led_code, dev->ledbit, LED_CNT) { if (!input_led_info[led_code].name) continue; led = &leds->leds[led_no]; led->handle = &leds->handle; led->code = led_code; led->cdev.name = kasprintf(GFP_KERNEL, "%s::%s", dev_name(&dev->dev), input_led_info[led_code].name); if (!led->cdev.name) { error = -ENOMEM; goto err_unregister_leds; } led->cdev.max_brightness = 1; led->cdev.brightness_get = input_leds_brightness_get; led->cdev.brightness_set = input_leds_brightness_set; led->cdev.default_trigger = input_led_info[led_code].trigger; error = led_classdev_register(&dev->dev, &led->cdev); if (error) { dev_err(&dev->dev, "failed to register LED %s: %d\n", led->cdev.name, error); kfree(led->cdev.name); goto err_unregister_leds; } led_no++; } return 0; err_unregister_leds: while (--led_no >= 0) { struct input_led *led = &leds->leds[led_no]; led_classdev_unregister(&led->cdev); kfree(led->cdev.name); } input_close_device(&leds->handle); err_unregister_handle: input_unregister_handle(&leds->handle); err_free_mem: kfree(leds); return error; } static void input_leds_disconnect(struct input_handle *handle) { struct input_leds *leds = handle->private; int i; for (i = 0; i < leds->num_leds; i++) { struct input_led *led = &leds->leds[i]; led_classdev_unregister(&led->cdev); kfree(led->cdev.name); } input_close_device(handle); input_unregister_handle(handle); kfree(leds); } static const struct input_device_id input_leds_ids[] = { { .flags = INPUT_DEVICE_ID_MATCH_EVBIT, .evbit = { BIT_MASK(EV_LED) }, }, { }, }; MODULE_DEVICE_TABLE(input, input_leds_ids); static struct input_handler input_leds_handler = { .event = input_leds_event, .connect = input_leds_connect, .disconnect = input_leds_disconnect, .name = "leds", .id_table = input_leds_ids, }; static int __init input_leds_init(void) { return input_register_handler(&input_leds_handler); } module_init(input_leds_init); static void __exit input_leds_exit(void) { input_unregister_handler(&input_leds_handler); } module_exit(input_leds_exit); MODULE_AUTHOR("Samuel Thibault <samuel.thibault@ens-lyon.org>"); MODULE_AUTHOR("Dmitry Torokhov <dmitry.torokhov@gmail.com>"); MODULE_DESCRIPTION("Input -> LEDs Bridge"); MODULE_LICENSE("GPL v2"); |
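/*
 * Hedged sketch (userspace, not kernel code): the allocation pattern that
 * input_leds_connect() relies on -- a header structure followed by a flexible
 * array of per-LED entries, sized in a single kzalloc(struct_size(...)) call,
 * with each entry later given a "devname::ledname" label as the "%s::%s"
 * kasprintf() does.  Here plain calloc() and a hand-written size computation
 * stand in for the kernel helpers; struct demo_leds / demo_led are
 * illustrative names only.
 */
#include <stdio.h>
#include <stdlib.h>

struct demo_led {
	unsigned int code;
	char name[32];
};

struct demo_leds {
	unsigned int num_leds;
	struct demo_led leds[];		/* flexible array member */
};

static struct demo_leds *demo_leds_alloc(unsigned int num_leds)
{
	/* Userspace stand-in for kzalloc(struct_size(leds, leds, num_leds)). */
	struct demo_leds *leds;

	leds = calloc(1, sizeof(*leds) + num_leds * sizeof(leds->leds[0]));
	if (leds)
		leds->num_leds = num_leds;
	return leds;
}

int main(void)
{
	struct demo_leds *leds = demo_leds_alloc(3);

	if (!leds)
		return 1;

	for (unsigned int i = 0; i < leds->num_leds; i++)
		snprintf(leds->leds[i].name, sizeof(leds->leds[i].name),
			 "input0::led%u", i);	/* mirrors the "%s::%s" naming */

	printf("allocated %u LED slots\n", leds->num_leds);
	free(leds);
	return 0;
}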
862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 | // SPDX-License-Identifier: GPL-2.0-or-later /* * * Bluetooth HCI UART driver * * Copyright (C) 2000-2001 Qualcomm Incorporated * Copyright (C) 2002-2003 Maxim Krasnyansky <maxk@qualcomm.com> * Copyright (C) 2004-2005 Marcel Holtmann <marcel@holtmann.org> */ #include <linux/module.h> #include <linux/kernel.h> #include <linux/init.h> #include <linux/types.h> #include <linux/fcntl.h> #include <linux/interrupt.h> #include <linux/ptrace.h> #include <linux/poll.h> #include <linux/slab.h> #include <linux/tty.h> #include <linux/errno.h> #include <linux/string.h> #include <linux/signal.h> #include <linux/ioctl.h> #include <linux/skbuff.h> #include <linux/firmware.h> #include <linux/serdev.h> #include <net/bluetooth/bluetooth.h> #include <net/bluetooth/hci_core.h> #include "btintel.h" #include "btbcm.h" #include "hci_uart.h" #define VERSION "2.3" static const struct hci_uart_proto *hup[HCI_UART_MAX_PROTO]; int hci_uart_register_proto(const struct hci_uart_proto *p) { if (p->id >= HCI_UART_MAX_PROTO) return -EINVAL; if (hup[p->id]) return -EEXIST; hup[p->id] = p; BT_INFO("HCI UART protocol %s registered", p->name); return 0; } int hci_uart_unregister_proto(const struct hci_uart_proto *p) { if (p->id >= HCI_UART_MAX_PROTO) return -EINVAL; if (!hup[p->id]) return -EINVAL; hup[p->id] = NULL; return 0; } static const struct hci_uart_proto *hci_uart_get_proto(unsigned int id) { if (id >= HCI_UART_MAX_PROTO) return NULL; return hup[id]; } static inline void hci_uart_tx_complete(struct hci_uart *hu, int pkt_type) { struct hci_dev *hdev = hu->hdev; /* Update HCI stat counters */ switch (pkt_type) { case HCI_COMMAND_PKT: hdev->stat.cmd_tx++; break; case HCI_ACLDATA_PKT: hdev->stat.acl_tx++; break; case HCI_SCODATA_PKT: hdev->stat.sco_tx++; break; } } static inline struct sk_buff *hci_uart_dequeue(struct hci_uart *hu) { struct sk_buff *skb = hu->tx_skb; if (!skb) { percpu_down_read(&hu->proto_lock); if (test_bit(HCI_UART_PROTO_READY, &hu->flags) || test_bit(HCI_UART_PROTO_INIT, &hu->flags)) skb = hu->proto->dequeue(hu); percpu_up_read(&hu->proto_lock); } else { hu->tx_skb = NULL; } return skb; } int hci_uart_tx_wakeup(struct hci_uart *hu) { /* This may be called in an IRQ context, so we can't sleep. Therefore * we try to acquire the lock only, and if that fails we assume the * tty is being closed because that is the only time the write lock is * acquired. If, however, at some point in the future the write lock * is also acquired in other situations, then this must be revisited. */ if (!percpu_down_read_trylock(&hu->proto_lock)) return 0; if (!test_bit(HCI_UART_PROTO_READY, &hu->flags) && !test_bit(HCI_UART_PROTO_INIT, &hu->flags)) goto no_schedule; set_bit(HCI_UART_TX_WAKEUP, &hu->tx_state); if (test_and_set_bit(HCI_UART_SENDING, &hu->tx_state)) goto no_schedule; BT_DBG(""); schedule_work(&hu->write_work); no_schedule: percpu_up_read(&hu->proto_lock); return 0; } EXPORT_SYMBOL_GPL(hci_uart_tx_wakeup); static void hci_uart_write_work(struct work_struct *work) { struct hci_uart *hu = container_of(work, struct hci_uart, write_work); struct tty_struct *tty = hu->tty; struct hci_dev *hdev = hu->hdev; struct sk_buff *skb; /* REVISIT: should we cope with bad skbs or ->write() returning * and error value ? 
*/ restart: clear_bit(HCI_UART_TX_WAKEUP, &hu->tx_state); while ((skb = hci_uart_dequeue(hu))) { int len; set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags); len = tty->ops->write(tty, skb->data, skb->len); hdev->stat.byte_tx += len; skb_pull(skb, len); if (skb->len) { hu->tx_skb = skb; break; } hci_uart_tx_complete(hu, hci_skb_pkt_type(skb)); kfree_skb(skb); } clear_bit(HCI_UART_SENDING, &hu->tx_state); if (test_bit(HCI_UART_TX_WAKEUP, &hu->tx_state)) goto restart; wake_up_bit(&hu->tx_state, HCI_UART_SENDING); } void hci_uart_init_work(struct work_struct *work) { struct hci_uart *hu = container_of(work, struct hci_uart, init_ready); int err; struct hci_dev *hdev; if (!test_and_clear_bit(HCI_UART_INIT_PENDING, &hu->hdev_flags)) return; err = hci_register_dev(hu->hdev); if (err < 0) { BT_ERR("Can't register HCI device"); clear_bit(HCI_UART_PROTO_READY, &hu->flags); hu->proto->close(hu); hdev = hu->hdev; hu->hdev = NULL; hci_free_dev(hdev); return; } set_bit(HCI_UART_REGISTERED, &hu->flags); } int hci_uart_init_ready(struct hci_uart *hu) { if (!test_bit(HCI_UART_INIT_PENDING, &hu->hdev_flags)) return -EALREADY; schedule_work(&hu->init_ready); return 0; } int hci_uart_wait_until_sent(struct hci_uart *hu) { return wait_on_bit_timeout(&hu->tx_state, HCI_UART_SENDING, TASK_INTERRUPTIBLE, msecs_to_jiffies(2000)); } /* ------- Interface to HCI layer ------ */ /* Reset device */ static int hci_uart_flush(struct hci_dev *hdev) { struct hci_uart *hu = hci_get_drvdata(hdev); struct tty_struct *tty = hu->tty; BT_DBG("hdev %p tty %p", hdev, tty); if (hu->tx_skb) { kfree_skb(hu->tx_skb); hu->tx_skb = NULL; } /* Flush any pending characters in the driver and discipline. */ tty_ldisc_flush(tty); tty_driver_flush_buffer(tty); percpu_down_read(&hu->proto_lock); if (test_bit(HCI_UART_PROTO_READY, &hu->flags)) hu->proto->flush(hu); percpu_up_read(&hu->proto_lock); return 0; } /* Initialize device */ static int hci_uart_open(struct hci_dev *hdev) { BT_DBG("%s %p", hdev->name, hdev); /* Undo clearing this from hci_uart_close() */ hdev->flush = hci_uart_flush; return 0; } /* Close device */ static int hci_uart_close(struct hci_dev *hdev) { BT_DBG("hdev %p", hdev); hci_uart_flush(hdev); hdev->flush = NULL; return 0; } /* Send frames from HCI layer */ static int hci_uart_send_frame(struct hci_dev *hdev, struct sk_buff *skb) { struct hci_uart *hu = hci_get_drvdata(hdev); BT_DBG("%s: type %d len %d", hdev->name, hci_skb_pkt_type(skb), skb->len); percpu_down_read(&hu->proto_lock); if (!test_bit(HCI_UART_PROTO_READY, &hu->flags) && !test_bit(HCI_UART_PROTO_INIT, &hu->flags)) { percpu_up_read(&hu->proto_lock); return -EUNATCH; } hu->proto->enqueue(hu, skb); percpu_up_read(&hu->proto_lock); hci_uart_tx_wakeup(hu); return 0; } /* Check the underlying device or tty has flow control support */ bool hci_uart_has_flow_control(struct hci_uart *hu) { /* serdev nodes check if the needed operations are present */ if (hu->serdev) return true; if (hu->tty->driver->ops->tiocmget && hu->tty->driver->ops->tiocmset) return true; return false; } /* Flow control or un-flow control the device */ void hci_uart_set_flow_control(struct hci_uart *hu, bool enable) { struct tty_struct *tty = hu->tty; struct ktermios ktermios; int status; unsigned int set = 0; unsigned int clear = 0; if (hu->serdev) { serdev_device_set_flow_control(hu->serdev, !enable); serdev_device_set_rts(hu->serdev, !enable); return; } if (enable) { /* Disable hardware flow control */ ktermios = tty->termios; ktermios.c_cflag &= ~CRTSCTS; tty_set_termios(tty, &ktermios); 
BT_DBG("Disabling hardware flow control: %s", (tty->termios.c_cflag & CRTSCTS) ? "failed" : "success"); /* Clear RTS to prevent the device from sending */ /* Most UARTs need OUT2 to enable interrupts */ status = tty->driver->ops->tiocmget(tty); BT_DBG("Current tiocm 0x%x", status); set &= ~(TIOCM_OUT2 | TIOCM_RTS); clear = ~set; set &= TIOCM_DTR | TIOCM_RTS | TIOCM_OUT1 | TIOCM_OUT2 | TIOCM_LOOP; clear &= TIOCM_DTR | TIOCM_RTS | TIOCM_OUT1 | TIOCM_OUT2 | TIOCM_LOOP; status = tty->driver->ops->tiocmset(tty, set, clear); BT_DBG("Clearing RTS: %s", status ? "failed" : "success"); } else { /* Set RTS to allow the device to send again */ status = tty->driver->ops->tiocmget(tty); BT_DBG("Current tiocm 0x%x", status); set |= (TIOCM_OUT2 | TIOCM_RTS); clear = ~set; set &= TIOCM_DTR | TIOCM_RTS | TIOCM_OUT1 | TIOCM_OUT2 | TIOCM_LOOP; clear &= TIOCM_DTR | TIOCM_RTS | TIOCM_OUT1 | TIOCM_OUT2 | TIOCM_LOOP; status = tty->driver->ops->tiocmset(tty, set, clear); BT_DBG("Setting RTS: %s", status ? "failed" : "success"); /* Re-enable hardware flow control */ ktermios = tty->termios; ktermios.c_cflag |= CRTSCTS; tty_set_termios(tty, &ktermios); BT_DBG("Enabling hardware flow control: %s", !(tty->termios.c_cflag & CRTSCTS) ? "failed" : "success"); } } void hci_uart_set_speeds(struct hci_uart *hu, unsigned int init_speed, unsigned int oper_speed) { hu->init_speed = init_speed; hu->oper_speed = oper_speed; } void hci_uart_set_baudrate(struct hci_uart *hu, unsigned int speed) { struct tty_struct *tty = hu->tty; struct ktermios ktermios; ktermios = tty->termios; ktermios.c_cflag &= ~CBAUD; tty_termios_encode_baud_rate(&ktermios, speed, speed); /* tty_set_termios() return not checked as it is always 0 */ tty_set_termios(tty, &ktermios); BT_DBG("%s: New tty speeds: %d/%d", hu->hdev->name, tty->termios.c_ispeed, tty->termios.c_ospeed); } static int hci_uart_setup(struct hci_dev *hdev) { struct hci_uart *hu = hci_get_drvdata(hdev); struct hci_rp_read_local_version *ver; struct sk_buff *skb; unsigned int speed; int err; /* Init speed if any */ if (hu->init_speed) speed = hu->init_speed; else if (hu->proto->init_speed) speed = hu->proto->init_speed; else speed = 0; if (speed) hci_uart_set_baudrate(hu, speed); /* Operational speed if any */ if (hu->oper_speed) speed = hu->oper_speed; else if (hu->proto->oper_speed) speed = hu->proto->oper_speed; else speed = 0; if (hu->proto->set_baudrate && speed) { err = hu->proto->set_baudrate(hu, speed); if (!err) hci_uart_set_baudrate(hu, speed); } if (hu->proto->setup) return hu->proto->setup(hu); if (!test_bit(HCI_UART_VND_DETECT, &hu->hdev_flags)) return 0; skb = __hci_cmd_sync(hdev, HCI_OP_READ_LOCAL_VERSION, 0, NULL, HCI_INIT_TIMEOUT); if (IS_ERR(skb)) { BT_ERR("%s: Reading local version information failed (%ld)", hdev->name, PTR_ERR(skb)); return 0; } if (skb->len != sizeof(*ver)) { BT_ERR("%s: Event length mismatch for version information", hdev->name); goto done; } ver = (struct hci_rp_read_local_version *)skb->data; switch (le16_to_cpu(ver->manufacturer)) { #ifdef CONFIG_BT_HCIUART_INTEL case 2: hdev->set_bdaddr = btintel_set_bdaddr; btintel_check_bdaddr(hdev); break; #endif #ifdef CONFIG_BT_HCIUART_BCM case 15: hdev->set_bdaddr = btbcm_set_bdaddr; btbcm_check_bdaddr(hdev); break; #endif default: break; } done: kfree_skb(skb); return 0; } /* ------ LDISC part ------ */ /* hci_uart_tty_open * * Called when line discipline changed to HCI_UART. 
* * Arguments: * tty pointer to tty info structure * Return Value: * 0 if success, otherwise error code */ static int hci_uart_tty_open(struct tty_struct *tty) { struct hci_uart *hu; BT_DBG("tty %p", tty); if (!capable(CAP_NET_ADMIN)) return -EPERM; /* Error if the tty has no write op instead of leaving an exploitable * hole */ if (tty->ops->write == NULL) return -EOPNOTSUPP; hu = kzalloc(sizeof(*hu), GFP_KERNEL); if (!hu) { BT_ERR("Can't allocate control structure"); return -ENFILE; } if (percpu_init_rwsem(&hu->proto_lock)) { BT_ERR("Can't allocate semaphore structure"); kfree(hu); return -ENOMEM; } tty->disc_data = hu; hu->tty = tty; tty->receive_room = 65536; /* disable alignment support by default */ hu->alignment = 1; hu->padding = 0; /* Use serial port speed as oper_speed */ hu->oper_speed = tty->termios.c_ospeed; INIT_WORK(&hu->init_ready, hci_uart_init_work); INIT_WORK(&hu->write_work, hci_uart_write_work); /* Flush any pending characters in the driver */ tty_driver_flush_buffer(tty); return 0; } /* hci_uart_tty_close() * * Called when the line discipline is changed to something * else, the tty is closed, or the tty detects a hangup. */ static void hci_uart_tty_close(struct tty_struct *tty) { struct hci_uart *hu = tty->disc_data; struct hci_dev *hdev; BT_DBG("tty %p", tty); /* Detach from the tty */ tty->disc_data = NULL; if (!hu) return; hdev = hu->hdev; if (hdev) hci_uart_close(hdev); if (test_bit(HCI_UART_PROTO_READY, &hu->flags)) { percpu_down_write(&hu->proto_lock); clear_bit(HCI_UART_PROTO_READY, &hu->flags); percpu_up_write(&hu->proto_lock); cancel_work_sync(&hu->init_ready); cancel_work_sync(&hu->write_work); if (hdev) { if (test_bit(HCI_UART_REGISTERED, &hu->flags)) hci_unregister_dev(hdev); hci_free_dev(hdev); } hu->proto->close(hu); } clear_bit(HCI_UART_PROTO_SET, &hu->flags); percpu_free_rwsem(&hu->proto_lock); kfree(hu); } /* hci_uart_tty_wakeup() * * Callback for transmit wakeup. Called when low level * device driver can accept more send data. * * Arguments: tty pointer to associated tty instance data * Return Value: None */ static void hci_uart_tty_wakeup(struct tty_struct *tty) { struct hci_uart *hu = tty->disc_data; BT_DBG(""); if (!hu) return; clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags); if (tty != hu->tty) return; if (test_bit(HCI_UART_PROTO_READY, &hu->flags) || test_bit(HCI_UART_PROTO_INIT, &hu->flags)) hci_uart_tx_wakeup(hu); } /* hci_uart_tty_receive() * * Called by tty low level driver when receive data is * available. 
* * Arguments: tty pointer to tty instance data * data pointer to received data * flags pointer to flags for data * count count of received data in bytes * * Return Value: None */ static void hci_uart_tty_receive(struct tty_struct *tty, const u8 *data, const u8 *flags, size_t count) { struct hci_uart *hu = tty->disc_data; if (!hu || tty != hu->tty) return; percpu_down_read(&hu->proto_lock); if (!test_bit(HCI_UART_PROTO_READY, &hu->flags) && !test_bit(HCI_UART_PROTO_INIT, &hu->flags)) { percpu_up_read(&hu->proto_lock); return; } /* It does not need a lock here as it is already protected by a mutex in * tty caller */ hu->proto->recv(hu, data, count); percpu_up_read(&hu->proto_lock); if (hu->hdev) hu->hdev->stat.byte_rx += count; tty_unthrottle(tty); } static int hci_uart_register_dev(struct hci_uart *hu) { struct hci_dev *hdev; int err; BT_DBG(""); /* Initialize and register HCI device */ hdev = hci_alloc_dev(); if (!hdev) { BT_ERR("Can't allocate HCI device"); return -ENOMEM; } hu->hdev = hdev; hdev->bus = HCI_UART; hci_set_drvdata(hdev, hu); /* Only when vendor specific setup callback is provided, consider * the manufacturer information valid. This avoids filling in the * value for Ericsson when nothing is specified. */ if (hu->proto->setup) hdev->manufacturer = hu->proto->manufacturer; hdev->open = hci_uart_open; hdev->close = hci_uart_close; hdev->flush = hci_uart_flush; hdev->send = hci_uart_send_frame; hdev->setup = hci_uart_setup; SET_HCIDEV_DEV(hdev, hu->tty->dev); if (test_bit(HCI_UART_RAW_DEVICE, &hu->hdev_flags)) hci_set_quirk(hdev, HCI_QUIRK_RAW_DEVICE); if (test_bit(HCI_UART_EXT_CONFIG, &hu->hdev_flags)) hci_set_quirk(hdev, HCI_QUIRK_EXTERNAL_CONFIG); if (!test_bit(HCI_UART_RESET_ON_INIT, &hu->hdev_flags)) hci_set_quirk(hdev, HCI_QUIRK_RESET_ON_CLOSE); /* Only call open() for the protocol after hdev is fully initialized as * open() (or a timer/workqueue it starts) may attempt to reference it. */ err = hu->proto->open(hu); if (err) { hu->hdev = NULL; hci_free_dev(hdev); return err; } if (test_bit(HCI_UART_INIT_PENDING, &hu->hdev_flags)) return 0; if (hci_register_dev(hdev) < 0) { BT_ERR("Can't register HCI device"); hu->proto->close(hu); hu->hdev = NULL; hci_free_dev(hdev); return -ENODEV; } set_bit(HCI_UART_REGISTERED, &hu->flags); return 0; } static int hci_uart_set_proto(struct hci_uart *hu, int id) { const struct hci_uart_proto *p; int err; p = hci_uart_get_proto(id); if (!p) return -EPROTONOSUPPORT; hu->proto = p; set_bit(HCI_UART_PROTO_INIT, &hu->flags); err = hci_uart_register_dev(hu); if (err) { return err; } set_bit(HCI_UART_PROTO_READY, &hu->flags); clear_bit(HCI_UART_PROTO_INIT, &hu->flags); return 0; } static int hci_uart_set_flags(struct hci_uart *hu, unsigned long flags) { unsigned long valid_flags = BIT(HCI_UART_RAW_DEVICE) | BIT(HCI_UART_RESET_ON_INIT) | BIT(HCI_UART_INIT_PENDING) | BIT(HCI_UART_EXT_CONFIG) | BIT(HCI_UART_VND_DETECT); if (flags & ~valid_flags) return -EINVAL; hu->hdev_flags = flags; return 0; } /* hci_uart_tty_ioctl() * * Process IOCTL system call for the tty device. 
* * Arguments: * * tty pointer to tty instance data * cmd IOCTL command code * arg argument for IOCTL call (cmd dependent) * * Return Value: Command dependent */ static int hci_uart_tty_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { struct hci_uart *hu = tty->disc_data; int err = 0; BT_DBG(""); /* Verify the status of the device */ if (!hu) return -EBADF; switch (cmd) { case HCIUARTSETPROTO: if (!test_and_set_bit(HCI_UART_PROTO_SET, &hu->flags)) { err = hci_uart_set_proto(hu, arg); if (err) clear_bit(HCI_UART_PROTO_SET, &hu->flags); } else err = -EBUSY; break; case HCIUARTGETPROTO: if (test_bit(HCI_UART_PROTO_SET, &hu->flags) && test_bit(HCI_UART_PROTO_READY, &hu->flags)) err = hu->proto->id; else err = -EUNATCH; break; case HCIUARTGETDEVICE: if (test_bit(HCI_UART_REGISTERED, &hu->flags)) err = hu->hdev->id; else err = -EUNATCH; break; case HCIUARTSETFLAGS: if (test_bit(HCI_UART_PROTO_SET, &hu->flags)) err = -EBUSY; else err = hci_uart_set_flags(hu, arg); break; case HCIUARTGETFLAGS: err = hu->hdev_flags; break; default: err = n_tty_ioctl_helper(tty, cmd, arg); break; } return err; } /* * We don't provide read/write/poll interface for user space. */ static ssize_t hci_uart_tty_read(struct tty_struct *tty, struct file *file, u8 *buf, size_t nr, void **cookie, unsigned long offset) { return 0; } static ssize_t hci_uart_tty_write(struct tty_struct *tty, struct file *file, const u8 *data, size_t count) { return 0; } static struct tty_ldisc_ops hci_uart_ldisc = { .owner = THIS_MODULE, .num = N_HCI, .name = "n_hci", .open = hci_uart_tty_open, .close = hci_uart_tty_close, .read = hci_uart_tty_read, .write = hci_uart_tty_write, .ioctl = hci_uart_tty_ioctl, .compat_ioctl = hci_uart_tty_ioctl, .receive_buf = hci_uart_tty_receive, .write_wakeup = hci_uart_tty_wakeup, }; static int __init hci_uart_init(void) { int err; BT_INFO("HCI UART driver ver %s", VERSION); /* Register the tty discipline */ err = tty_register_ldisc(&hci_uart_ldisc); if (err) { BT_ERR("HCI line discipline registration failed. (%d)", err); return err; } #ifdef CONFIG_BT_HCIUART_H4 h4_init(); #endif #ifdef CONFIG_BT_HCIUART_BCSP bcsp_init(); #endif #ifdef CONFIG_BT_HCIUART_LL ll_init(); #endif #ifdef CONFIG_BT_HCIUART_ATH3K ath_init(); #endif #ifdef CONFIG_BT_HCIUART_3WIRE h5_init(); #endif #ifdef CONFIG_BT_HCIUART_INTEL intel_init(); #endif #ifdef CONFIG_BT_HCIUART_BCM bcm_init(); #endif #ifdef CONFIG_BT_HCIUART_QCA qca_init(); #endif #ifdef CONFIG_BT_HCIUART_AG6XX ag6xx_init(); #endif #ifdef CONFIG_BT_HCIUART_MRVL mrvl_init(); #endif #ifdef CONFIG_BT_HCIUART_AML aml_init(); #endif return 0; } static void __exit hci_uart_exit(void) { #ifdef CONFIG_BT_HCIUART_H4 h4_deinit(); #endif #ifdef CONFIG_BT_HCIUART_BCSP bcsp_deinit(); #endif #ifdef CONFIG_BT_HCIUART_LL ll_deinit(); #endif #ifdef CONFIG_BT_HCIUART_ATH3K ath_deinit(); #endif #ifdef CONFIG_BT_HCIUART_3WIRE h5_deinit(); #endif #ifdef CONFIG_BT_HCIUART_INTEL intel_deinit(); #endif #ifdef CONFIG_BT_HCIUART_BCM bcm_deinit(); #endif #ifdef CONFIG_BT_HCIUART_QCA qca_deinit(); #endif #ifdef CONFIG_BT_HCIUART_AG6XX ag6xx_deinit(); #endif #ifdef CONFIG_BT_HCIUART_MRVL mrvl_deinit(); #endif #ifdef CONFIG_BT_HCIUART_AML aml_deinit(); #endif tty_unregister_ldisc(&hci_uart_ldisc); } module_init(hci_uart_init); module_exit(hci_uart_exit); MODULE_AUTHOR("Marcel Holtmann <marcel@holtmann.org>"); MODULE_DESCRIPTION("Bluetooth HCI UART driver ver " VERSION); MODULE_VERSION(VERSION); MODULE_LICENSE("GPL"); MODULE_ALIAS_LDISC(N_HCI); |
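/*
 * Hedged sketch (userspace): the registration scheme used by
 * hci_uart_register_proto() and hci_uart_get_proto() -- a fixed table indexed
 * by protocol id, where each slot may be claimed only once and lookups simply
 * bounds-check the id.  demo_proto, DEMO_MAX_PROTO and the reuse of errno
 * values are illustrative; the real table holds struct hci_uart_proto
 * pointers.
 */
#include <stdio.h>
#include <stddef.h>
#include <errno.h>

#define DEMO_MAX_PROTO 12

struct demo_proto {
	unsigned int id;
	const char *name;
};

static const struct demo_proto *protos[DEMO_MAX_PROTO];

static int demo_register_proto(const struct demo_proto *p)
{
	if (p->id >= DEMO_MAX_PROTO)
		return -EINVAL;		/* id outside the table */
	if (protos[p->id])
		return -EEXIST;		/* slot already claimed */
	protos[p->id] = p;
	return 0;
}

static const struct demo_proto *demo_get_proto(unsigned int id)
{
	return id < DEMO_MAX_PROTO ? protos[id] : NULL;
}

int main(void)
{
	static const struct demo_proto h4 = { .id = 0, .name = "h4" };

	printf("first register: %d\n", demo_register_proto(&h4));	/* 0 */
	printf("second register: %d\n", demo_register_proto(&h4));	/* -EEXIST */
	printf("lookup: %s\n", demo_get_proto(0)->name);		/* "h4" */
	return 0;
}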
| 1705 1686 10 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _NET_FLOW_DISSECTOR_H #define _NET_FLOW_DISSECTOR_H #include <linux/types.h> #include <linux/in6.h> #include <linux/siphash.h> #include <linux/string.h> #include <uapi/linux/if_ether.h> #include <uapi/linux/pkt_cls.h> struct bpf_prog; struct net; struct sk_buff; /** * struct flow_dissector_key_control: * @thoff: Transport header offset * @addr_type: Type of key. One of FLOW_DISSECTOR_KEY_* * @flags: Key flags. * Any of FLOW_DIS_(IS_FRAGMENT|FIRST_FRAG|ENCAPSULATION|F_*) */ struct flow_dissector_key_control { u16 thoff; u16 addr_type; u32 flags; }; /* The control flags are kept in sync with TCA_FLOWER_KEY_FLAGS_*, as those * flags are exposed to userspace in some error paths, ie. unsupported flags. */ enum flow_dissector_ctrl_flags { FLOW_DIS_IS_FRAGMENT = TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT, FLOW_DIS_FIRST_FRAG = TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST, FLOW_DIS_F_TUNNEL_CSUM = TCA_FLOWER_KEY_FLAGS_TUNNEL_CSUM, FLOW_DIS_F_TUNNEL_DONT_FRAGMENT = TCA_FLOWER_KEY_FLAGS_TUNNEL_DONT_FRAGMENT, FLOW_DIS_F_TUNNEL_OAM = TCA_FLOWER_KEY_FLAGS_TUNNEL_OAM, FLOW_DIS_F_TUNNEL_CRIT_OPT = TCA_FLOWER_KEY_FLAGS_TUNNEL_CRIT_OPT, /* These flags are internal to the kernel */ FLOW_DIS_ENCAPSULATION = (TCA_FLOWER_KEY_FLAGS_MAX << 1), }; enum flow_dissect_ret { FLOW_DISSECT_RET_OUT_GOOD, FLOW_DISSECT_RET_OUT_BAD, FLOW_DISSECT_RET_PROTO_AGAIN, FLOW_DISSECT_RET_IPPROTO_AGAIN, FLOW_DISSECT_RET_CONTINUE, }; /** * struct flow_dissector_key_basic: * @n_proto: Network header protocol (eg. IPv4/IPv6) * @ip_proto: Transport header protocol (eg. 
TCP/UDP) * @padding: Unused */ struct flow_dissector_key_basic { __be16 n_proto; u8 ip_proto; u8 padding; }; struct flow_dissector_key_tags { u32 flow_label; }; struct flow_dissector_key_vlan { union { struct { u16 vlan_id:12, vlan_dei:1, vlan_priority:3; }; __be16 vlan_tci; }; __be16 vlan_tpid; __be16 vlan_eth_type; u16 padding; }; struct flow_dissector_mpls_lse { u32 mpls_ttl:8, mpls_bos:1, mpls_tc:3, mpls_label:20; }; #define FLOW_DIS_MPLS_MAX 7 struct flow_dissector_key_mpls { struct flow_dissector_mpls_lse ls[FLOW_DIS_MPLS_MAX]; /* Label Stack */ u8 used_lses; /* One bit set for each Label Stack Entry in use */ }; static inline void dissector_set_mpls_lse(struct flow_dissector_key_mpls *mpls, int lse_index) { mpls->used_lses |= 1 << lse_index; } #define FLOW_DIS_TUN_OPTS_MAX 255 /** * struct flow_dissector_key_enc_opts: * @data: tunnel option data * @len: length of tunnel option data * @dst_opt_type: tunnel option type */ struct flow_dissector_key_enc_opts { u8 data[FLOW_DIS_TUN_OPTS_MAX]; /* Using IP_TUNNEL_OPTS_MAX is desired * here but seems difficult to #include */ u8 len; u32 dst_opt_type; }; struct flow_dissector_key_keyid { __be32 keyid; }; /** * struct flow_dissector_key_ipv4_addrs: * @src: source ip address * @dst: destination ip address */ struct flow_dissector_key_ipv4_addrs { /* (src,dst) must be grouped, in the same way than in IP header */ __be32 src; __be32 dst; }; /** * struct flow_dissector_key_ipv6_addrs: * @src: source ip address * @dst: destination ip address */ struct flow_dissector_key_ipv6_addrs { /* (src,dst) must be grouped, in the same way than in IP header */ struct in6_addr src; struct in6_addr dst; }; /** * struct flow_dissector_key_tipc: * @key: source node address combined with selector */ struct flow_dissector_key_tipc { __be32 key; }; /** * struct flow_dissector_key_addrs: * @v4addrs: IPv4 addresses * @v6addrs: IPv6 addresses * @tipckey: TIPC key */ struct flow_dissector_key_addrs { union { struct flow_dissector_key_ipv4_addrs v4addrs; struct flow_dissector_key_ipv6_addrs v6addrs; struct flow_dissector_key_tipc tipckey; }; }; /** * struct flow_dissector_key_arp: * @sip: Sender IP address * @tip: Target IP address * @op: Operation * @sha: Sender hardware address * @tha: Target hardware address */ struct flow_dissector_key_arp { __u32 sip; __u32 tip; __u8 op; unsigned char sha[ETH_ALEN]; unsigned char tha[ETH_ALEN]; }; /** * struct flow_dissector_key_ports: * @ports: port numbers of Transport header * @src: source port number * @dst: destination port number */ struct flow_dissector_key_ports { union { __be32 ports; struct { __be16 src; __be16 dst; }; }; }; /** * struct flow_dissector_key_ports_range * @tp: port number from packet * @tp_min: min port number in range * @tp_max: max port number in range */ struct flow_dissector_key_ports_range { union { struct flow_dissector_key_ports tp; struct { struct flow_dissector_key_ports tp_min; struct flow_dissector_key_ports tp_max; }; }; }; /** * struct flow_dissector_key_icmp: * @type: ICMP type * @code: ICMP code * @id: Session identifier */ struct flow_dissector_key_icmp { struct { u8 type; u8 code; }; u16 id; }; /** * struct flow_dissector_key_eth_addrs: * @src: source Ethernet address * @dst: destination Ethernet address */ struct flow_dissector_key_eth_addrs { /* (dst,src) must be grouped, in the same way than in ETH header */ unsigned char dst[ETH_ALEN]; unsigned char src[ETH_ALEN]; }; /** * struct flow_dissector_key_tcp: * @flags: flags */ struct flow_dissector_key_tcp { __be16 flags; }; /** * struct 
flow_dissector_key_ip: * @tos: tos * @ttl: ttl */ struct flow_dissector_key_ip { __u8 tos; __u8 ttl; }; /** * struct flow_dissector_key_meta: * @ingress_ifindex: ingress ifindex * @ingress_iftype: ingress interface type * @l2_miss: packet did not match an L2 entry during forwarding */ struct flow_dissector_key_meta { int ingress_ifindex; u16 ingress_iftype; u8 l2_miss; }; /** * struct flow_dissector_key_ct: * @ct_state: conntrack state after converting with map * @ct_mark: conttrack mark * @ct_zone: conntrack zone * @ct_labels: conntrack labels */ struct flow_dissector_key_ct { u16 ct_state; u16 ct_zone; u32 ct_mark; u32 ct_labels[4]; }; /** * struct flow_dissector_key_hash: * @hash: hash value */ struct flow_dissector_key_hash { u32 hash; }; /** * struct flow_dissector_key_num_of_vlans: * @num_of_vlans: num_of_vlans value */ struct flow_dissector_key_num_of_vlans { u8 num_of_vlans; }; /** * struct flow_dissector_key_pppoe: * @session_id: pppoe session id * @ppp_proto: ppp protocol * @type: pppoe eth type */ struct flow_dissector_key_pppoe { __be16 session_id; __be16 ppp_proto; __be16 type; }; /** * struct flow_dissector_key_l2tpv3: * @session_id: identifier for a l2tp session */ struct flow_dissector_key_l2tpv3 { __be32 session_id; }; /** * struct flow_dissector_key_ipsec: * @spi: identifier for a ipsec connection */ struct flow_dissector_key_ipsec { __be32 spi; }; /** * struct flow_dissector_key_cfm * @mdl_ver: maintenance domain level (mdl) and cfm protocol version * @opcode: code specifying a type of cfm protocol packet * * See 802.1ag, ITU-T G.8013/Y.1731 * 1 2 * |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * | mdl | version | opcode | * +-----+---------+-+-+-+-+-+-+-+-+ */ struct flow_dissector_key_cfm { u8 mdl_ver; u8 opcode; }; #define FLOW_DIS_CFM_MDL_MASK GENMASK(7, 5) #define FLOW_DIS_CFM_MDL_MAX 7 enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */ FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */ FLOW_DISSECTOR_KEY_IPV4_ADDRS, /* struct flow_dissector_key_ipv4_addrs */ FLOW_DISSECTOR_KEY_IPV6_ADDRS, /* struct flow_dissector_key_ipv6_addrs */ FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */ FLOW_DISSECTOR_KEY_PORTS_RANGE, /* struct flow_dissector_key_ports */ FLOW_DISSECTOR_KEY_ICMP, /* struct flow_dissector_key_icmp */ FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */ FLOW_DISSECTOR_KEY_TIPC, /* struct flow_dissector_key_tipc */ FLOW_DISSECTOR_KEY_ARP, /* struct flow_dissector_key_arp */ FLOW_DISSECTOR_KEY_VLAN, /* struct flow_dissector_key_vlan */ FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_tags */ FLOW_DISSECTOR_KEY_GRE_KEYID, /* struct flow_dissector_key_keyid */ FLOW_DISSECTOR_KEY_MPLS_ENTROPY, /* struct flow_dissector_key_keyid */ FLOW_DISSECTOR_KEY_ENC_KEYID, /* struct flow_dissector_key_keyid */ FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS, /* struct flow_dissector_key_ipv4_addrs */ FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS, /* struct flow_dissector_key_ipv6_addrs */ FLOW_DISSECTOR_KEY_ENC_CONTROL, /* struct flow_dissector_key_control */ FLOW_DISSECTOR_KEY_ENC_PORTS, /* struct flow_dissector_key_ports */ FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */ FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */ FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */ FLOW_DISSECTOR_KEY_CVLAN, /* struct flow_dissector_key_vlan */ FLOW_DISSECTOR_KEY_ENC_IP, /* struct flow_dissector_key_ip */ FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct 
flow_dissector_key_enc_opts */ FLOW_DISSECTOR_KEY_META, /* struct flow_dissector_key_meta */ FLOW_DISSECTOR_KEY_CT, /* struct flow_dissector_key_ct */ FLOW_DISSECTOR_KEY_HASH, /* struct flow_dissector_key_hash */ FLOW_DISSECTOR_KEY_NUM_OF_VLANS, /* struct flow_dissector_key_num_of_vlans */ FLOW_DISSECTOR_KEY_PPPOE, /* struct flow_dissector_key_pppoe */ FLOW_DISSECTOR_KEY_L2TPV3, /* struct flow_dissector_key_l2tpv3 */ FLOW_DISSECTOR_KEY_CFM, /* struct flow_dissector_key_cfm */ FLOW_DISSECTOR_KEY_IPSEC, /* struct flow_dissector_key_ipsec */ FLOW_DISSECTOR_KEY_MAX, }; #define FLOW_DISSECTOR_F_PARSE_1ST_FRAG BIT(0) #define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL BIT(1) #define FLOW_DISSECTOR_F_STOP_AT_ENCAP BIT(2) #define FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP BIT(3) struct flow_dissector_key { enum flow_dissector_key_id key_id; size_t offset; /* offset of struct flow_dissector_key_* in target the struct */ }; struct flow_dissector { unsigned long long used_keys; /* each bit represents presence of one key id */ unsigned short int offset[FLOW_DISSECTOR_KEY_MAX]; }; struct flow_keys_basic { struct flow_dissector_key_control control; struct flow_dissector_key_basic basic; }; struct flow_keys { struct flow_dissector_key_control control; #define FLOW_KEYS_HASH_START_FIELD basic struct flow_dissector_key_basic basic __aligned(SIPHASH_ALIGNMENT); struct flow_dissector_key_tags tags; struct flow_dissector_key_vlan vlan; struct flow_dissector_key_vlan cvlan; struct flow_dissector_key_keyid keyid; struct flow_dissector_key_ports ports; struct flow_dissector_key_icmp icmp; /* 'addrs' must be the last member */ struct flow_dissector_key_addrs addrs; }; #define FLOW_KEYS_HASH_OFFSET \ offsetof(struct flow_keys, FLOW_KEYS_HASH_START_FIELD) __be32 flow_get_u32_src(const struct flow_keys *flow); __be32 flow_get_u32_dst(const struct flow_keys *flow); extern struct flow_dissector flow_keys_dissector; extern struct flow_dissector flow_keys_basic_dissector; /* struct flow_keys_digest: * * This structure is used to hold a digest of the full flow keys. This is a * larger "hash" of a flow to allow definitively matching specific flows where * the 32 bit skb->hash is not large enough. The size is limited to 16 bytes so * that it can be used in CB of skb (see sch_choke for an example). 
*/ #define FLOW_KEYS_DIGEST_LEN 16 struct flow_keys_digest { u8 data[FLOW_KEYS_DIGEST_LEN]; }; void make_flow_keys_digest(struct flow_keys_digest *digest, const struct flow_keys *flow); static inline bool flow_keys_have_l4(const struct flow_keys *keys) { return (keys->ports.ports || keys->tags.flow_label); } u32 flow_hash_from_keys(struct flow_keys *keys); u32 flow_hash_from_keys_seed(struct flow_keys *keys, const siphash_key_t *keyval); void skb_flow_get_icmp_tci(const struct sk_buff *skb, struct flow_dissector_key_icmp *key_icmp, const void *data, int thoff, int hlen); static inline bool dissector_uses_key(const struct flow_dissector *flow_dissector, enum flow_dissector_key_id key_id) { return flow_dissector->used_keys & (1ULL << key_id); } static inline void *skb_flow_dissector_target(struct flow_dissector *flow_dissector, enum flow_dissector_key_id key_id, void *target_container) { return ((char *)target_container) + flow_dissector->offset[key_id]; } struct bpf_flow_dissector { struct bpf_flow_keys *flow_keys; const struct sk_buff *skb; const void *data; const void *data_end; }; static inline void flow_dissector_init_keys(struct flow_dissector_key_control *key_control, struct flow_dissector_key_basic *key_basic) { memset(key_control, 0, sizeof(*key_control)); memset(key_basic, 0, sizeof(*key_basic)); } #ifdef CONFIG_BPF_SYSCALL int flow_dissector_bpf_prog_attach_check(struct net *net, struct bpf_prog *prog); #endif /* CONFIG_BPF_SYSCALL */ #endif |
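/*
 * Hedged sketch (userspace): how dissector_uses_key() and
 * skb_flow_dissector_target() cooperate -- a 64-bit used_keys bitmask records
 * which key ids were dissected, and a per-id offset table says where each key
 * lives inside the caller's target container.  The demo_* names and the
 * two-key layout below are illustrative only.
 */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

enum demo_key_id { DEMO_KEY_BASIC, DEMO_KEY_PORTS, DEMO_KEY_MAX };

struct demo_key_basic { uint16_t n_proto; uint8_t ip_proto; };
struct demo_key_ports { uint16_t src, dst; };

struct demo_container {
	struct demo_key_basic basic;
	struct demo_key_ports ports;
};

struct demo_dissector {
	unsigned long long used_keys;		/* one bit per key id */
	unsigned short offset[DEMO_KEY_MAX];	/* offset into the container */
};

static int demo_uses_key(const struct demo_dissector *d, enum demo_key_id id)
{
	return !!(d->used_keys & (1ULL << id));
}

static void *demo_target(const struct demo_dissector *d, enum demo_key_id id,
			 void *container)
{
	return (char *)container + d->offset[id];
}

int main(void)
{
	struct demo_dissector d = {
		.used_keys = (1ULL << DEMO_KEY_BASIC) | (1ULL << DEMO_KEY_PORTS),
		.offset = {
			[DEMO_KEY_BASIC] = offsetof(struct demo_container, basic),
			[DEMO_KEY_PORTS] = offsetof(struct demo_container, ports),
		},
	};
	struct demo_container c = { .ports = { .src = 12345, .dst = 80 } };

	if (demo_uses_key(&d, DEMO_KEY_PORTS)) {
		struct demo_key_ports *p = demo_target(&d, DEMO_KEY_PORTS, &c);

		printf("ports %u -> %u\n", p->src, p->dst);
	}
	return 0;
}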
// SPDX-License-Identifier: GPL-2.0+ /* * Comedi driver for National Instruments AT-A2150 boards * Copyright (C) 2001, 2002 Frank Mori Hess <fmhess@users.sourceforge.net> * * COMEDI - Linux Control and Measurement Device Interface * Copyright (C) 2000 David A.
Schleef <ds@schleef.org> */ /* * Driver: ni_at_a2150 * Description: National Instruments AT-A2150 * Author: Frank Mori Hess * Status: works * Devices: [National Instruments] AT-A2150C (at_a2150c), AT-2150S (at_a2150s) * * Configuration options: * [0] - I/O port base address * [1] - IRQ (optional, required for timed conversions) * [2] - DMA (optional, required for timed conversions) * * Yet another driver for obsolete hardware brought to you by Frank Hess. * Testing and debugging help provided by Dave Andruczyk. * * If you want to ac couple the board's inputs, use AREF_OTHER. * * The only difference in the boards is their master clock frequencies. * * References (from ftp://ftp.natinst.com/support/manuals): * 320360.pdf AT-A2150 User Manual * * TODO: * - analog level triggering * - TRIG_WAKE_EOS */ #include <linux/module.h> #include <linux/delay.h> #include <linux/interrupt.h> #include <linux/slab.h> #include <linux/io.h> #include <linux/comedi/comedidev.h> #include <linux/comedi/comedi_8254.h> #include <linux/comedi/comedi_isadma.h> #define A2150_DMA_BUFFER_SIZE 0xff00 /* size in bytes of dma buffer */ /* Registers and bits */ #define CONFIG_REG 0x0 #define CHANNEL_BITS(x) ((x) & 0x7) #define CHANNEL_MASK 0x7 #define CLOCK_SELECT_BITS(x) (((x) & 0x3) << 3) #define CLOCK_DIVISOR_BITS(x) (((x) & 0x3) << 5) #define CLOCK_MASK (0xf << 3) /* enable (don't internally ground) channels 0 and 1 */ #define ENABLE0_BIT 0x80 /* enable (don't internally ground) channels 2 and 3 */ #define ENABLE1_BIT 0x100 #define AC0_BIT 0x200 /* ac couple channels 0,1 */ #define AC1_BIT 0x400 /* ac couple channels 2,3 */ #define APD_BIT 0x800 /* analog power down */ #define DPD_BIT 0x1000 /* digital power down */ #define TRIGGER_REG 0x2 /* trigger config register */ #define POST_TRIGGER_BITS 0x2 #define DELAY_TRIGGER_BITS 0x3 #define HW_TRIG_EN 0x10 /* enable hardware trigger */ #define FIFO_START_REG 0x6 /* software start aquistion trigger */ #define FIFO_RESET_REG 0x8 /* clears fifo + fifo flags */ #define FIFO_DATA_REG 0xa /* read data */ #define DMA_TC_CLEAR_REG 0xe /* clear dma terminal count interrupt */ #define STATUS_REG 0x12 /* read only */ #define FNE_BIT 0x1 /* fifo not empty */ #define OVFL_BIT 0x8 /* fifo overflow */ #define EDAQ_BIT 0x10 /* end of acquisition interrupt */ #define DCAL_BIT 0x20 /* offset calibration in progress */ #define INTR_BIT 0x40 /* interrupt has occurred */ /* dma terminal count interrupt has occurred */ #define DMA_TC_BIT 0x80 #define ID_BITS(x) (((x) >> 8) & 0x3) #define IRQ_DMA_CNTRL_REG 0x12 /* write only */ #define DMA_CHAN_BITS(x) ((x) & 0x7) /* sets dma channel */ #define DMA_EN_BIT 0x8 /* enables dma */ #define IRQ_LVL_BITS(x) (((x) & 0xf) << 4) /* sets irq level */ #define FIFO_INTR_EN_BIT 0x100 /* enable fifo interrupts */ #define FIFO_INTR_FHF_BIT 0x200 /* interrupt fifo half full */ /* enable interrupt on dma terminal count */ #define DMA_INTR_EN_BIT 0x800 #define DMA_DEM_EN_BIT 0x1000 /* enables demand mode dma */ #define I8253_BASE_REG 0x14 struct a2150_board { const char *name; int clock[4]; /* master clock periods, in nanoseconds */ int num_clocks; /* number of available master clock speeds */ int ai_speed; /* maximum conversion rate in nanoseconds */ }; /* analog input range */ static const struct comedi_lrange range_a2150 = { 1, { BIP_RANGE(2.828) } }; /* enum must match board indices */ enum { a2150_c, a2150_s }; static const struct a2150_board a2150_boards[] = { { .name = "at-a2150c", .clock = {31250, 22676, 20833, 19531}, .num_clocks = 4, .ai_speed = 19531, }, 
{ .name = "at-a2150s", .clock = {62500, 50000, 41667, 0}, .num_clocks = 3, .ai_speed = 41667, }, }; struct a2150_private { struct comedi_isadma *dma; unsigned int count; /* number of data points left to be taken */ int irq_dma_bits; /* irq/dma register bits */ int config_bits; /* config register bits */ }; /* interrupt service routine */ static irqreturn_t a2150_interrupt(int irq, void *d) { struct comedi_device *dev = d; struct a2150_private *devpriv = dev->private; struct comedi_isadma *dma = devpriv->dma; struct comedi_isadma_desc *desc = &dma->desc[0]; struct comedi_subdevice *s = dev->read_subdev; struct comedi_async *async = s->async; struct comedi_cmd *cmd = &async->cmd; unsigned short *buf = desc->virt_addr; unsigned int max_points, num_points, residue, leftover; unsigned short dpnt; int status; int i; if (!dev->attached) return IRQ_HANDLED; status = inw(dev->iobase + STATUS_REG); if ((status & INTR_BIT) == 0) return IRQ_NONE; if (status & OVFL_BIT) { async->events |= COMEDI_CB_ERROR; comedi_handle_events(dev, s); } if ((status & DMA_TC_BIT) == 0) { async->events |= COMEDI_CB_ERROR; comedi_handle_events(dev, s); return IRQ_HANDLED; } /* * residue is the number of bytes left to be done on the dma * transfer. It should always be zero at this point unless * the stop_src is set to external triggering. */ residue = comedi_isadma_disable(desc->chan); /* figure out how many points to read */ max_points = comedi_bytes_to_samples(s, desc->size); num_points = max_points - comedi_bytes_to_samples(s, residue); if (devpriv->count < num_points && cmd->stop_src == TRIG_COUNT) num_points = devpriv->count; /* figure out how many points will be stored next time */ leftover = 0; if (cmd->stop_src == TRIG_NONE) { leftover = comedi_bytes_to_samples(s, desc->size); } else if (devpriv->count > max_points) { leftover = devpriv->count - max_points; if (leftover > max_points) leftover = max_points; } /* * There should only be a residue if collection was stopped by having * the stop_src set to an external trigger, in which case there * will be no more data */ if (residue) leftover = 0; for (i = 0; i < num_points; i++) { /* write data point to comedi buffer */ dpnt = buf[i]; /* convert from 2's complement to unsigned coding */ dpnt ^= 0x8000; comedi_buf_write_samples(s, &dpnt, 1); if (cmd->stop_src == TRIG_COUNT) { if (--devpriv->count == 0) { /* end of acquisition */ async->events |= COMEDI_CB_EOA; break; } } } /* re-enable dma */ if (leftover) { desc->size = comedi_samples_to_bytes(s, leftover); comedi_isadma_program(desc); } comedi_handle_events(dev, s); /* clear interrupt */ outw(0x00, dev->iobase + DMA_TC_CLEAR_REG); return IRQ_HANDLED; } static int a2150_cancel(struct comedi_device *dev, struct comedi_subdevice *s) { struct a2150_private *devpriv = dev->private; struct comedi_isadma *dma = devpriv->dma; struct comedi_isadma_desc *desc = &dma->desc[0]; /* disable dma on card */ devpriv->irq_dma_bits &= ~DMA_INTR_EN_BIT & ~DMA_EN_BIT; outw(devpriv->irq_dma_bits, dev->iobase + IRQ_DMA_CNTRL_REG); /* disable computer's dma */ comedi_isadma_disable(desc->chan); /* clear fifo and reset triggering circuitry */ outw(0, dev->iobase + FIFO_RESET_REG); return 0; } /* * sets bits in devpriv->clock_bits to nearest approximation of requested * period, adjusts requested period to actual timing. 
*/ static int a2150_get_timing(struct comedi_device *dev, unsigned int *period, unsigned int flags) { const struct a2150_board *board = dev->board_ptr; struct a2150_private *devpriv = dev->private; int lub, glb, temp; int lub_divisor_shift, lub_index, glb_divisor_shift, glb_index; int i, j; /* initialize greatest lower and least upper bounds */ lub_divisor_shift = 3; lub_index = 0; lub = board->clock[lub_index] * (1 << lub_divisor_shift); glb_divisor_shift = 0; glb_index = board->num_clocks - 1; glb = board->clock[glb_index] * (1 << glb_divisor_shift); /* make sure period is in available range */ if (*period < glb) *period = glb; if (*period > lub) *period = lub; /* we can multiply period by 1, 2, 4, or 8, using (1 << i) */ for (i = 0; i < 4; i++) { /* there are a maximum of 4 master clocks */ for (j = 0; j < board->num_clocks; j++) { /* temp is the period in nanosec we are evaluating */ temp = board->clock[j] * (1 << i); /* if it is the best match yet */ if (temp < lub && temp >= *period) { lub_divisor_shift = i; lub_index = j; lub = temp; } if (temp > glb && temp <= *period) { glb_divisor_shift = i; glb_index = j; glb = temp; } } } switch (flags & CMDF_ROUND_MASK) { case CMDF_ROUND_NEAREST: default: /* if least upper bound is better approximation */ if (lub - *period < *period - glb) *period = lub; else *period = glb; break; case CMDF_ROUND_UP: *period = lub; break; case CMDF_ROUND_DOWN: *period = glb; break; } /* set clock bits for config register appropriately */ devpriv->config_bits &= ~CLOCK_MASK; if (*period == lub) { devpriv->config_bits |= CLOCK_SELECT_BITS(lub_index) | CLOCK_DIVISOR_BITS(lub_divisor_shift); } else { devpriv->config_bits |= CLOCK_SELECT_BITS(glb_index) | CLOCK_DIVISOR_BITS(glb_divisor_shift); } return 0; } static int a2150_set_chanlist(struct comedi_device *dev, unsigned int start_channel, unsigned int num_channels) { struct a2150_private *devpriv = dev->private; if (start_channel + num_channels > 4) return -1; devpriv->config_bits &= ~CHANNEL_MASK; switch (num_channels) { case 1: devpriv->config_bits |= CHANNEL_BITS(0x4 | start_channel); break; case 2: if (start_channel == 0) devpriv->config_bits |= CHANNEL_BITS(0x2); else if (start_channel == 2) devpriv->config_bits |= CHANNEL_BITS(0x3); else return -1; break; case 4: devpriv->config_bits |= CHANNEL_BITS(0x1); break; default: return -1; } return 0; } static int a2150_ai_check_chanlist(struct comedi_device *dev, struct comedi_subdevice *s, struct comedi_cmd *cmd) { unsigned int chan0 = CR_CHAN(cmd->chanlist[0]); unsigned int aref0 = CR_AREF(cmd->chanlist[0]); int i; if (cmd->chanlist_len == 2 && (chan0 == 1 || chan0 == 3)) { dev_dbg(dev->class_dev, "length 2 chanlist must be channels 0,1 or channels 2,3\n"); return -EINVAL; } if (cmd->chanlist_len == 3) { dev_dbg(dev->class_dev, "chanlist must have 1,2 or 4 channels\n"); return -EINVAL; } for (i = 1; i < cmd->chanlist_len; i++) { unsigned int chan = CR_CHAN(cmd->chanlist[i]); unsigned int aref = CR_AREF(cmd->chanlist[i]); if (chan != (chan0 + i)) { dev_dbg(dev->class_dev, "entries in chanlist must be consecutive channels, counting upwards\n"); return -EINVAL; } if (chan == 2) aref0 = aref; if (aref != aref0) { dev_dbg(dev->class_dev, "channels 0/1 and 2/3 must have the same analog reference\n"); return -EINVAL; } } return 0; } static int a2150_ai_cmdtest(struct comedi_device *dev, struct comedi_subdevice *s, struct comedi_cmd *cmd) { const struct a2150_board *board = dev->board_ptr; int err = 0; unsigned int arg; /* Step 1 : check if triggers are trivially valid 
*/ err |= comedi_check_trigger_src(&cmd->start_src, TRIG_NOW | TRIG_EXT); err |= comedi_check_trigger_src(&cmd->scan_begin_src, TRIG_TIMER); err |= comedi_check_trigger_src(&cmd->convert_src, TRIG_NOW); err |= comedi_check_trigger_src(&cmd->scan_end_src, TRIG_COUNT); err |= comedi_check_trigger_src(&cmd->stop_src, TRIG_COUNT | TRIG_NONE); if (err) return 1; /* Step 2a : make sure trigger sources are unique */ err |= comedi_check_trigger_is_unique(cmd->start_src); err |= comedi_check_trigger_is_unique(cmd->stop_src); /* Step 2b : and mutually compatible */ if (err) return 2; /* Step 3: check if arguments are trivially valid */ err |= comedi_check_trigger_arg_is(&cmd->start_arg, 0); if (cmd->convert_src == TRIG_TIMER) { err |= comedi_check_trigger_arg_min(&cmd->convert_arg, board->ai_speed); } err |= comedi_check_trigger_arg_min(&cmd->chanlist_len, 1); err |= comedi_check_trigger_arg_is(&cmd->scan_end_arg, cmd->chanlist_len); if (cmd->stop_src == TRIG_COUNT) err |= comedi_check_trigger_arg_min(&cmd->stop_arg, 1); else /* TRIG_NONE */ err |= comedi_check_trigger_arg_is(&cmd->stop_arg, 0); if (err) return 3; /* step 4: fix up any arguments */ if (cmd->scan_begin_src == TRIG_TIMER) { arg = cmd->scan_begin_arg; a2150_get_timing(dev, &arg, cmd->flags); err |= comedi_check_trigger_arg_is(&cmd->scan_begin_arg, arg); } if (err) return 4; /* Step 5: check channel list if it exists */ if (cmd->chanlist && cmd->chanlist_len > 0) err |= a2150_ai_check_chanlist(dev, s, cmd); if (err) return 5; return 0; } static int a2150_ai_cmd(struct comedi_device *dev, struct comedi_subdevice *s) { struct a2150_private *devpriv = dev->private; struct comedi_isadma *dma = devpriv->dma; struct comedi_isadma_desc *desc = &dma->desc[0]; struct comedi_async *async = s->async; struct comedi_cmd *cmd = &async->cmd; unsigned int old_config_bits = devpriv->config_bits; unsigned int trigger_bits; if (cmd->flags & CMDF_PRIORITY) { dev_err(dev->class_dev, "dma incompatible with hard real-time interrupt (CMDF_PRIORITY), aborting\n"); return -1; } /* clear fifo and reset triggering circuitry */ outw(0, dev->iobase + FIFO_RESET_REG); /* setup chanlist */ if (a2150_set_chanlist(dev, CR_CHAN(cmd->chanlist[0]), cmd->chanlist_len) < 0) return -1; /* setup ac/dc coupling */ if (CR_AREF(cmd->chanlist[0]) == AREF_OTHER) devpriv->config_bits |= AC0_BIT; else devpriv->config_bits &= ~AC0_BIT; if (CR_AREF(cmd->chanlist[2]) == AREF_OTHER) devpriv->config_bits |= AC1_BIT; else devpriv->config_bits &= ~AC1_BIT; /* setup timing */ a2150_get_timing(dev, &cmd->scan_begin_arg, cmd->flags); /* send timing, channel, config bits */ outw(devpriv->config_bits, dev->iobase + CONFIG_REG); /* initialize number of samples remaining */ devpriv->count = cmd->stop_arg * cmd->chanlist_len; comedi_isadma_disable(desc->chan); /* set size of transfer to fill in 1/3 second */ #define ONE_THIRD_SECOND 333333333 desc->size = comedi_bytes_per_sample(s) * cmd->chanlist_len * ONE_THIRD_SECOND / cmd->scan_begin_arg; if (desc->size > desc->maxsize) desc->size = desc->maxsize; if (desc->size < comedi_bytes_per_sample(s)) desc->size = comedi_bytes_per_sample(s); desc->size -= desc->size % comedi_bytes_per_sample(s); comedi_isadma_program(desc); /* * Clear dma interrupt before enabling it, to try and get rid of * that one spurious interrupt that has been happening. 
*/ outw(0x00, dev->iobase + DMA_TC_CLEAR_REG); /* enable dma on card */ devpriv->irq_dma_bits |= DMA_INTR_EN_BIT | DMA_EN_BIT; outw(devpriv->irq_dma_bits, dev->iobase + IRQ_DMA_CNTRL_REG); /* may need to wait 72 sampling periods if timing was changed */ comedi_8254_load(dev->pacer, 2, 72, I8254_MODE0 | I8254_BINARY); /* setup start triggering */ trigger_bits = 0; /* decide if we need to wait 72 periods for valid data */ if (cmd->start_src == TRIG_NOW && (old_config_bits & CLOCK_MASK) != (devpriv->config_bits & CLOCK_MASK)) { /* set trigger source to delay trigger */ trigger_bits |= DELAY_TRIGGER_BITS; } else { /* otherwise no delay */ trigger_bits |= POST_TRIGGER_BITS; } /* enable external hardware trigger */ if (cmd->start_src == TRIG_EXT) { trigger_bits |= HW_TRIG_EN; } else if (cmd->start_src == TRIG_OTHER) { /* * XXX add support for level/slope start trigger * using TRIG_OTHER */ dev_err(dev->class_dev, "you shouldn't see this?\n"); } /* send trigger config bits */ outw(trigger_bits, dev->iobase + TRIGGER_REG); /* start acquisition for soft trigger */ if (cmd->start_src == TRIG_NOW) outw(0, dev->iobase + FIFO_START_REG); return 0; } static int a2150_ai_eoc(struct comedi_device *dev, struct comedi_subdevice *s, struct comedi_insn *insn, unsigned long context) { unsigned int status; status = inw(dev->iobase + STATUS_REG); if (status & FNE_BIT) return 0; return -EBUSY; } static int a2150_ai_rinsn(struct comedi_device *dev, struct comedi_subdevice *s, struct comedi_insn *insn, unsigned int *data) { struct a2150_private *devpriv = dev->private; unsigned int n; int ret; /* clear fifo and reset triggering circuitry */ outw(0, dev->iobase + FIFO_RESET_REG); /* setup chanlist */ if (a2150_set_chanlist(dev, CR_CHAN(insn->chanspec), 1) < 0) return -1; /* set dc coupling */ devpriv->config_bits &= ~AC0_BIT; devpriv->config_bits &= ~AC1_BIT; /* send timing, channel, config bits */ outw(devpriv->config_bits, dev->iobase + CONFIG_REG); /* disable dma on card */ devpriv->irq_dma_bits &= ~DMA_INTR_EN_BIT & ~DMA_EN_BIT; outw(devpriv->irq_dma_bits, dev->iobase + IRQ_DMA_CNTRL_REG); /* setup start triggering */ outw(0, dev->iobase + TRIGGER_REG); /* start acquisition for soft trigger */ outw(0, dev->iobase + FIFO_START_REG); /* * there is a 35.6 sample delay for data to get through the * antialias filter */ for (n = 0; n < 36; n++) { ret = comedi_timeout(dev, s, insn, a2150_ai_eoc, 0); if (ret) return ret; inw(dev->iobase + FIFO_DATA_REG); } /* read data */ for (n = 0; n < insn->n; n++) { ret = comedi_timeout(dev, s, insn, a2150_ai_eoc, 0); if (ret) return ret; data[n] = inw(dev->iobase + FIFO_DATA_REG); data[n] ^= 0x8000; } /* clear fifo and reset triggering circuitry */ outw(0, dev->iobase + FIFO_RESET_REG); return n; } static void a2150_alloc_irq_and_dma(struct comedi_device *dev, struct comedi_devconfig *it) { struct a2150_private *devpriv = dev->private; unsigned int irq_num = it->options[1]; unsigned int dma_chan = it->options[2]; /* * Only IRQs 15, 14, 12-9, and 7-3 are valid. * Only DMA channels 7-5 and 3-0 are valid. 
*/ if (irq_num > 15 || dma_chan > 7 || !((1 << irq_num) & 0xdef8) || !((1 << dma_chan) & 0xef)) return; if (request_irq(irq_num, a2150_interrupt, 0, dev->board_name, dev)) return; /* DMA uses 1 buffer */ devpriv->dma = comedi_isadma_alloc(dev, 1, dma_chan, dma_chan, A2150_DMA_BUFFER_SIZE, COMEDI_ISADMA_READ); if (!devpriv->dma) { free_irq(irq_num, dev); } else { dev->irq = irq_num; devpriv->irq_dma_bits = IRQ_LVL_BITS(irq_num) | DMA_CHAN_BITS(dma_chan); } } static void a2150_free_dma(struct comedi_device *dev) { struct a2150_private *devpriv = dev->private; if (devpriv) comedi_isadma_free(devpriv->dma); } static const struct a2150_board *a2150_probe(struct comedi_device *dev) { int id = ID_BITS(inw(dev->iobase + STATUS_REG)); if (id >= ARRAY_SIZE(a2150_boards)) return NULL; return &a2150_boards[id]; } static int a2150_attach(struct comedi_device *dev, struct comedi_devconfig *it) { const struct a2150_board *board; struct a2150_private *devpriv; struct comedi_subdevice *s; static const int timeout = 2000; int i; int ret; devpriv = comedi_alloc_devpriv(dev, sizeof(*devpriv)); if (!devpriv) return -ENOMEM; ret = comedi_request_region(dev, it->options[0], 0x1c); if (ret) return ret; board = a2150_probe(dev); if (!board) return -ENODEV; dev->board_ptr = board; dev->board_name = board->name; /* an IRQ and DMA are required to support async commands */ a2150_alloc_irq_and_dma(dev, it); dev->pacer = comedi_8254_io_alloc(dev->iobase + I8253_BASE_REG, 0, I8254_IO8, 0); if (IS_ERR(dev->pacer)) return PTR_ERR(dev->pacer); ret = comedi_alloc_subdevices(dev, 1); if (ret) return ret; /* analog input subdevice */ s = &dev->subdevices[0]; s->type = COMEDI_SUBD_AI; s->subdev_flags = SDF_READABLE | SDF_GROUND | SDF_OTHER; s->n_chan = 4; s->maxdata = 0xffff; s->range_table = &range_a2150; s->insn_read = a2150_ai_rinsn; if (dev->irq) { dev->read_subdev = s; s->subdev_flags |= SDF_CMD_READ; s->len_chanlist = s->n_chan; s->do_cmd = a2150_ai_cmd; s->do_cmdtest = a2150_ai_cmdtest; s->cancel = a2150_cancel; } /* set card's irq and dma levels */ outw(devpriv->irq_dma_bits, dev->iobase + IRQ_DMA_CNTRL_REG); /* reset and sync adc clock circuitry */ outw_p(DPD_BIT | APD_BIT, dev->iobase + CONFIG_REG); outw_p(DPD_BIT, dev->iobase + CONFIG_REG); /* initialize configuration register */ devpriv->config_bits = 0; outw(devpriv->config_bits, dev->iobase + CONFIG_REG); /* wait until offset calibration is done, then enable analog inputs */ for (i = 0; i < timeout; i++) { if ((DCAL_BIT & inw(dev->iobase + STATUS_REG)) == 0) break; usleep_range(1000, 3000); } if (i == timeout) { dev_err(dev->class_dev, "timed out waiting for offset calibration to complete\n"); return -ETIME; } devpriv->config_bits |= ENABLE0_BIT | ENABLE1_BIT; outw(devpriv->config_bits, dev->iobase + CONFIG_REG); return 0; }; static void a2150_detach(struct comedi_device *dev) { if (dev->iobase) outw(APD_BIT | DPD_BIT, dev->iobase + CONFIG_REG); a2150_free_dma(dev); comedi_legacy_detach(dev); }; static struct comedi_driver ni_at_a2150_driver = { .driver_name = "ni_at_a2150", .module = THIS_MODULE, .attach = a2150_attach, .detach = a2150_detach, }; module_comedi_driver(ni_at_a2150_driver); MODULE_AUTHOR("Comedi https://www.comedi.org"); MODULE_DESCRIPTION("Comedi low-level driver"); MODULE_LICENSE("GPL"); |
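/*
 * A minimal, self-contained sketch of the period-rounding search used by
 * a2150_get_timing() above: the achievable scan periods are
 * board->clock[j] * (1 << i) for i = 0..3, and a requested period is rounded
 * to the nearest achievable value (the CMDF_ROUND_NEAREST case).  This is a
 * userspace illustration only; the clock table below is made up and is not a
 * real board's values.
 */
#include <stdio.h>

static const int example_clocks[] = { 4000, 2000, 1000, 500 };	/* ns, hypothetical */
#define EXAMPLE_NUM_CLOCKS ((int)(sizeof(example_clocks) / sizeof(example_clocks[0])))

static int example_round_nearest(int period)
{
	/* initial bounds: slowest clock times 8, fastest clock times 1 */
	int lub = example_clocks[0] << 3;
	int glb = example_clocks[EXAMPLE_NUM_CLOCKS - 1];
	int i, j, temp;

	/* clamp the request into the achievable range */
	if (period < glb)
		period = glb;
	if (period > lub)
		period = lub;

	/* same double loop as the driver: every clock, every divisor 1/2/4/8 */
	for (i = 0; i < 4; i++) {
		for (j = 0; j < EXAMPLE_NUM_CLOCKS; j++) {
			temp = example_clocks[j] << i;
			if (temp < lub && temp >= period)
				lub = temp;
			if (temp > glb && temp <= period)
				glb = temp;
		}
	}

	/* pick whichever bound is the better approximation */
	return (lub - period < period - glb) ? lub : glb;
}

int main(void)
{
	printf("1234 ns rounds to %d ns\n", example_round_nearest(1234));
	return 0;
}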
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_X86_SHARED_IO_H
#define _ASM_X86_SHARED_IO_H

#include <linux/types.h>

#define BUILDIO(bwl, bw, type)						\
static __always_inline void __out##bwl(type value, u16 port)		\
{									\
	asm volatile("out" #bwl " %" #bw "0, %w1"			\
		     : : "a"(value), "Nd"(port));			\
}									\
									\
static __always_inline type __in##bwl(u16 port)				\
{									\
	type value;							\
	asm volatile("in" #bwl " %w1, %" #bw "0"			\
		     : "=a"(value) : "Nd"(port));			\
	return value;							\
}

BUILDIO(b, b, u8)
BUILDIO(w, w, u16)
BUILDIO(l,  , u32)
#undef BUILDIO

#define inb __inb
#define inw __inw
#define inl __inl
#define outb __outb
#define outw __outw
#define outl __outl

#endif
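/*
 * For illustration, BUILDIO(b, b, u8) above pastes together roughly the
 * following pair of helpers (shown expanded, not additional code); the
 * "%b0" and "%w1" operand modifiers select the 8-bit accumulator and the
 * 16-bit port register in the inline assembly.
 */
static __always_inline void __outb(u8 value, u16 port)
{
	asm volatile("outb %b0, %w1" : : "a"(value), "Nd"(port));
}

static __always_inline u8 __inb(u16 port)
{
	u8 value;

	asm volatile("inb %w1, %b0" : "=a"(value) : "Nd"(port));
	return value;
}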
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __LINUX_PIM_H
#define __LINUX_PIM_H

#include <linux/skbuff.h>
#include <asm/byteorder.h>

/* Message types - V1 */
#define PIM_V1_VERSION		cpu_to_be32(0x10000000)
#define PIM_V1_REGISTER		1

/* Message types - V2 */
#define PIM_VERSION		2

/* RFC7761, sec 4.9:
 *  Type
 *	Types for specific PIM messages.  PIM Types are:
 *
 *  Message Type                          Destination
 *  ---------------------------------------------------------------------
 *  0 = Hello                             Multicast to ALL-PIM-ROUTERS
 *  1 = Register                          Unicast to RP
 *  2 = Register-Stop                     Unicast to source of Register
 *                                        packet
 *  3 = Join/Prune                        Multicast to ALL-PIM-ROUTERS
 *  4 = Bootstrap                         Multicast to ALL-PIM-ROUTERS
 *  5 = Assert                            Multicast to ALL-PIM-ROUTERS
 *  6 = Graft (used in PIM-DM only)       Unicast to RPF'(S)
 *  7 = Graft-Ack (used in PIM-DM only)   Unicast to source of Graft
 *                                        packet
 *  8 = Candidate-RP-Advertisement        Unicast to Domain's BSR
 */
enum {
	PIM_TYPE_HELLO,
	PIM_TYPE_REGISTER,
	PIM_TYPE_REGISTER_STOP,
	PIM_TYPE_JOIN_PRUNE,
	PIM_TYPE_BOOTSTRAP,
	PIM_TYPE_ASSERT,
	PIM_TYPE_GRAFT,
	PIM_TYPE_GRAFT_ACK,
	PIM_TYPE_CANDIDATE_RP_ADV
};

#define PIM_NULL_REGISTER	cpu_to_be32(0x40000000)

/* RFC7761, sec 4.9:
 * The PIM header common to all PIM messages is:
 *   0                   1                   2                   3
 *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 *  |PIM Ver| Type  |   Reserved    |           Checksum            |
 *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 */
struct pimhdr {
	__u8	type;
	__u8	reserved;
	__be16	csum;
};

/* PIMv2 register message header layout (ietf-draft-idmr-pimvsm-v2-00.ps) */
struct pimreghdr {
	__u8	type;
	__u8	reserved;
	__be16	csum;
	__be32	flags;
};

int pim_rcv_v1(struct sk_buff *skb);

static inline bool ipmr_pimsm_enabled(void)
{
	return IS_BUILTIN(CONFIG_IP_PIMSM_V1) || IS_BUILTIN(CONFIG_IP_PIMSM_V2);
}

static inline struct pimhdr *pim_hdr(const struct sk_buff *skb)
{
	return (struct pimhdr *)skb_transport_header(skb);
}

static inline u8 pim_hdr_version(const struct pimhdr *pimhdr)
{
	return pimhdr->type >> 4;
}

static inline u8 pim_hdr_type(const struct pimhdr *pimhdr)
{
	return pimhdr->type & 0xf;
}

/* check if the address is 224.0.0.13, RFC7761 sec 4.3.1 */
static inline bool pim_ipv4_all_pim_routers(__be32 addr)
{
	return addr == htonl(0xE000000D);
}
#endif
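/*
 * Illustrative sketch (hypothetical helper, not part of this header): the
 * accessors above split the first octet of the PIM header into version
 * (high nibble) and message type (low nibble).  Assuming the skb's
 * transport header already points at the PIM header, a PIMv2 Hello check
 * could look like this:
 */
static inline bool pim_example_is_v2_hello(const struct sk_buff *skb)
{
	const struct pimhdr *hdr = pim_hdr(skb);

	return pim_hdr_version(hdr) == PIM_VERSION &&
	       pim_hdr_type(hdr) == PIM_TYPE_HELLO;
}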
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
 * INET		An implementation of the TCP/IP protocol suite for the LINUX
 *		operating system.  INET is implemented using the  BSD Socket
 *		interface as the means of communication with the user level.
 *
 *		Definitions for the UDP protocol.
 *
 * Version:	@(#)udp.h	1.0.2	04/28/93
 *
 * Author:	Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
 */
#ifndef _LINUX_UDP_H
#define _LINUX_UDP_H

#include <net/inet_sock.h>
#include <linux/skbuff.h>
#include <net/netns/hash.h>
#include <uapi/linux/udp.h>

static inline struct udphdr *udp_hdr(const struct sk_buff *skb)
{
	return (struct udphdr *)skb_transport_header(skb);
}

#define UDP_HTABLE_SIZE_MIN_PERNET	128
#define UDP_HTABLE_SIZE_MIN		(IS_ENABLED(CONFIG_BASE_SMALL) ? 128 : 256)
#define UDP_HTABLE_SIZE_MAX		65536

static inline u32 udp_hashfn(const struct net *net, u32 num, u32 mask)
{
	return (num + net_hash_mix(net)) & mask;
}

enum {
	UDP_FLAGS_CORK,		/* Cork is required */
	UDP_FLAGS_NO_CHECK6_TX,	/* Send zero UDP6 checksums on TX? */
	UDP_FLAGS_NO_CHECK6_RX,	/* Allow zero UDP6 checksums on RX? */
	UDP_FLAGS_GRO_ENABLED,	/* Request GRO aggregation */
	UDP_FLAGS_ACCEPT_FRAGLIST,
	UDP_FLAGS_ACCEPT_L4,
	UDP_FLAGS_ENCAP_ENABLED,	/* This socket enabled encap */
	UDP_FLAGS_UDPLITE_SEND_CC,	/* set via udplite setsockopt */
	UDP_FLAGS_UDPLITE_RECV_CC,	/* set via udplite setsockopt */
};

struct udp_sock {
	/* inet_sock has to be the first member */
	struct inet_sock inet;
#define udp_port_hash		inet.sk.__sk_common.skc_u16hashes[0]
#define udp_portaddr_hash	inet.sk.__sk_common.skc_u16hashes[1]
#define udp_portaddr_node	inet.sk.__sk_common.skc_portaddr_node

	unsigned long	udp_flags;

	int		pending;	/* Any pending frames ? */
	__u8		encap_type;	/* Is this an Encapsulation socket? */

#if !IS_ENABLED(CONFIG_BASE_SMALL)
	/* For UDP 4-tuple hash */
	__u16 udp_lrpa_hash;
	struct hlist_nulls_node udp_lrpa_node;
#endif

	/*
	 * Following member retains the information to create a UDP header
	 * when the socket is uncorked.
	 */
	__u16		len;		/* total length of pending frames */
	__u16		gso_size;
	/*
	 * Fields specific to UDP-Lite.
	 */
	__u16		pcslen;
	__u16		pcrlen;
	/*
	 * For encapsulation sockets.
	 */
	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
	void (*encap_err_rcv)(struct sock *sk, struct sk_buff *skb, int err,
			      __be16 port, u32 info, u8 *payload);
	int (*encap_err_lookup)(struct sock *sk, struct sk_buff *skb);
	void (*encap_destroy)(struct sock *sk);

	/* GRO functions for UDP socket */
	struct sk_buff *	(*gro_receive)(struct sock *sk,
					       struct list_head *head,
					       struct sk_buff *skb);
	int			(*gro_complete)(struct sock *sk,
						struct sk_buff *skb,
						int nhoff);

	/* udp_recvmsg try to use this before splicing sk_receive_queue */
	struct sk_buff_head	reader_queue ____cacheline_aligned_in_smp;

	/* This field is dirtied by udp_recvmsg() */
	int		forward_deficit;

	/* This fields follows rcvbuf value, and is touched by udp_recvmsg */
	int		forward_threshold;

	/* Cache friendly copy of sk->sk_peek_off >= 0 */
	bool		peeking_with_offset;

	/*
	 * Accounting for the tunnel GRO fastpath.
	 * Unprotected by compilers guard, as it uses space available in
	 * the last UDP socket cacheline.
	 */
	struct hlist_node	tunnel_list;
};

#define udp_test_bit(nr, sk)			\
	test_bit(UDP_FLAGS_##nr, &udp_sk(sk)->udp_flags)
#define udp_set_bit(nr, sk)			\
	set_bit(UDP_FLAGS_##nr, &udp_sk(sk)->udp_flags)
#define udp_test_and_set_bit(nr, sk)		\
	test_and_set_bit(UDP_FLAGS_##nr, &udp_sk(sk)->udp_flags)
#define udp_clear_bit(nr, sk)			\
	clear_bit(UDP_FLAGS_##nr, &udp_sk(sk)->udp_flags)
#define udp_assign_bit(nr, sk, val)		\
	assign_bit(UDP_FLAGS_##nr, &udp_sk(sk)->udp_flags, val)

#define UDP_MAX_SEGMENTS	(1 << 7UL)

#define udp_sk(ptr) container_of_const(ptr, struct udp_sock, inet.sk)

static inline int udp_set_peek_off(struct sock *sk, int val)
{
	sk_set_peek_off(sk, val);
	WRITE_ONCE(udp_sk(sk)->peeking_with_offset, val >= 0);
	return 0;
}

static inline void udp_set_no_check6_tx(struct sock *sk, bool val)
{
	udp_assign_bit(NO_CHECK6_TX, sk, val);
}

static inline void udp_set_no_check6_rx(struct sock *sk, bool val)
{
	udp_assign_bit(NO_CHECK6_RX, sk, val);
}

static inline bool udp_get_no_check6_tx(const struct sock *sk)
{
	return udp_test_bit(NO_CHECK6_TX, sk);
}

static inline bool udp_get_no_check6_rx(const struct sock *sk)
{
	return udp_test_bit(NO_CHECK6_RX, sk);
}

static inline void udp_cmsg_recv(struct msghdr *msg, struct sock *sk,
				 struct sk_buff *skb)
{
	int gso_size;

	if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) {
		gso_size = skb_shinfo(skb)->gso_size;
		put_cmsg(msg, SOL_UDP, UDP_GRO, sizeof(gso_size), &gso_size);
	}
}

DECLARE_STATIC_KEY_FALSE(udp_encap_needed_key);
#if IS_ENABLED(CONFIG_IPV6)
DECLARE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
#endif

static inline bool udp_encap_needed(void)
{
	if (static_branch_unlikely(&udp_encap_needed_key))
		return true;

#if IS_ENABLED(CONFIG_IPV6)
	if (static_branch_unlikely(&udpv6_encap_needed_key))
		return true;
#endif

	return false;
}

static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
{
	if (!skb_is_gso(skb))
		return false;

	if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 &&
	    !udp_test_bit(ACCEPT_L4, sk))
		return true;

	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST &&
	    !udp_test_bit(ACCEPT_FRAGLIST, sk))
		return true;

	/* GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits might still
	 * land in a tunnel as the socket check in udp_gro_receive cannot be
	 * foolproof.
	 */
	if (udp_encap_needed() &&
	    READ_ONCE(udp_sk(sk)->encap_rcv) &&
	    !(skb_shinfo(skb)->gso_type &
	      (SKB_GSO_UDP_TUNNEL | SKB_GSO_UDP_TUNNEL_CSUM)))
		return true;

	return false;
}

static inline void udp_allow_gso(struct sock *sk)
{
	udp_set_bit(ACCEPT_L4, sk);
	udp_set_bit(ACCEPT_FRAGLIST, sk);
}

#define udp_portaddr_for_each_entry(__sk, list) \
	hlist_for_each_entry(__sk, list, __sk_common.skc_portaddr_node)

#define udp_portaddr_for_each_entry_from(__sk) \
	hlist_for_each_entry_from(__sk, __sk_common.skc_portaddr_node)

#define udp_portaddr_for_each_entry_rcu(__sk, list) \
	hlist_for_each_entry_rcu(__sk, list, __sk_common.skc_portaddr_node)

#if !IS_ENABLED(CONFIG_BASE_SMALL)
#define udp_lrpa_for_each_entry_rcu(__up, node, list) \
	hlist_nulls_for_each_entry_rcu(__up, node, list, udp_lrpa_node)
#endif

#define IS_UDPLITE(__sk) (__sk->sk_protocol == IPPROTO_UDPLITE)

static inline struct sock *udp_tunnel_sk(const struct net *net, bool is_ipv6)
{
#if IS_ENABLED(CONFIG_NET_UDP_TUNNEL)
	return rcu_dereference(net->ipv4.udp_tunnel_gro[is_ipv6].sk);
#else
	return NULL;
#endif
}

#endif	/* _LINUX_UDP_H */
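/*
 * Illustrative sketch (hypothetical helper, not part of this header): the
 * udp_*_bit() wrappers above paste the short flag name onto the UDP_FLAGS_
 * prefix, so
 *
 *	udp_set_bit(GRO_ENABLED, sk);
 *
 * expands to
 *
 *	set_bit(UDP_FLAGS_GRO_ENABLED, &udp_sk(sk)->udp_flags);
 *
 * A caller toggling GRO from a boolean option value could therefore be
 * written as:
 */
static inline void udp_example_set_gro(struct sock *sk, bool enable)
{
	udp_assign_bit(GRO_ENABLED, sk, enable);
}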