/*
 * Copyright (c) 2018 Cumulus Networks. All rights reserved.
 * Copyright (c) 2018 David Ahern <dsa@cumulusnetworks.com>
 *
 * This software is licensed under the GNU General Public License Version 2,
 * June 1991 as shown in the file COPYING in the top-level directory of this
 * source tree.
 *
 * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
 * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
 * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
 * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
 * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
 */

#include <linux/bitmap.h>
#include <linux/in6.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/rhashtable.h>
#include <linux/spinlock_types.h>
#include <linux/types.h>
#include <net/fib_notifier.h>
#include <net/inet_dscp.h>
#include <net/ip_fib.h>
#include <net/ip6_fib.h>
#include <net/fib_rules.h>
#include <net/net_namespace.h>
#include <net/nexthop.h>
#include <linux/debugfs.h>

#include "netdevsim.h"

struct nsim_fib_entry {
	u64 max;
	atomic64_t num;
};

struct nsim_per_fib_data {
	struct nsim_fib_entry fib;
	struct nsim_fib_entry rules;
};

struct nsim_fib_data {
	struct notifier_block fib_nb;
	struct nsim_per_fib_data ipv4;
	struct nsim_per_fib_data ipv6;
	struct nsim_fib_entry nexthops;
	struct rhashtable fib_rt_ht;
	struct list_head fib_rt_list;
	struct mutex fib_lock;	/* Protects FIB HT and list */
	struct notifier_block nexthop_nb;
	struct rhashtable nexthop_ht;
	struct devlink *devlink;
	struct work_struct fib_event_work;
	struct work_struct fib_flush_work;
	struct list_head fib_event_queue;
	spinlock_t fib_event_queue_lock; /* Protects fib event queue list */
	struct mutex nh_lock; /* Protects NH HT */
	struct dentry *ddir;
	bool fail_route_offload;
	bool fail_res_nexthop_group_replace;
	bool fail_nexthop_bucket_replace;
	bool fail_route_delete;
};

struct nsim_fib_rt_key {
	unsigned char addr[sizeof(struct in6_addr)];
	unsigned char prefix_len;
	int family;
	u32 tb_id;
};

struct nsim_fib_rt {
	struct nsim_fib_rt_key key;
	struct rhash_head ht_node;
	struct list_head list;	/* Member of fib_rt_list */
};

struct nsim_fib4_rt {
	struct nsim_fib_rt common;
	struct fib_info *fi;
	dscp_t dscp;
	u8 type;
};

struct nsim_fib6_rt {
	struct nsim_fib_rt common;
	struct list_head nh_list;
	unsigned int nhs;
};

struct nsim_fib6_rt_nh {
	struct list_head list;	/* Member of nh_list */
	struct fib6_info *rt;
};

struct nsim_fib6_event {
	struct fib6_info **rt_arr;
	unsigned int nrt6;
};

struct nsim_fib_event {
	struct list_head list; /* node in fib queue */
	union {
		struct fib_entry_notifier_info fen_info;
		struct nsim_fib6_event fib6_event;
	};
	struct nsim_fib_data *data;
	unsigned long event;
	int family;
};

static const struct rhashtable_params nsim_fib_rt_ht_params = {
	.key_offset = offsetof(struct nsim_fib_rt, key),
	.head_offset = offsetof(struct nsim_fib_rt, ht_node),
	.key_len = sizeof(struct nsim_fib_rt_key),
	.automatic_shrinking = true,
};

struct nsim_nexthop {
	struct rhash_head ht_node;
	u64 occ;
	u32 id;
	bool is_resilient;
};
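/* Note: unlike fib_rt_ht, which is keyed by the full {addr, prefix_len,
 * family, tb_id} tuple defined above, the nexthop hashtable below is keyed
 * by the u32 nexthop ID alone, since nexthop objects are identified
 * globally by ID rather than by prefix.
 */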
static const struct rhashtable_params nsim_nexthop_ht_params = {
	.key_offset = offsetof(struct nsim_nexthop, id),
	.head_offset = offsetof(struct nsim_nexthop, ht_node),
	.key_len = sizeof(u32),
	.automatic_shrinking = true,
};

u64 nsim_fib_get_val(struct nsim_fib_data *fib_data,
		     enum nsim_resource_id res_id, bool max)
{
	struct nsim_fib_entry *entry;

	switch (res_id) {
	case NSIM_RESOURCE_IPV4_FIB:
		entry = &fib_data->ipv4.fib;
		break;
	case NSIM_RESOURCE_IPV4_FIB_RULES:
		entry = &fib_data->ipv4.rules;
		break;
	case NSIM_RESOURCE_IPV6_FIB:
		entry = &fib_data->ipv6.fib;
		break;
	case NSIM_RESOURCE_IPV6_FIB_RULES:
		entry = &fib_data->ipv6.rules;
		break;
	case NSIM_RESOURCE_NEXTHOPS:
		entry = &fib_data->nexthops;
		break;
	default:
		return 0;
	}

	return max ? entry->max : atomic64_read(&entry->num);
}

static void nsim_fib_set_max(struct nsim_fib_data *fib_data,
			     enum nsim_resource_id res_id, u64 val)
{
	struct nsim_fib_entry *entry;

	switch (res_id) {
	case NSIM_RESOURCE_IPV4_FIB:
		entry = &fib_data->ipv4.fib;
		break;
	case NSIM_RESOURCE_IPV4_FIB_RULES:
		entry = &fib_data->ipv4.rules;
		break;
	case NSIM_RESOURCE_IPV6_FIB:
		entry = &fib_data->ipv6.fib;
		break;
	case NSIM_RESOURCE_IPV6_FIB_RULES:
		entry = &fib_data->ipv6.rules;
		break;
	case NSIM_RESOURCE_NEXTHOPS:
		entry = &fib_data->nexthops;
		break;
	default:
		WARN_ON(1);
		return;
	}
	entry->max = val;
}

static int nsim_fib_rule_account(struct nsim_fib_entry *entry, bool add,
				 struct netlink_ext_ack *extack)
{
	int err = 0;

	if (add) {
		if (!atomic64_add_unless(&entry->num, 1, entry->max)) {
			err = -ENOSPC;
			NL_SET_ERR_MSG_MOD(extack, "Exceeded number of supported fib rule entries");
		}
	} else {
		atomic64_dec_if_positive(&entry->num);
	}

	return err;
}

static int nsim_fib_rule_event(struct nsim_fib_data *data,
			       struct fib_notifier_info *info, bool add)
{
	struct netlink_ext_ack *extack = info->extack;
	int err = 0;

	switch (info->family) {
	case AF_INET:
		err = nsim_fib_rule_account(&data->ipv4.rules, add, extack);
		break;
	case AF_INET6:
		err = nsim_fib_rule_account(&data->ipv6.rules, add, extack);
		break;
	}

	return err;
}

static int nsim_fib_account(struct nsim_fib_entry *entry, bool add)
{
	int err = 0;

	if (add) {
		if (!atomic64_add_unless(&entry->num, 1, entry->max))
			err = -ENOSPC;
	} else {
		atomic64_dec_if_positive(&entry->num);
	}

	return err;
}

static void nsim_fib_rt_init(struct nsim_fib_data *data,
			     struct nsim_fib_rt *fib_rt, const void *addr,
			     size_t addr_len, unsigned int prefix_len,
			     int family, u32 tb_id)
{
	memcpy(fib_rt->key.addr, addr, addr_len);
	fib_rt->key.prefix_len = prefix_len;
	fib_rt->key.family = family;
	fib_rt->key.tb_id = tb_id;
	list_add(&fib_rt->list, &data->fib_rt_list);
}

static void nsim_fib_rt_fini(struct nsim_fib_rt *fib_rt)
{
	list_del(&fib_rt->list);
}

static struct nsim_fib_rt *nsim_fib_rt_lookup(struct rhashtable *fib_rt_ht,
					      const void *addr, size_t addr_len,
					      unsigned int prefix_len,
					      int family, u32 tb_id)
{
	struct nsim_fib_rt_key key;

	memset(&key, 0, sizeof(key));
	memcpy(key.addr, addr, addr_len);
	key.prefix_len = prefix_len;
	key.family = family;
	key.tb_id = tb_id;

	return rhashtable_lookup_fast(fib_rt_ht, &key, nsim_fib_rt_ht_params);
}

static struct nsim_fib4_rt *
nsim_fib4_rt_create(struct nsim_fib_data *data,
		    struct fib_entry_notifier_info *fen_info)
{
	struct nsim_fib4_rt *fib4_rt;

	fib4_rt = kzalloc(sizeof(*fib4_rt), GFP_KERNEL);
	if (!fib4_rt)
		return NULL;

	nsim_fib_rt_init(data, &fib4_rt->common, &fen_info->dst, sizeof(u32),
			 fen_info->dst_len, AF_INET, fen_info->tb_id);

	fib4_rt->fi = fen_info->fi;
	fib_info_hold(fib4_rt->fi);
	fib4_rt->dscp = fen_info->dscp;
	fib4_rt->type = fen_info->type;

	return fib4_rt;
}
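/* Note: nsim_fib4_rt_create() above takes a reference on the fib_info with
 * fib_info_hold(); the matching fib_info_put() is in nsim_fib4_rt_destroy()
 * below, so a cached route keeps its fib_info alive for as long as it sits
 * in the hashtable.
 */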
static void nsim_fib4_rt_destroy(struct nsim_fib4_rt *fib4_rt)
{
	fib_info_put(fib4_rt->fi);
	nsim_fib_rt_fini(&fib4_rt->common);
	kfree(fib4_rt);
}

static struct nsim_fib4_rt *
nsim_fib4_rt_lookup(struct rhashtable *fib_rt_ht,
		    const struct fib_entry_notifier_info *fen_info)
{
	struct nsim_fib_rt *fib_rt;

	fib_rt = nsim_fib_rt_lookup(fib_rt_ht, &fen_info->dst, sizeof(u32),
				    fen_info->dst_len, AF_INET,
				    fen_info->tb_id);
	if (!fib_rt)
		return NULL;

	return container_of(fib_rt, struct nsim_fib4_rt, common);
}

static void
nsim_fib4_rt_offload_failed_flag_set(struct net *net,
				     struct fib_entry_notifier_info *fen_info)
{
	u32 *p_dst = (u32 *)&fen_info->dst;
	struct fib_rt_info fri;

	fri.fi = fen_info->fi;
	fri.tb_id = fen_info->tb_id;
	fri.dst = cpu_to_be32(*p_dst);
	fri.dst_len = fen_info->dst_len;
	fri.dscp = fen_info->dscp;
	fri.type = fen_info->type;
	fri.offload = false;
	fri.trap = false;
	fri.offload_failed = true;
	fib_alias_hw_flags_set(net, &fri);
}

static void nsim_fib4_rt_hw_flags_set(struct net *net,
				      const struct nsim_fib4_rt *fib4_rt,
				      bool trap)
{
	u32 *p_dst = (u32 *)fib4_rt->common.key.addr;
	int dst_len = fib4_rt->common.key.prefix_len;
	struct fib_rt_info fri;

	fri.fi = fib4_rt->fi;
	fri.tb_id = fib4_rt->common.key.tb_id;
	fri.dst = cpu_to_be32(*p_dst);
	fri.dst_len = dst_len;
	fri.dscp = fib4_rt->dscp;
	fri.type = fib4_rt->type;
	fri.offload = false;
	fri.trap = trap;
	fri.offload_failed = false;
	fib_alias_hw_flags_set(net, &fri);
}

static int nsim_fib4_rt_add(struct nsim_fib_data *data,
			    struct nsim_fib4_rt *fib4_rt)
{
	struct net *net = devlink_net(data->devlink);
	int err;

	err = rhashtable_insert_fast(&data->fib_rt_ht,
				     &fib4_rt->common.ht_node,
				     nsim_fib_rt_ht_params);
	if (err)
		goto err_fib_dismiss;

	/* Simulate hardware programming latency. */
	msleep(1);
	nsim_fib4_rt_hw_flags_set(net, fib4_rt, true);

	return 0;

err_fib_dismiss:
	/* Drop the accounting that was increased from the notification
	 * context when FIB_EVENT_ENTRY_REPLACE was triggered.
	 */
	nsim_fib_account(&data->ipv4.fib, false);
	return err;
}

static int nsim_fib4_rt_replace(struct nsim_fib_data *data,
				struct nsim_fib4_rt *fib4_rt,
				struct nsim_fib4_rt *fib4_rt_old)
{
	struct net *net = devlink_net(data->devlink);
	int err;

	/* We are replacing a route, so need to remove the accounting which
	 * was increased when FIB_EVENT_ENTRY_REPLACE was triggered.
	 */
	err = nsim_fib_account(&data->ipv4.fib, false);
	if (err)
		return err;

	err = rhashtable_replace_fast(&data->fib_rt_ht,
				      &fib4_rt_old->common.ht_node,
				      &fib4_rt->common.ht_node,
				      nsim_fib_rt_ht_params);
	if (err)
		return err;

	msleep(1);
	nsim_fib4_rt_hw_flags_set(net, fib4_rt, true);

	nsim_fib4_rt_hw_flags_set(net, fib4_rt_old, false);
	nsim_fib4_rt_destroy(fib4_rt_old);

	return 0;
}
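/* Accounting protocol for the two paths above: the IPv4 entry count is
 * incremented in nsim_fib4_prepare_event() when FIB_EVENT_ENTRY_REPLACE is
 * queued. A plain add keeps that increment, a replace drops one count for
 * the old route so the total stays constant, and a failed insert drops the
 * increment in the err_fib_dismiss path.
 */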
static int nsim_fib4_rt_insert(struct nsim_fib_data *data,
			       struct fib_entry_notifier_info *fen_info)
{
	struct nsim_fib4_rt *fib4_rt, *fib4_rt_old;
	int err;

	if (data->fail_route_offload) {
		/* For testing purposes, user set debugfs fail_route_offload
		 * value to true. Simulate hardware programming latency and
		 * then fail.
		 */
		msleep(1);
		return -EINVAL;
	}

	fib4_rt = nsim_fib4_rt_create(data, fen_info);
	if (!fib4_rt)
		return -ENOMEM;

	fib4_rt_old = nsim_fib4_rt_lookup(&data->fib_rt_ht, fen_info);
	if (!fib4_rt_old)
		err = nsim_fib4_rt_add(data, fib4_rt);
	else
		err = nsim_fib4_rt_replace(data, fib4_rt, fib4_rt_old);

	if (err)
		nsim_fib4_rt_destroy(fib4_rt);

	return err;
}

static void nsim_fib4_rt_remove(struct nsim_fib_data *data,
				const struct fib_entry_notifier_info *fen_info)
{
	struct nsim_fib4_rt *fib4_rt;

	fib4_rt = nsim_fib4_rt_lookup(&data->fib_rt_ht, fen_info);
	if (!fib4_rt)
		return;

	rhashtable_remove_fast(&data->fib_rt_ht, &fib4_rt->common.ht_node,
			       nsim_fib_rt_ht_params);
	nsim_fib4_rt_destroy(fib4_rt);
}

static int nsim_fib4_event(struct nsim_fib_data *data,
			   struct fib_entry_notifier_info *fen_info,
			   unsigned long event)
{
	int err = 0;

	switch (event) {
	case FIB_EVENT_ENTRY_REPLACE:
		err = nsim_fib4_rt_insert(data, fen_info);
		if (err) {
			struct net *net = devlink_net(data->devlink);

			nsim_fib4_rt_offload_failed_flag_set(net, fen_info);
		}
		break;
	case FIB_EVENT_ENTRY_DEL:
		nsim_fib4_rt_remove(data, fen_info);
		break;
	default:
		break;
	}

	return err;
}

static struct nsim_fib6_rt_nh *
nsim_fib6_rt_nh_find(const struct nsim_fib6_rt *fib6_rt,
		     const struct fib6_info *rt)
{
	struct nsim_fib6_rt_nh *fib6_rt_nh;

	list_for_each_entry(fib6_rt_nh, &fib6_rt->nh_list, list) {
		if (fib6_rt_nh->rt == rt)
			return fib6_rt_nh;
	}

	return NULL;
}

static int nsim_fib6_rt_nh_add(struct nsim_fib6_rt *fib6_rt,
			       struct fib6_info *rt)
{
	struct nsim_fib6_rt_nh *fib6_rt_nh;

	fib6_rt_nh = kzalloc(sizeof(*fib6_rt_nh), GFP_KERNEL);
	if (!fib6_rt_nh)
		return -ENOMEM;

	fib6_info_hold(rt);
	fib6_rt_nh->rt = rt;
	list_add_tail(&fib6_rt_nh->list, &fib6_rt->nh_list);
	fib6_rt->nhs++;

	return 0;
}

#if IS_ENABLED(CONFIG_IPV6)
static void nsim_rt6_release(struct fib6_info *rt)
{
	fib6_info_release(rt);
}
#else
static void nsim_rt6_release(struct fib6_info *rt)
{
}
#endif

static void nsim_fib6_rt_nh_del(struct nsim_fib6_rt *fib6_rt,
				const struct fib6_info *rt)
{
	struct nsim_fib6_rt_nh *fib6_rt_nh;

	fib6_rt_nh = nsim_fib6_rt_nh_find(fib6_rt, rt);
	if (!fib6_rt_nh)
		return;

	fib6_rt->nhs--;
	list_del(&fib6_rt_nh->list);
	nsim_rt6_release(fib6_rt_nh->rt);
	kfree(fib6_rt_nh);
}
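/* Each nexthop on nh_list pins its fib6_info: nsim_fib6_rt_nh_add() above
 * takes a reference with fib6_info_hold() and nsim_fib6_rt_nh_del()
 * releases it through nsim_rt6_release(), which is a no-op stub when
 * CONFIG_IPV6 is disabled.
 */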
static struct nsim_fib6_rt *
nsim_fib6_rt_create(struct nsim_fib_data *data, struct fib6_info **rt_arr,
		    unsigned int nrt6)
{
	struct fib6_info *rt = rt_arr[0];
	struct nsim_fib6_rt *fib6_rt;
	int i = 0;
	int err;

	fib6_rt = kzalloc(sizeof(*fib6_rt), GFP_KERNEL);
	if (!fib6_rt)
		return ERR_PTR(-ENOMEM);

	nsim_fib_rt_init(data, &fib6_rt->common, &rt->fib6_dst.addr,
			 sizeof(rt->fib6_dst.addr), rt->fib6_dst.plen,
			 AF_INET6, rt->fib6_table->tb6_id);

	/* We consider a multipath IPv6 route as one entry, but it can be made
	 * up from several fib6_info structs (one for each nexthop), so we
	 * add them all to the same list under the entry.
	 */
	INIT_LIST_HEAD(&fib6_rt->nh_list);

	for (i = 0; i < nrt6; i++) {
		err = nsim_fib6_rt_nh_add(fib6_rt, rt_arr[i]);
		if (err)
			goto err_fib6_rt_nh_del;
	}

	return fib6_rt;

err_fib6_rt_nh_del:
	for (i--; i >= 0; i--)
		nsim_fib6_rt_nh_del(fib6_rt, rt_arr[i]);
	nsim_fib_rt_fini(&fib6_rt->common);
	kfree(fib6_rt);
	return ERR_PTR(err);
}

static void nsim_fib6_rt_destroy(struct nsim_fib6_rt *fib6_rt)
{
	struct nsim_fib6_rt_nh *iter, *tmp;

	list_for_each_entry_safe(iter, tmp, &fib6_rt->nh_list, list)
		nsim_fib6_rt_nh_del(fib6_rt, iter->rt);
	WARN_ON_ONCE(!list_empty(&fib6_rt->nh_list));
	nsim_fib_rt_fini(&fib6_rt->common);
	kfree(fib6_rt);
}

static struct nsim_fib6_rt *
nsim_fib6_rt_lookup(struct rhashtable *fib_rt_ht, const struct fib6_info *rt)
{
	struct nsim_fib_rt *fib_rt;

	fib_rt = nsim_fib_rt_lookup(fib_rt_ht, &rt->fib6_dst.addr,
				    sizeof(rt->fib6_dst.addr),
				    rt->fib6_dst.plen, AF_INET6,
				    rt->fib6_table->tb6_id);
	if (!fib_rt)
		return NULL;

	return container_of(fib_rt, struct nsim_fib6_rt, common);
}

static int nsim_fib6_rt_append(struct nsim_fib_data *data,
			       struct nsim_fib6_event *fib6_event)
{
	struct fib6_info *rt = fib6_event->rt_arr[0];
	struct nsim_fib6_rt *fib6_rt;
	int i, err;

	if (data->fail_route_offload) {
		/* For testing purposes, user set debugfs fail_route_offload
		 * value to true. Simulate hardware programming latency and
		 * then fail.
		 */
		msleep(1);
		return -EINVAL;
	}

	fib6_rt = nsim_fib6_rt_lookup(&data->fib_rt_ht, rt);
	if (!fib6_rt)
		return -EINVAL;

	for (i = 0; i < fib6_event->nrt6; i++) {
		err = nsim_fib6_rt_nh_add(fib6_rt, fib6_event->rt_arr[i]);
		if (err)
			goto err_fib6_rt_nh_del;

		WRITE_ONCE(fib6_event->rt_arr[i]->trap, true);
	}

	return 0;

err_fib6_rt_nh_del:
	for (i--; i >= 0; i--) {
		WRITE_ONCE(fib6_event->rt_arr[i]->trap, false);
		nsim_fib6_rt_nh_del(fib6_rt, fib6_event->rt_arr[i]);
	}
	return err;
}

#if IS_ENABLED(CONFIG_IPV6)
static void nsim_fib6_rt_offload_failed_flag_set(struct nsim_fib_data *data,
						 struct fib6_info **rt_arr,
						 unsigned int nrt6)
{
	struct net *net = devlink_net(data->devlink);
	int i;

	for (i = 0; i < nrt6; i++)
		fib6_info_hw_flags_set(net, rt_arr[i], false, false, true);
}
#else
static void nsim_fib6_rt_offload_failed_flag_set(struct nsim_fib_data *data,
						 struct fib6_info **rt_arr,
						 unsigned int nrt6)
{
}
#endif

#if IS_ENABLED(CONFIG_IPV6)
static void nsim_fib6_rt_hw_flags_set(struct nsim_fib_data *data,
				      const struct nsim_fib6_rt *fib6_rt,
				      bool trap)
{
	struct net *net = devlink_net(data->devlink);
	struct nsim_fib6_rt_nh *fib6_rt_nh;

	list_for_each_entry(fib6_rt_nh, &fib6_rt->nh_list, list)
		fib6_info_hw_flags_set(net, fib6_rt_nh->rt, false, trap, false);
}
#else
static void nsim_fib6_rt_hw_flags_set(struct nsim_fib_data *data,
				      const struct nsim_fib6_rt *fib6_rt,
				      bool trap)
{
}
#endif

static int nsim_fib6_rt_add(struct nsim_fib_data *data,
			    struct nsim_fib6_rt *fib6_rt)
{
	int err;

	err = rhashtable_insert_fast(&data->fib_rt_ht,
				     &fib6_rt->common.ht_node,
				     nsim_fib_rt_ht_params);
	if (err)
		goto err_fib_dismiss;

	msleep(1);
	nsim_fib6_rt_hw_flags_set(data, fib6_rt, true);

	return 0;

err_fib_dismiss:
	/* Drop the accounting that was increased from the notification
	 * context when FIB_EVENT_ENTRY_REPLACE was triggered.
	 */
	nsim_fib_account(&data->ipv6.fib, false);
	return err;
}
static int nsim_fib6_rt_replace(struct nsim_fib_data *data,
				struct nsim_fib6_rt *fib6_rt,
				struct nsim_fib6_rt *fib6_rt_old)
{
	int err;

	/* We are replacing a route, so need to remove the accounting which
	 * was increased when FIB_EVENT_ENTRY_REPLACE was triggered.
	 */
	err = nsim_fib_account(&data->ipv6.fib, false);
	if (err)
		return err;

	err = rhashtable_replace_fast(&data->fib_rt_ht,
				      &fib6_rt_old->common.ht_node,
				      &fib6_rt->common.ht_node,
				      nsim_fib_rt_ht_params);
	if (err)
		return err;

	msleep(1);
	nsim_fib6_rt_hw_flags_set(data, fib6_rt, true);

	nsim_fib6_rt_hw_flags_set(data, fib6_rt_old, false);
	nsim_fib6_rt_destroy(fib6_rt_old);

	return 0;
}

static int nsim_fib6_rt_insert(struct nsim_fib_data *data,
			       struct nsim_fib6_event *fib6_event)
{
	struct fib6_info *rt = fib6_event->rt_arr[0];
	struct nsim_fib6_rt *fib6_rt, *fib6_rt_old;
	int err;

	if (data->fail_route_offload) {
		/* For testing purposes, user set debugfs fail_route_offload
		 * value to true. Simulate hardware programming latency and
		 * then fail.
		 */
		msleep(1);
		return -EINVAL;
	}

	fib6_rt = nsim_fib6_rt_create(data, fib6_event->rt_arr,
				      fib6_event->nrt6);
	if (IS_ERR(fib6_rt))
		return PTR_ERR(fib6_rt);

	fib6_rt_old = nsim_fib6_rt_lookup(&data->fib_rt_ht, rt);
	if (!fib6_rt_old)
		err = nsim_fib6_rt_add(data, fib6_rt);
	else
		err = nsim_fib6_rt_replace(data, fib6_rt, fib6_rt_old);

	if (err)
		nsim_fib6_rt_destroy(fib6_rt);

	return err;
}

static void nsim_fib6_rt_remove(struct nsim_fib_data *data,
				struct nsim_fib6_event *fib6_event)
{
	struct fib6_info *rt = fib6_event->rt_arr[0];
	struct nsim_fib6_rt *fib6_rt;
	int i;

	/* Multipath routes are first added to the FIB trie and only then
	 * notified. If we vetoed the addition, we will get a delete
	 * notification for a route we do not have. Therefore, do not warn if
	 * route was not found.
	 */
	fib6_rt = nsim_fib6_rt_lookup(&data->fib_rt_ht, rt);
	if (!fib6_rt)
		return;

	/* If not all the nexthops are deleted, then only reduce the nexthop
	 * group.
	 */
	if (fib6_event->nrt6 != fib6_rt->nhs) {
		for (i = 0; i < fib6_event->nrt6; i++)
			nsim_fib6_rt_nh_del(fib6_rt, fib6_event->rt_arr[i]);
		return;
	}

	rhashtable_remove_fast(&data->fib_rt_ht, &fib6_rt->common.ht_node,
			       nsim_fib_rt_ht_params);
	nsim_fib6_rt_destroy(fib6_rt);
}

static int nsim_fib6_event_init(struct nsim_fib6_event *fib6_event,
				struct fib6_entry_notifier_info *fen6_info)
{
	struct fib6_info *rt = fen6_info->rt;
	struct fib6_info **rt_arr;
	struct fib6_info *iter;
	unsigned int nrt6;
	int i = 0;

	nrt6 = fen6_info->nsiblings + 1;

	rt_arr = kcalloc(nrt6, sizeof(struct fib6_info *), GFP_ATOMIC);
	if (!rt_arr)
		return -ENOMEM;

	fib6_event->rt_arr = rt_arr;
	fib6_event->nrt6 = nrt6;

	rt_arr[0] = rt;
	fib6_info_hold(rt);

	if (!fen6_info->nsiblings)
		return 0;

	list_for_each_entry(iter, &rt->fib6_siblings, fib6_siblings) {
		if (i == fen6_info->nsiblings)
			break;

		rt_arr[i + 1] = iter;
		fib6_info_hold(iter);
		i++;
	}
	WARN_ON_ONCE(i != fen6_info->nsiblings);

	return 0;
}

static void nsim_fib6_event_fini(struct nsim_fib6_event *fib6_event)
{
	int i;

	for (i = 0; i < fib6_event->nrt6; i++)
		nsim_rt6_release(fib6_event->rt_arr[i]);
	kfree(fib6_event->rt_arr);
}

static int nsim_fib6_event(struct nsim_fib_data *data,
			   struct nsim_fib6_event *fib6_event,
			   unsigned long event)
{
	int err;

	if (fib6_event->rt_arr[0]->fib6_src.plen)
		return 0;

	switch (event) {
	case FIB_EVENT_ENTRY_REPLACE:
		err = nsim_fib6_rt_insert(data, fib6_event);
		if (err)
			goto err_rt_offload_failed_flag_set;
		break;
	case FIB_EVENT_ENTRY_APPEND:
		err = nsim_fib6_rt_append(data, fib6_event);
		if (err)
			goto err_rt_offload_failed_flag_set;
		break;
	case FIB_EVENT_ENTRY_DEL:
		nsim_fib6_rt_remove(data, fib6_event);
		break;
	default:
		break;
	}

	return 0;

err_rt_offload_failed_flag_set:
	nsim_fib6_rt_offload_failed_flag_set(data, fib6_event->rt_arr,
					     fib6_event->nrt6);
	return err;
}
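/* Route events are handled in two phases. The prepare step runs in the
 * atomic notifier context (nsim_fib4_prepare_event() and
 * nsim_fib6_prepare_event() below): it does the accounting, takes
 * references and can veto the change. The apply step runs later from the
 * fib_event_work workqueue via nsim_fib_event(), where sleeping (e.g. the
 * simulated msleep() programming latency) is allowed.
 */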
static void nsim_fib_event(struct nsim_fib_event *fib_event)
{
	switch (fib_event->family) {
	case AF_INET:
		nsim_fib4_event(fib_event->data, &fib_event->fen_info,
				fib_event->event);
		fib_info_put(fib_event->fen_info.fi);
		break;
	case AF_INET6:
		nsim_fib6_event(fib_event->data, &fib_event->fib6_event,
				fib_event->event);
		nsim_fib6_event_fini(&fib_event->fib6_event);
		break;
	}
}

static int nsim_fib4_prepare_event(struct fib_notifier_info *info,
				   struct nsim_fib_event *fib_event,
				   unsigned long event)
{
	struct nsim_fib_data *data = fib_event->data;
	struct fib_entry_notifier_info *fen_info;
	struct netlink_ext_ack *extack;
	int err = 0;

	fen_info = container_of(info, struct fib_entry_notifier_info,
				info);
	fib_event->fen_info = *fen_info;
	extack = info->extack;

	switch (event) {
	case FIB_EVENT_ENTRY_REPLACE:
		err = nsim_fib_account(&data->ipv4.fib, true);
		if (err) {
			NL_SET_ERR_MSG_MOD(extack, "Exceeded number of supported fib entries");
			return err;
		}
		break;
	case FIB_EVENT_ENTRY_DEL:
		if (data->fail_route_delete) {
			NL_SET_ERR_MSG_MOD(extack, "Failed to process route deletion");
			return -EINVAL;
		}
		nsim_fib_account(&data->ipv4.fib, false);
		break;
	}

	/* Take reference on fib_info to prevent it from being
	 * freed while event is queued. Release it afterwards.
	 */
	fib_info_hold(fib_event->fen_info.fi);

	return 0;
}

static int nsim_fib6_prepare_event(struct fib_notifier_info *info,
				   struct nsim_fib_event *fib_event,
				   unsigned long event)
{
	struct nsim_fib_data *data = fib_event->data;
	struct fib6_entry_notifier_info *fen6_info;
	struct netlink_ext_ack *extack;
	int err = 0;

	fen6_info = container_of(info, struct fib6_entry_notifier_info,
				 info);

	err = nsim_fib6_event_init(&fib_event->fib6_event, fen6_info);
	if (err)
		return err;

	extack = info->extack;
	switch (event) {
	case FIB_EVENT_ENTRY_REPLACE:
		err = nsim_fib_account(&data->ipv6.fib, true);
		if (err) {
			NL_SET_ERR_MSG_MOD(extack, "Exceeded number of supported fib entries");
			goto err_fib6_event_fini;
		}
		break;
	case FIB_EVENT_ENTRY_DEL:
		if (data->fail_route_delete) {
			err = -EINVAL;
			NL_SET_ERR_MSG_MOD(extack, "Failed to process route deletion");
			goto err_fib6_event_fini;
		}
		nsim_fib_account(&data->ipv6.fib, false);
		break;
	}

	return 0;

err_fib6_event_fini:
	nsim_fib6_event_fini(&fib_event->fib6_event);
	return err;
}
static int nsim_fib_event_schedule_work(struct nsim_fib_data *data,
					struct fib_notifier_info *info,
					unsigned long event)
{
	struct nsim_fib_event *fib_event;
	int err;

	if (info->family != AF_INET && info->family != AF_INET6)
		/* netdevsim does not support 'RTNL_FAMILY_IP6MR' and
		 * 'RTNL_FAMILY_IPMR' and should ignore them.
		 */
		return NOTIFY_DONE;

	fib_event = kzalloc(sizeof(*fib_event), GFP_ATOMIC);
	if (!fib_event)
		goto err_fib_event_alloc;

	fib_event->data = data;
	fib_event->event = event;
	fib_event->family = info->family;

	switch (info->family) {
	case AF_INET:
		err = nsim_fib4_prepare_event(info, fib_event, event);
		break;
	case AF_INET6:
		err = nsim_fib6_prepare_event(info, fib_event, event);
		break;
	}

	if (err)
		goto err_fib_prepare_event;

	/* Enqueue the event and trigger the work */
	spin_lock_bh(&data->fib_event_queue_lock);
	list_add_tail(&fib_event->list, &data->fib_event_queue);
	spin_unlock_bh(&data->fib_event_queue_lock);
	schedule_work(&data->fib_event_work);

	return NOTIFY_DONE;

err_fib_prepare_event:
	kfree(fib_event);
err_fib_event_alloc:
	if (event == FIB_EVENT_ENTRY_DEL)
		schedule_work(&data->fib_flush_work);
	return NOTIFY_BAD;
}

static int nsim_fib_event_nb(struct notifier_block *nb, unsigned long event,
			     void *ptr)
{
	struct nsim_fib_data *data = container_of(nb, struct nsim_fib_data,
						  fib_nb);
	struct fib_notifier_info *info = ptr;
	int err;

	switch (event) {
	case FIB_EVENT_RULE_ADD:
	case FIB_EVENT_RULE_DEL:
		err = nsim_fib_rule_event(data, info,
					  event == FIB_EVENT_RULE_ADD);
		return notifier_from_errno(err);
	case FIB_EVENT_ENTRY_REPLACE:
	case FIB_EVENT_ENTRY_APPEND:
	case FIB_EVENT_ENTRY_DEL:
		return nsim_fib_event_schedule_work(data, info, event);
	}

	return NOTIFY_DONE;
}

static void nsim_fib4_rt_free(struct nsim_fib_rt *fib_rt,
			      struct nsim_fib_data *data)
{
	struct devlink *devlink = data->devlink;
	struct nsim_fib4_rt *fib4_rt;

	fib4_rt = container_of(fib_rt, struct nsim_fib4_rt, common);
	nsim_fib4_rt_hw_flags_set(devlink_net(devlink), fib4_rt, false);
	nsim_fib_account(&data->ipv4.fib, false);
	nsim_fib4_rt_destroy(fib4_rt);
}

static void nsim_fib6_rt_free(struct nsim_fib_rt *fib_rt,
			      struct nsim_fib_data *data)
{
	struct nsim_fib6_rt *fib6_rt;

	fib6_rt = container_of(fib_rt, struct nsim_fib6_rt, common);
	nsim_fib6_rt_hw_flags_set(data, fib6_rt, false);
	nsim_fib_account(&data->ipv6.fib, false);
	nsim_fib6_rt_destroy(fib6_rt);
}

static void nsim_fib_rt_free(void *ptr, void *arg)
{
	struct nsim_fib_rt *fib_rt = ptr;
	struct nsim_fib_data *data = arg;

	switch (fib_rt->key.family) {
	case AF_INET:
		nsim_fib4_rt_free(fib_rt, data);
		break;
	case AF_INET6:
		nsim_fib6_rt_free(fib_rt, data);
		break;
	default:
		WARN_ON_ONCE(1);
	}
}

/* inconsistent dump, trying again */
static void nsim_fib_dump_inconsistent(struct notifier_block *nb)
{
	struct nsim_fib_data *data = container_of(nb, struct nsim_fib_data,
						  fib_nb);
	struct nsim_fib_rt *fib_rt, *fib_rt_tmp;

	/* Flush the work to make sure there is no race with notifications. */
	flush_work(&data->fib_event_work);

	/* The notifier block is still not registered, so we do not need to
	 * take any locks here.
	 */
	list_for_each_entry_safe(fib_rt, fib_rt_tmp, &data->fib_rt_list,
				 list) {
		rhashtable_remove_fast(&data->fib_rt_ht, &fib_rt->ht_node,
				       nsim_fib_rt_ht_params);
		nsim_fib_rt_free(fib_rt, data);
	}

	atomic64_set(&data->ipv4.rules.num, 0ULL);
	atomic64_set(&data->ipv6.rules.num, 0ULL);
}
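/* Nexthop occupancy, computed in nsim_nexthop_create() below: a single
 * nexthop occupies one entry, a group occupies the sum of its weights
 * (e.g. a two-path group with weights 3 and 5 occupies 8 entries), and a
 * resilient group occupies one entry per bucket.
 */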
static struct nsim_nexthop *nsim_nexthop_create(struct nsim_fib_data *data,
						struct nh_notifier_info *info)
{
	struct nsim_nexthop *nexthop;
	u64 occ = 0;
	int i;

	nexthop = kzalloc(sizeof(*nexthop), GFP_KERNEL);
	if (!nexthop)
		return ERR_PTR(-ENOMEM);

	nexthop->id = info->id;

	/* Determine the number of nexthop entries the new nexthop will
	 * occupy.
	 */
	switch (info->type) {
	case NH_NOTIFIER_INFO_TYPE_SINGLE:
		occ = 1;
		break;
	case NH_NOTIFIER_INFO_TYPE_GRP:
		for (i = 0; i < info->nh_grp->num_nh; i++)
			occ += info->nh_grp->nh_entries[i].weight;
		break;
	case NH_NOTIFIER_INFO_TYPE_RES_TABLE:
		occ = info->nh_res_table->num_nh_buckets;
		nexthop->is_resilient = true;
		break;
	default:
		NL_SET_ERR_MSG_MOD(info->extack, "Unsupported nexthop type");
		kfree(nexthop);
		return ERR_PTR(-EOPNOTSUPP);
	}

	nexthop->occ = occ;
	return nexthop;
}

static void nsim_nexthop_destroy(struct nsim_nexthop *nexthop)
{
	kfree(nexthop);
}

static int nsim_nexthop_account(struct nsim_fib_data *data, u64 occ,
				bool add, struct netlink_ext_ack *extack)
{
	int i, err = 0;

	if (add) {
		for (i = 0; i < occ; i++)
			if (!atomic64_add_unless(&data->nexthops.num, 1,
						 data->nexthops.max)) {
				err = -ENOSPC;
				NL_SET_ERR_MSG_MOD(extack, "Exceeded number of supported nexthops");
				goto err_num_decrease;
			}
	} else {
		if (WARN_ON(occ > atomic64_read(&data->nexthops.num)))
			return -EINVAL;
		atomic64_sub(occ, &data->nexthops.num);
	}

	return err;

err_num_decrease:
	atomic64_sub(i, &data->nexthops.num);
	return err;
}

static void nsim_nexthop_hw_flags_set(struct net *net,
				      const struct nsim_nexthop *nexthop,
				      bool trap)
{
	int i;

	nexthop_set_hw_flags(net, nexthop->id, false, trap);

	if (!nexthop->is_resilient)
		return;

	for (i = 0; i < nexthop->occ; i++)
		nexthop_bucket_set_hw_flags(net, nexthop->id, i, false, trap);
}

static int nsim_nexthop_add(struct nsim_fib_data *data,
			    struct nsim_nexthop *nexthop,
			    struct netlink_ext_ack *extack)
{
	struct net *net = devlink_net(data->devlink);
	int err;

	err = nsim_nexthop_account(data, nexthop->occ, true, extack);
	if (err)
		return err;

	err = rhashtable_insert_fast(&data->nexthop_ht, &nexthop->ht_node,
				     nsim_nexthop_ht_params);
	if (err) {
		NL_SET_ERR_MSG_MOD(extack, "Failed to insert nexthop");
		goto err_nexthop_dismiss;
	}

	nsim_nexthop_hw_flags_set(net, nexthop, true);

	return 0;

err_nexthop_dismiss:
	nsim_nexthop_account(data, nexthop->occ, false, extack);
	return err;
}

static int nsim_nexthop_replace(struct nsim_fib_data *data,
				struct nsim_nexthop *nexthop,
				struct nsim_nexthop *nexthop_old,
				struct netlink_ext_ack *extack)
{
	struct net *net = devlink_net(data->devlink);
	int err;

	err = nsim_nexthop_account(data, nexthop->occ, true, extack);
	if (err)
		return err;

	err = rhashtable_replace_fast(&data->nexthop_ht,
				      &nexthop_old->ht_node, &nexthop->ht_node,
				      nsim_nexthop_ht_params);
	if (err) {
		NL_SET_ERR_MSG_MOD(extack, "Failed to replace nexthop");
		goto err_nexthop_dismiss;
	}

	nsim_nexthop_hw_flags_set(net, nexthop, true);
	nsim_nexthop_account(data, nexthop_old->occ, false, extack);
	nsim_nexthop_destroy(nexthop_old);

	return 0;

err_nexthop_dismiss:
	nsim_nexthop_account(data, nexthop->occ, false, extack);
	return err;
}

static int nsim_nexthop_insert(struct nsim_fib_data *data,
			       struct nh_notifier_info *info)
{
	struct nsim_nexthop *nexthop, *nexthop_old;
	int err;

	nexthop = nsim_nexthop_create(data, info);
	if (IS_ERR(nexthop))
		return PTR_ERR(nexthop);

	nexthop_old = rhashtable_lookup_fast(&data->nexthop_ht, &info->id,
					     nsim_nexthop_ht_params);
	if (!nexthop_old)
		err = nsim_nexthop_add(data, nexthop, info->extack);
	else
		err = nsim_nexthop_replace(data, nexthop, nexthop_old,
					   info->extack);

	if (err)
		nsim_nexthop_destroy(nexthop);

	return err;
}
static void nsim_nexthop_remove(struct nsim_fib_data *data,
				struct nh_notifier_info *info)
{
	struct nsim_nexthop *nexthop;

	nexthop = rhashtable_lookup_fast(&data->nexthop_ht, &info->id,
					 nsim_nexthop_ht_params);
	if (!nexthop)
		return;

	rhashtable_remove_fast(&data->nexthop_ht, &nexthop->ht_node,
			       nsim_nexthop_ht_params);
	nsim_nexthop_account(data, nexthop->occ, false, info->extack);
	nsim_nexthop_destroy(nexthop);
}

static int nsim_nexthop_res_table_pre_replace(struct nsim_fib_data *data,
					      struct nh_notifier_info *info)
{
	if (data->fail_res_nexthop_group_replace) {
		NL_SET_ERR_MSG_MOD(info->extack, "Failed to replace a resilient nexthop group");
		return -EINVAL;
	}

	return 0;
}

static int nsim_nexthop_bucket_replace(struct nsim_fib_data *data,
				       struct nh_notifier_info *info)
{
	if (data->fail_nexthop_bucket_replace) {
		NL_SET_ERR_MSG_MOD(info->extack, "Failed to replace nexthop bucket");
		return -EINVAL;
	}

	nexthop_bucket_set_hw_flags(info->net, info->id,
				    info->nh_res_bucket->bucket_index,
				    false, true);

	return 0;
}

static int nsim_nexthop_event_nb(struct notifier_block *nb,
				 unsigned long event, void *ptr)
{
	struct nsim_fib_data *data = container_of(nb, struct nsim_fib_data,
						  nexthop_nb);
	struct nh_notifier_info *info = ptr;
	int err = 0;

	mutex_lock(&data->nh_lock);
	switch (event) {
	case NEXTHOP_EVENT_REPLACE:
		err = nsim_nexthop_insert(data, info);
		break;
	case NEXTHOP_EVENT_DEL:
		nsim_nexthop_remove(data, info);
		break;
	case NEXTHOP_EVENT_RES_TABLE_PRE_REPLACE:
		err = nsim_nexthop_res_table_pre_replace(data, info);
		break;
	case NEXTHOP_EVENT_BUCKET_REPLACE:
		err = nsim_nexthop_bucket_replace(data, info);
		break;
	default:
		break;
	}

	mutex_unlock(&data->nh_lock);
	return notifier_from_errno(err);
}

static void nsim_nexthop_free(void *ptr, void *arg)
{
	struct nsim_nexthop *nexthop = ptr;
	struct nsim_fib_data *data = arg;
	struct net *net;

	net = devlink_net(data->devlink);
	nsim_nexthop_hw_flags_set(net, nexthop, false);
	nsim_nexthop_account(data, nexthop->occ, false, NULL);
	nsim_nexthop_destroy(nexthop);
}

static ssize_t nsim_nexthop_bucket_activity_write(struct file *file,
						  const char __user *user_buf,
						  size_t size, loff_t *ppos)
{
	struct nsim_fib_data *data = file->private_data;
	struct net *net = devlink_net(data->devlink);
	struct nsim_nexthop *nexthop;
	unsigned long *activity;
	loff_t pos = *ppos;
	u16 bucket_index;
	char buf[128];
	int err = 0;
	u32 nhid;

	if (pos != 0)
		return -EINVAL;
	if (size > sizeof(buf) - 1)
		return -EINVAL;
	if (copy_from_user(buf, user_buf, size))
		return -EFAULT;
	buf[size] = 0;

	if (sscanf(buf, "%u %hu", &nhid, &bucket_index) != 2)
		return -EINVAL;

	rtnl_lock();

	nexthop = rhashtable_lookup_fast(&data->nexthop_ht, &nhid,
					 nsim_nexthop_ht_params);
	if (!nexthop || !nexthop->is_resilient ||
	    bucket_index >= nexthop->occ) {
		err = -EINVAL;
		goto out;
	}

	activity = bitmap_zalloc(nexthop->occ, GFP_KERNEL);
	if (!activity) {
		err = -ENOMEM;
		goto out;
	}

	bitmap_set(activity, bucket_index, 1);
	nexthop_res_grp_activity_update(net, nhid, nexthop->occ, activity);
	bitmap_free(activity);

out:
	rtnl_unlock();
	*ppos = size;
	return err ?: size;
}

static const struct file_operations nsim_nexthop_bucket_activity_fops = {
	.open = simple_open,
	.write = nsim_nexthop_bucket_activity_write,
	.owner = THIS_MODULE,
};
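/* Illustrative usage of the write-only debugfs file above, assuming the
 * default netdevsim debugfs root. The expected input format is
 * "<nexthop-id> <bucket-index>", matching the sscanf() in the handler:
 *
 *   echo "10 3" > /sys/kernel/debug/netdevsim/netdevsim10/fib/nexthop_bucket_activity
 *
 * This reports bucket 3 of resilient nexthop group 10 as active via
 * nexthop_res_grp_activity_update().
 */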
static u64 nsim_fib_ipv4_resource_occ_get(void *priv)
{
	struct nsim_fib_data *data = priv;

	return nsim_fib_get_val(data, NSIM_RESOURCE_IPV4_FIB, false);
}

static u64 nsim_fib_ipv4_rules_res_occ_get(void *priv)
{
	struct nsim_fib_data *data = priv;

	return nsim_fib_get_val(data, NSIM_RESOURCE_IPV4_FIB_RULES, false);
}

static u64 nsim_fib_ipv6_resource_occ_get(void *priv)
{
	struct nsim_fib_data *data = priv;

	return nsim_fib_get_val(data, NSIM_RESOURCE_IPV6_FIB, false);
}

static u64 nsim_fib_ipv6_rules_res_occ_get(void *priv)
{
	struct nsim_fib_data *data = priv;

	return nsim_fib_get_val(data, NSIM_RESOURCE_IPV6_FIB_RULES, false);
}

static u64 nsim_fib_nexthops_res_occ_get(void *priv)
{
	struct nsim_fib_data *data = priv;

	return nsim_fib_get_val(data, NSIM_RESOURCE_NEXTHOPS, false);
}

static void nsim_fib_set_max_all(struct nsim_fib_data *data,
				 struct devlink *devlink)
{
	static const enum nsim_resource_id res_ids[] = {
		NSIM_RESOURCE_IPV4_FIB, NSIM_RESOURCE_IPV4_FIB_RULES,
		NSIM_RESOURCE_IPV6_FIB, NSIM_RESOURCE_IPV6_FIB_RULES,
		NSIM_RESOURCE_NEXTHOPS,
	};
	int i;

	for (i = 0; i < ARRAY_SIZE(res_ids); i++) {
		int err;
		u64 val;

		err = devl_resource_size_get(devlink, res_ids[i], &val);
		if (err)
			val = (u64) -1;
		nsim_fib_set_max(data, res_ids[i], val);
	}
}

static void nsim_fib_event_work(struct work_struct *work)
{
	struct nsim_fib_data *data = container_of(work, struct nsim_fib_data,
						  fib_event_work);
	struct nsim_fib_event *fib_event, *next_fib_event;

	LIST_HEAD(fib_event_queue);

	spin_lock_bh(&data->fib_event_queue_lock);
	list_splice_init(&data->fib_event_queue, &fib_event_queue);
	spin_unlock_bh(&data->fib_event_queue_lock);

	mutex_lock(&data->fib_lock);
	list_for_each_entry_safe(fib_event, next_fib_event, &fib_event_queue,
				 list) {
		nsim_fib_event(fib_event);
		list_del(&fib_event->list);
		kfree(fib_event);
		cond_resched();
	}
	mutex_unlock(&data->fib_lock);
}

static void nsim_fib_flush_work(struct work_struct *work)
{
	struct nsim_fib_data *data = container_of(work, struct nsim_fib_data,
						  fib_flush_work);
	struct nsim_fib_rt *fib_rt, *fib_rt_tmp;

	/* Process pending work. */
	flush_work(&data->fib_event_work);

	mutex_lock(&data->fib_lock);
	list_for_each_entry_safe(fib_rt, fib_rt_tmp, &data->fib_rt_list,
				 list) {
		rhashtable_remove_fast(&data->fib_rt_ht, &fib_rt->ht_node,
				       nsim_fib_rt_ht_params);
		nsim_fib_rt_free(fib_rt, data);
	}
	mutex_unlock(&data->fib_lock);
}

static int
nsim_fib_debugfs_init(struct nsim_fib_data *data, struct nsim_dev *nsim_dev)
{
	data->ddir = debugfs_create_dir("fib", nsim_dev->ddir);
	if (IS_ERR(data->ddir))
		return PTR_ERR(data->ddir);

	data->fail_route_offload = false;
	debugfs_create_bool("fail_route_offload", 0600, data->ddir,
			    &data->fail_route_offload);

	data->fail_res_nexthop_group_replace = false;
	debugfs_create_bool("fail_res_nexthop_group_replace", 0600, data->ddir,
			    &data->fail_res_nexthop_group_replace);

	data->fail_nexthop_bucket_replace = false;
	debugfs_create_bool("fail_nexthop_bucket_replace", 0600, data->ddir,
			    &data->fail_nexthop_bucket_replace);

	debugfs_create_file("nexthop_bucket_activity", 0200, data->ddir,
			    data, &nsim_nexthop_bucket_activity_fops);

	data->fail_route_delete = false;
	debugfs_create_bool("fail_route_delete", 0600, data->ddir,
			    &data->fail_route_delete);

	return 0;
}

static void nsim_fib_debugfs_exit(struct nsim_fib_data *data)
{
	debugfs_remove_recursive(data->ddir);
}
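/* Ordering note for init/teardown below: nsim_fib_create() registers the
 * nexthop and FIB notifiers only after the hashtables and work items are
 * ready, and nsim_fib_destroy() unregisters them first and then flushes
 * the work items, so no notification can race with the teardown of the
 * hashtables.
 */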
struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
				      struct netlink_ext_ack *extack)
{
	struct nsim_fib_data *data;
	struct nsim_dev *nsim_dev;
	int err;

	data = kzalloc(sizeof(*data), GFP_KERNEL);
	if (!data)
		return ERR_PTR(-ENOMEM);
	data->devlink = devlink;

	nsim_dev = devlink_priv(devlink);
	err = nsim_fib_debugfs_init(data, nsim_dev);
	if (err)
		goto err_data_free;

	mutex_init(&data->nh_lock);
	err = rhashtable_init(&data->nexthop_ht, &nsim_nexthop_ht_params);
	if (err)
		goto err_debugfs_exit;

	mutex_init(&data->fib_lock);
	INIT_LIST_HEAD(&data->fib_rt_list);
	err = rhashtable_init(&data->fib_rt_ht, &nsim_fib_rt_ht_params);
	if (err)
		goto err_rhashtable_nexthop_destroy;

	INIT_WORK(&data->fib_event_work, nsim_fib_event_work);
	INIT_WORK(&data->fib_flush_work, nsim_fib_flush_work);
	INIT_LIST_HEAD(&data->fib_event_queue);
	spin_lock_init(&data->fib_event_queue_lock);

	nsim_fib_set_max_all(data, devlink);

	data->nexthop_nb.notifier_call = nsim_nexthop_event_nb;
	err = register_nexthop_notifier(devlink_net(devlink),
					&data->nexthop_nb, extack);
	if (err) {
		pr_err("Failed to register nexthop notifier\n");
		goto err_rhashtable_fib_destroy;
	}

	data->fib_nb.notifier_call = nsim_fib_event_nb;
	err = register_fib_notifier(devlink_net(devlink), &data->fib_nb,
				    nsim_fib_dump_inconsistent, extack);
	if (err) {
		pr_err("Failed to register fib notifier\n");
		goto err_nexthop_nb_unregister;
	}

	devl_resource_occ_get_register(devlink,
				       NSIM_RESOURCE_IPV4_FIB,
				       nsim_fib_ipv4_resource_occ_get,
				       data);
	devl_resource_occ_get_register(devlink,
				       NSIM_RESOURCE_IPV4_FIB_RULES,
				       nsim_fib_ipv4_rules_res_occ_get,
				       data);
	devl_resource_occ_get_register(devlink,
				       NSIM_RESOURCE_IPV6_FIB,
				       nsim_fib_ipv6_resource_occ_get,
				       data);
	devl_resource_occ_get_register(devlink,
				       NSIM_RESOURCE_IPV6_FIB_RULES,
				       nsim_fib_ipv6_rules_res_occ_get,
				       data);
	devl_resource_occ_get_register(devlink,
				       NSIM_RESOURCE_NEXTHOPS,
				       nsim_fib_nexthops_res_occ_get,
				       data);

	return data;

err_nexthop_nb_unregister:
	unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
err_rhashtable_fib_destroy:
	cancel_work_sync(&data->fib_flush_work);
	flush_work(&data->fib_event_work);
	rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
				    data);
err_rhashtable_nexthop_destroy:
	rhashtable_free_and_destroy(&data->nexthop_ht, nsim_nexthop_free,
				    data);
	mutex_destroy(&data->fib_lock);
err_debugfs_exit:
	mutex_destroy(&data->nh_lock);
	nsim_fib_debugfs_exit(data);
err_data_free:
	kfree(data);
	return ERR_PTR(err);
}

void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
{
	devl_resource_occ_get_unregister(devlink,
					 NSIM_RESOURCE_NEXTHOPS);
	devl_resource_occ_get_unregister(devlink,
					 NSIM_RESOURCE_IPV6_FIB_RULES);
	devl_resource_occ_get_unregister(devlink,
					 NSIM_RESOURCE_IPV6_FIB);
	devl_resource_occ_get_unregister(devlink,
					 NSIM_RESOURCE_IPV4_FIB_RULES);
	devl_resource_occ_get_unregister(devlink,
					 NSIM_RESOURCE_IPV4_FIB);
	unregister_fib_notifier(devlink_net(devlink), &data->fib_nb);
	unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
	cancel_work_sync(&data->fib_flush_work);
	flush_work(&data->fib_event_work);
	rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
				    data);
	rhashtable_free_and_destroy(&data->nexthop_ht, nsim_nexthop_free,
				    data);
	WARN_ON_ONCE(!list_empty(&data->fib_event_queue));
	WARN_ON_ONCE(!list_empty(&data->fib_rt_list));
	mutex_destroy(&data->fib_lock);
	mutex_destroy(&data->nh_lock);
	nsim_fib_debugfs_exit(data);
	kfree(data);
}
3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 
3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 
4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 5106 5107 5108 5109 5110 5111 5112 5113 5114 5115 5116 5117 5118 5119 5120 5121 5122 5123 5124 5125 5126 5127 5128 5129 5130 5131 5132 5133 5134 5135 5136 5137 5138 5139 5140 5141 5142 5143 5144 5145 5146 5147 5148 5149 5150 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 
5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5250 5251 5252 5253 5254 5255 5256 5257 5258 5259 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5303 5304 5305 5306 5307 5308 5309 5310 5311 5312 5313 5314 5315 5316 5317 5318 5319 5320 5321 5322 5323 5324 5325 5326 5327 5328 5329 5330 5331 5332 5333 5334 5335 5336 5337 5338 5339 5340 5341 5342 5343 5344 5345 5346 5347 5348 5349 5350 5351 5352 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378 5379 5380 5381 5382 5383 5384 5385 5386 5387 5388 5389 5390 5391 5392 5393 5394 5395 5396 5397 5398 5399 5400 5401 5402 5403 5404 5405 5406 5407 5408 5409 5410 5411 5412 5413 5414 5415 5416 5417 5418 5419 5420 5421 5422 5423 5424 5425 5426 5427 5428 5429 5430 5431 5432 5433 5434 5435 5436 5437 5438 5439 5440 5441 5442 5443 5444 5445 5446 5447 5448 5449 5450 5451 5452 5453 5454 5455 5456 5457 5458 5459 5460 5461 5462 5463 5464 5465 5466 5467 5468 5469 5470 5471 5472 5473 5474 5475 5476 5477 5478 5479 5480 5481 5482 5483 5484 5485 5486 5487 5488 5489 5490 5491 5492 5493 5494 5495 5496 5497 5498 5499 5500 5501 5502 5503 5504 5505 5506 5507 5508 5509 5510 5511 5512 5513 5514 5515 5516 5517 5518 5519 5520 5521 5522 5523 5524 5525 5526 5527 5528 5529 5530 5531 5532 5533 5534 5535 5536 5537 5538 5539 5540 5541 5542 5543 5544 5545 5546 5547 5548 5549 5550 5551 5552 5553 5554 5555 5556 5557 5558 5559 5560 5561 5562 5563 5564 5565 5566 5567 5568 5569 5570 5571 5572 5573 5574 5575 5576 5577 5578 5579 5580 5581 5582 5583 5584 5585 5586 5587 5588 5589 5590 5591 5592 5593 5594 5595 5596 5597 5598 5599 5600 5601 5602 5603 5604 5605 5606 5607 5608 5609 5610 5611 5612 5613 5614 5615 5616 5617 5618 5619 5620 5621 5622 5623 5624 5625 5626 5627 5628 5629 5630 5631 5632 5633 5634 5635 5636 5637 5638 5639 5640 5641 5642 5643 5644 5645 5646 5647 5648 5649 5650 5651 5652 5653 5654 5655 5656 5657 5658 5659 5660 5661 5662 5663 5664 5665 5666 5667 5668 5669 5670 5671 5672 5673 5674 5675 5676 5677 5678 5679 5680 5681 5682 5683 5684 5685 5686 5687 5688 5689 5690 5691 5692 5693 5694 5695 5696 5697 5698 5699 5700 5701 5702 5703 5704 5705 5706 5707 5708 5709 5710 5711 5712 5713 5714 5715 5716 5717 5718 5719 5720 5721 5722 5723 5724 5725 5726 5727 5728 5729 5730 5731 5732 5733 5734 5735 5736 5737 5738 5739 5740 5741 5742 5743 5744 5745 5746 5747 5748 5749 5750 5751 5752 5753 5754 5755 5756 5757 5758 5759 5760 5761 5762 5763 5764 5765 5766 5767 5768 5769 5770 5771 5772 5773 5774 5775 5776 5777 5778 5779 5780 5781 5782 5783 5784 5785 5786 5787 5788 5789 5790 5791 5792 5793 5794 5795 5796 5797 5798 5799 5800 5801 5802 5803 5804 5805 5806 5807 5808 5809 5810 5811 5812 5813 5814 5815 5816 5817 5818 5819 5820 5821 5822 5823 5824 5825 5826 5827 5828 5829 5830 5831 5832 5833 5834 5835 5836 5837 5838 5839 5840 5841 5842 5843 5844 5845 5846 5847 5848 5849 5850 5851 5852 5853 5854 5855 5856 5857 5858 5859 5860 5861 5862 5863 5864 5865 5866 5867 5868 5869 5870 5871 5872 5873 5874 5875 5876 5877 5878 5879 5880 5881 5882 5883 5884 5885 5886 5887 5888 5889 5890 5891 5892 5893 5894 5895 5896 5897 5898 5899 5900 5901 5902 5903 5904 5905 5906 
// SPDX-License-Identifier: GPL-2.0
/*
 * This file contains work-arounds for many known PCI hardware bugs.
 * Devices present only on certain architectures (host bridges et cetera)
 * should be handled in arch-specific code.
 *
 * Note: any quirks for hotpluggable devices must _NOT_ be declared __init.
 *
 * Copyright (c) 1999 Martin Mares <mj@ucw.cz>
 *
 * Init/reset quirks for USB host controllers should be in the USB quirks
 * file, where their drivers can use them.
 */

#include <linux/aer.h>
#include <linux/align.h>
#include <linux/bitfield.h>
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/export.h>
#include <linux/pci.h>
#include <linux/isa-dma.h> /* isa_dma_bridge_buggy */
#include <linux/init.h>
#include <linux/delay.h>
#include <linux/acpi.h>
#include <linux/dmi.h>
#include <linux/ioport.h>
#include <linux/sched.h>
#include <linux/ktime.h>
#include <linux/mm.h>
#include <linux/nvme.h>
#include <linux/platform_data/x86/apple.h>
#include <linux/pm_runtime.h>
#include <linux/sizes.h>
#include <linux/suspend.h>
#include <linux/switchtec.h>
#include "pci.h"

static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta)
{
	unsigned long count;
	int ret;

	ret = pcie_lbms_count(dev, &count);
	if (ret < 0)
		return lnksta & PCI_EXP_LNKSTA_LBMS;

	return count > 0;
}
/*
 * Retrain the link of a downstream PCIe port by hand if necessary.
 *
 * This is needed at least where a downstream port of the ASMedia ASM2824
 * Gen 3 switch is wired to the upstream port of the Pericom PI7C9X2G304
 * Gen 2 switch, and observed with the Delock Riser Card PCI Express x1 >
 * 2 x PCIe x1 device, P/N 41433, plugged into the SiFive HiFive Unmatched
 * board.
 *
 * In such a configuration the switches are supposed to negotiate the link
 * speed of preferably 5.0GT/s, falling back to 2.5GT/s.  However the link
 * continues switching between the two speeds indefinitely and the data
 * link layer never reaches the active state, with link training reported
 * repeatedly active ~84% of the time.  Forcing the target link speed to
 * 2.5GT/s with the upstream ASM2824 device makes the two switches talk to
 * each other correctly however.  And more interestingly retraining with a
 * higher target link speed afterwards lets the two successfully negotiate
 * 5.0GT/s.
 *
 * With the ASM2824 we can rely on the otherwise optional Data Link Layer
 * Link Active status bit and in the failed link training scenario it will
 * be off along with the Link Bandwidth Management Status indicating that
 * hardware has changed the link speed or width in an attempt to correct
 * unreliable link operation.  For a port that has been left unconnected
 * both bits will be clear.  So use this information to detect the problem
 * rather than polling the Link Training bit and watching out for flips or
 * at least the active status.
 *
 * Since the exact nature of the problem isn't known and in principle this
 * could trigger where an ASM2824 device is downstream rather than upstream,
 * apply this erratum workaround to any downstream ports as long as they
 * support Link Active reporting and have the Link Control 2 register.
 * Restrict the speed to 2.5GT/s then with the Target Link Speed field,
 * request a retrain and check the result.
 *
 * If this turns out successful and we know by the Vendor:Device ID it is
 * safe to do so, then lift the restriction, letting the devices negotiate
 * a higher speed.  Also check for a similar 2.5GT/s speed restriction the
 * firmware may have already arranged and lift it with ports that already
 * report their data link being up.
 *
 * Otherwise revert the speed to the original setting and request a retrain
 * again to remove any residual state, ignoring the result as it's supposed
 * to fail anyway.
 *
 * Return 0 if the link has been successfully retrained.  Return an error
 * if retraining was not needed or we attempted a retrain and it failed.
 */
int pcie_failed_link_retrain(struct pci_dev *dev)
{
	static const struct pci_device_id ids[] = {
		{ PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
		{}
	};
	u16 lnksta, lnkctl2;
	int ret = -ENOTTY;

	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
		return ret;

	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
	if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
		u16 oldlnkctl2 = lnkctl2;

		pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");

		ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
		if (ret) {
			pci_info(dev, "retraining failed\n");
			pcie_set_target_speed(dev,
					      PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
					      true);
			return ret;
		}

		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
	}

	if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
	    pci_match_id(ids, dev)) {
		u32 lnkcap;

		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
		pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
		ret = pcie_set_target_speed(dev,
					    PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
		if (ret) {
			pci_info(dev, "retraining failed\n");
			return ret;
		}
	}

	return ret;
}

static ktime_t fixup_debug_start(struct pci_dev *dev,
				 void (*fn)(struct pci_dev *dev))
{
	if (initcall_debug)
		pci_info(dev, "calling %pS @ %i\n", fn, task_pid_nr(current));

	return ktime_get();
}

static void fixup_debug_report(struct pci_dev *dev, ktime_t calltime,
			       void (*fn)(struct pci_dev *dev))
{
	ktime_t delta, rettime;
	unsigned long long duration;

	rettime = ktime_get();
	delta = ktime_sub(rettime, calltime);
	duration = (unsigned long long) ktime_to_ns(delta) >> 10;
	if (initcall_debug || duration > 10000)
		pci_info(dev, "%pS took %lld usecs\n", fn, duration);
}

static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
			  struct pci_fixup *end)
{
	ktime_t calltime;

	for (; f < end; f++)
		if ((f->class == (u32) (dev->class >> f->class_shift) ||
		     f->class == (u32) PCI_ANY_ID) &&
		    (f->vendor == dev->vendor ||
		     f->vendor == (u16) PCI_ANY_ID) &&
		    (f->device == dev->device ||
		     f->device == (u16) PCI_ANY_ID)) {
			void (*hook)(struct pci_dev *dev);
#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
			hook = offset_to_ptr(&f->hook_offset);
#else
			hook = f->hook;
#endif
			calltime = fixup_debug_start(dev, hook);
			hook(dev);
			fixup_debug_report(dev, calltime, hook);
		}
}

extern struct pci_fixup __start_pci_fixups_early[];
extern struct pci_fixup __end_pci_fixups_early[];
extern struct pci_fixup __start_pci_fixups_header[];
extern struct pci_fixup __end_pci_fixups_header[];
extern struct pci_fixup __start_pci_fixups_final[];
extern struct pci_fixup __end_pci_fixups_final[];
extern struct pci_fixup __start_pci_fixups_enable[];
extern struct pci_fixup __end_pci_fixups_enable[];
extern struct pci_fixup __start_pci_fixups_resume[];
extern struct pci_fixup __end_pci_fixups_resume[];
extern struct pci_fixup __start_pci_fixups_resume_early[];
extern struct pci_fixup __end_pci_fixups_resume_early[];
extern struct pci_fixup __start_pci_fixups_suspend[];
extern struct pci_fixup __end_pci_fixups_suspend[];
extern struct pci_fixup __start_pci_fixups_suspend_late[];
extern struct pci_fixup __end_pci_fixups_suspend_late[];

static bool pci_apply_fixup_final_quirks;

void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev)
{
	struct pci_fixup *start, *end;

	switch (pass) {
	case pci_fixup_early:
		start = __start_pci_fixups_early;
		end = __end_pci_fixups_early;
		break;

	case pci_fixup_header:
		start = __start_pci_fixups_header;
		end = __end_pci_fixups_header;
		break;

	case pci_fixup_final:
		if (!pci_apply_fixup_final_quirks)
			return;
		start = __start_pci_fixups_final;
		end = __end_pci_fixups_final;
		break;

	case pci_fixup_enable:
		start = __start_pci_fixups_enable;
		end = __end_pci_fixups_enable;
		break;

	case pci_fixup_resume:
		start = __start_pci_fixups_resume;
		end = __end_pci_fixups_resume;
		break;

	case pci_fixup_resume_early:
		start = __start_pci_fixups_resume_early;
		end = __end_pci_fixups_resume_early;
		break;

	case pci_fixup_suspend:
		start = __start_pci_fixups_suspend;
		end = __end_pci_fixups_suspend;
		break;

	case pci_fixup_suspend_late:
		start = __start_pci_fixups_suspend_late;
		end = __end_pci_fixups_suspend_late;
		break;

	default:
		/* stupid compiler warning, you would think with an enum... */
		return;
	}
	pci_do_fixups(dev, start, end);
}
EXPORT_SYMBOL(pci_fixup_device);
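
/*
 * Illustration (hypothetical, not upstream code): a quirk hooks into the
 * pass machinery above through one of the DECLARE_PCI_FIXUP_* macros,
 * which drop a struct pci_fixup into the matching linker section that
 * pci_do_fixups() walks.  A minimal sketch, with placeholder IDs:
 *
 *	static void quirk_example(struct pci_dev *dev)
 *	{
 *		pci_info(dev, "example quirk applied\n");
 *	}
 *	DECLARE_PCI_FIXUP_FINAL(0x1234, 0x5678, quirk_example);
 *
 * The 0x1234:0x5678 vendor:device pair is a placeholder only.
 */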
static int __init pci_apply_final_quirks(void)
{
	struct pci_dev *dev = NULL;
	u8 cls = 0;
	u8 tmp;

	if (pci_cache_line_size)
		pr_info("PCI: CLS %u bytes\n", pci_cache_line_size << 2);

	pci_apply_fixup_final_quirks = true;
	for_each_pci_dev(dev) {
		pci_fixup_device(pci_fixup_final, dev);
		/*
		 * If arch hasn't set it explicitly yet, use the CLS
		 * value shared by all PCI devices.  If there's a
		 * mismatch, fall back to the default value.
		 */
		if (!pci_cache_line_size) {
			pci_read_config_byte(dev, PCI_CACHE_LINE_SIZE, &tmp);
			if (!cls)
				cls = tmp;
			if (!tmp || cls == tmp)
				continue;

			pci_info(dev, "CLS mismatch (%u != %u), using %u bytes\n",
			         cls << 2, tmp << 2,
			         pci_dfl_cache_line_size << 2);
			pci_cache_line_size = pci_dfl_cache_line_size;
		}
	}

	if (!pci_cache_line_size) {
		pr_info("PCI: CLS %u bytes, default %u\n", cls << 2,
			pci_dfl_cache_line_size << 2);
		pci_cache_line_size = cls ? cls : pci_dfl_cache_line_size;
	}

	return 0;
}
fs_initcall_sync(pci_apply_final_quirks);
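
/*
 * Note (editorial illustration): PCI_CACHE_LINE_SIZE holds the cache line
 * size in 32-bit dwords, which is why the messages above shift the raw
 * register value left by 2 to print bytes; a register value of 16, for
 * example, corresponds to a 64-byte cache line.
 */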
/*
 * Decoding should be disabled for a PCI device during BAR sizing to avoid
 * conflict. But doing so may cause problems on host bridge and perhaps other
 * key system devices. For devices that need to have mmio decoding always-on,
 * we need to set the dev->mmio_always_on bit.
 */
static void quirk_mmio_always_on(struct pci_dev *dev)
{
	dev->mmio_always_on = 1;
}
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
			      PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);

/*
 * The Mellanox Tavor device gives false positive parity errors.  Disable
 * parity error reporting.
 */
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_TAVOR, pci_disable_parity);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_TAVOR_BRIDGE, pci_disable_parity);

/*
 * Deal with broken BIOSes that neglect to enable passive release,
 * which can cause problems in combination with the 82441FX/PPro MTRRs
 */
static void quirk_passive_release(struct pci_dev *dev)
{
	struct pci_dev *d = NULL;
	unsigned char dlc;

	/*
	 * We have to make sure a particular bit is set in the PIIX3
	 * ISA bridge, so we have to go out and find it.
	 */
	while ((d = pci_get_device(PCI_VENDOR_ID_INTEL,
				   PCI_DEVICE_ID_INTEL_82371SB_0, d))) {
		pci_read_config_byte(d, 0x82, &dlc);
		if (!(dlc & 1<<1)) {
			pci_info(d, "PIIX3: Enabling Passive Release\n");
			dlc |= 1<<1;
			pci_write_config_byte(d, 0x82, dlc);
		}
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82441, quirk_passive_release);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82441, quirk_passive_release);

#ifdef CONFIG_X86_32
/*
 * The VIA VP2/VP3/MVP3 seem to have some 'features'. There may be a
 * workaround but VIA don't answer queries. If you happen to have good
 * contacts at VIA ask them for me please -- Alan
 *
 * This appears to be BIOS not version dependent. So presumably there is a
 * chipset level fix.
 */
static void quirk_isa_dma_hangs(struct pci_dev *dev)
{
	if (!isa_dma_bridge_buggy) {
		isa_dma_bridge_buggy = 1;
		pci_info(dev, "Activating ISA DMA hang workarounds\n");
	}
}

/*
 * It's not totally clear which chipsets are the problematic ones.  We know
 * 82C586 and 82C596 variants are affected.
 */
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C586_0, quirk_isa_dma_hangs);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C596, quirk_isa_dma_hangs);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, quirk_isa_dma_hangs);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AL, PCI_DEVICE_ID_AL_M1533, quirk_isa_dma_hangs);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_CBUS_1, quirk_isa_dma_hangs);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_CBUS_2, quirk_isa_dma_hangs);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_CBUS_3, quirk_isa_dma_hangs);
#endif

#ifdef CONFIG_HAS_IOPORT
/*
 * Intel NM10 "Tiger Point" LPC PM1a_STS.BM_STS must be clear
 * for some HT machines to use C4 w/o hanging.
 */
static void quirk_tigerpoint_bm_sts(struct pci_dev *dev)
{
	u32 pmbase;
	u16 pm1a;

	pci_read_config_dword(dev, 0x40, &pmbase);
	pmbase = pmbase & 0xff80;
	pm1a = inw(pmbase);

	if (pm1a & 0x10) {
		pci_info(dev, FW_BUG "Tiger Point LPC.BM_STS cleared\n");
		outw(0x10, pmbase);
	}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_TGP_LPC, quirk_tigerpoint_bm_sts);
#endif

/* Chipsets where PCI->PCI transfers vanish or hang */
static void quirk_nopcipci(struct pci_dev *dev)
{
	if ((pci_pci_problems & PCIPCI_FAIL) == 0) {
		pci_info(dev, "Disabling direct PCI/PCI transfers\n");
		pci_pci_problems |= PCIPCI_FAIL;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5597, quirk_nopcipci);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_496, quirk_nopcipci);

static void quirk_nopciamd(struct pci_dev *dev)
{
	u8 rev;

	pci_read_config_byte(dev, 0x08, &rev);
	if (rev == 0x13) {
		/* Erratum 24 */
		pci_info(dev, "Chipset erratum: Disabling direct PCI/AGP transfers\n");
		pci_pci_problems |= PCIAGP_FAIL;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8151_0, quirk_nopciamd);

/* Triton requires workarounds to be used by the drivers */
static void quirk_triton(struct pci_dev *dev)
{
	if ((pci_pci_problems & PCIPCI_TRITON) == 0) {
		pci_info(dev, "Limiting direct PCI/PCI transfers\n");
		pci_pci_problems |= PCIPCI_TRITON;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82437, quirk_triton);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82437VX, quirk_triton);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82439, quirk_triton);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82439TX, quirk_triton);
/*
 * VIA Apollo KT133 needs PCI latency patch
 * Made according to a Windows driver-based patch by George E. Breese;
 * see PCI Latency Adjust on http://www.viahardware.com/download/viatweak.shtm
 * Also see http://www.au-ja.org/review-kt133a-1-en.phtml for the info on
 * which Mr Breese based his work.
 *
 * Updated based on further information from the site and also on
 * information provided by VIA
 */
static void quirk_vialatency(struct pci_dev *dev)
{
	struct pci_dev *p;
	u8 busarb;

	/*
	 * Ok, we have a potential problem chipset here. Now see if we have
	 * a buggy southbridge.
	 */
	p = pci_get_device(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C686, NULL);
	if (p != NULL) {
		/*
		 * 0x40 - 0x4f == 686B, 0x10 - 0x2f == 686A;
		 * thanks Dan Hollis.
		 * Check for buggy part revisions
		 */
		if (p->revision < 0x40 || p->revision > 0x42)
			goto exit;
	} else {
		p = pci_get_device(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8231, NULL);
		if (p == NULL)	/* No problem parts */
			goto exit;

		/* Check for buggy part revisions */
		if (p->revision < 0x10 || p->revision > 0x12)
			goto exit;
	}

	/*
	 * Ok we have the problem. Now set the PCI master grant to occur
	 * every master grant. The apparent bug is that under high PCI load
	 * (quite common in Linux of course) you can get data loss when the
	 * CPU is held off the bus for 3 bus master requests.  This happens
	 * to include the IDE controllers....
	 *
	 * VIA only apply this fix when an SB Live! is present but under
	 * both Linux and Windows this isn't enough, and we have seen
	 * corruption without SB Live! but with things like 3 UDMA IDE
	 * controllers. So we ignore that bit of the VIA recommendation..
	 */
	pci_read_config_byte(dev, 0x76, &busarb);

	/*
	 * Set bits [5:4] of byte 0x76 to 01b:
	 * "Master priority rotation on every PCI master grant"
	 */
	busarb &= ~(1<<5);
	busarb |= (1<<4);
	pci_write_config_byte(dev, 0x76, busarb);
	pci_info(dev, "Applying VIA southbridge workaround\n");
exit:
	pci_dev_put(p);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8363_0, quirk_vialatency);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8371_1, quirk_vialatency);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8361, quirk_vialatency);
/* Must restore this on a resume from RAM */
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8363_0, quirk_vialatency);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8371_1, quirk_vialatency);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8361, quirk_vialatency);

/* VIA Apollo VP3 needs ETBF on BT848/878 */
static void quirk_viaetbf(struct pci_dev *dev)
{
	if ((pci_pci_problems & PCIPCI_VIAETBF) == 0) {
		pci_info(dev, "Limiting direct PCI/PCI transfers\n");
		pci_pci_problems |= PCIPCI_VIAETBF;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C597_0, quirk_viaetbf);

static void quirk_vsfx(struct pci_dev *dev)
{
	if ((pci_pci_problems & PCIPCI_VSFX) == 0) {
		pci_info(dev, "Limiting direct PCI/PCI transfers\n");
		pci_pci_problems |= PCIPCI_VSFX;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C576, quirk_vsfx);
/*
 * ALi Magik requires workarounds to be used by the drivers that DMA to AGP
 * space.  Latency must be set to 0xA and Triton workaround applied too.
 * [Info kindly provided by ALi]
 */
static void quirk_alimagik(struct pci_dev *dev)
{
	if ((pci_pci_problems & PCIPCI_ALIMAGIK) == 0) {
		pci_info(dev, "Limiting direct PCI/PCI transfers\n");
		pci_pci_problems |= PCIPCI_ALIMAGIK | PCIPCI_TRITON;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AL, PCI_DEVICE_ID_AL_M1647, quirk_alimagik);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AL, PCI_DEVICE_ID_AL_M1651, quirk_alimagik);

/* Natoma has some interesting boundary conditions with Zoran stuff at least */
static void quirk_natoma(struct pci_dev *dev)
{
	if ((pci_pci_problems & PCIPCI_NATOMA) == 0) {
		pci_info(dev, "Limiting direct PCI/PCI transfers\n");
		pci_pci_problems |= PCIPCI_NATOMA;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82441, quirk_natoma);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82443LX_0, quirk_natoma);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82443LX_1, quirk_natoma);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82443BX_0, quirk_natoma);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82443BX_1, quirk_natoma);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82443BX_2, quirk_natoma);

/*
 * This chip can cause PCI parity errors if config register 0xA0 is read
 * while DMAs are occurring.
 */
static void quirk_citrine(struct pci_dev *dev)
{
	dev->cfg_size = 0xA0;
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CITRINE, quirk_citrine);

/*
 * This chip can cause bus lockups if config addresses above 0x600
 * are read or written.
 */
static void quirk_nfp6000(struct pci_dev *dev)
{
	dev->cfg_size = 0x600;
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NETRONOME, PCI_DEVICE_ID_NETRONOME_NFP4000, quirk_nfp6000);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NETRONOME, PCI_DEVICE_ID_NETRONOME_NFP6000, quirk_nfp6000);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NETRONOME, PCI_DEVICE_ID_NETRONOME_NFP5000, quirk_nfp6000);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NETRONOME, PCI_DEVICE_ID_NETRONOME_NFP6000_VF, quirk_nfp6000);

/* On IBM Crocodile ipr SAS adapters, expand BAR to system page size */
static void quirk_extend_bar_to_page(struct pci_dev *dev)
{
	int i;

	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
		struct resource *r = &dev->resource[i];
		const char *r_name = pci_resource_name(dev, i);

		if (r->flags & IORESOURCE_MEM && resource_size(r) < PAGE_SIZE) {
			resource_set_range(r, 0, PAGE_SIZE);
			r->flags |= IORESOURCE_UNSET;
			pci_info(dev, "%s %pR: expanded to page size\n",
				 r_name, r);
		}
	}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, 0x034a, quirk_extend_bar_to_page);
/*
 * S3 868 and 968 chips report region size equal to 32M, but they decode 64M.
 * If it's needed, re-allocate the region.
 */
static void quirk_s3_64M(struct pci_dev *dev)
{
	struct resource *r = &dev->resource[0];

	if (!IS_ALIGNED(r->start, SZ_64M) || resource_size(r) != SZ_64M) {
		r->flags |= IORESOURCE_UNSET;
		resource_set_range(r, 0, SZ_64M);
	}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_868, quirk_s3_64M);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_968, quirk_s3_64M);

static void quirk_io(struct pci_dev *dev, int pos, unsigned int size,
		     const char *name)
{
	u32 region;
	struct pci_bus_region bus_region;
	struct resource *res = pci_resource_n(dev, pos);
	const char *res_name = pci_resource_name(dev, pos);

	pci_read_config_dword(dev, PCI_BASE_ADDRESS_0 + (pos << 2), &region);
	if (!region)
		return;

	res->name = pci_name(dev);
	res->flags = region & ~PCI_BASE_ADDRESS_IO_MASK;
	res->flags |=
		(IORESOURCE_IO | IORESOURCE_PCI_FIXED | IORESOURCE_SIZEALIGN);
	region &= ~(size - 1);

	/* Convert from PCI bus to resource space */
	bus_region.start = region;
	bus_region.end = region + size - 1;
	pcibios_bus_to_resource(dev->bus, res, &bus_region);

	pci_info(dev, FW_BUG "%s %pR: %s quirk\n", res_name, res, name);
}

/*
 * Some CS5536 BIOSes (for example, the Soekris NET5501 board w/ comBIOS
 * ver. 1.33 20070103) don't set the correct ISA PCI region header info.
 * BAR0 should be 8 bytes; instead, it may be set to something like 8k
 * (which conflicts w/ BAR1's memory range).
 *
 * CS553x's ISA PCI BARs may also be read-only (ref:
 * https://bugzilla.kernel.org/show_bug.cgi?id=85991 - Comment #4 forward).
 */
static void quirk_cs5536_vsa(struct pci_dev *dev)
{
	static char *name = "CS5536 ISA bridge";

	if (pci_resource_len(dev, 0) != 8) {
		quirk_io(dev, 0,   8, name);	/* SMB */
		quirk_io(dev, 1, 256, name);	/* GPIO */
		quirk_io(dev, 2,  64, name);	/* MFGPT */
		pci_info(dev, "%s bug detected (incorrect header); workaround applied\n",
			 name);
	}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CS5536_ISA, quirk_cs5536_vsa);

static void quirk_io_region(struct pci_dev *dev, int port,
			    unsigned int size, int nr, const char *name)
{
	u16 region;
	struct pci_bus_region bus_region;
	struct resource *res = pci_resource_n(dev, nr);

	pci_read_config_word(dev, port, &region);
	region &= ~(size - 1);

	if (!region)
		return;

	res->name = pci_name(dev);
	res->flags = IORESOURCE_IO;

	/* Convert from PCI bus to resource space */
	bus_region.start = region;
	bus_region.end = region + size - 1;
	pcibios_bus_to_resource(dev->bus, res, &bus_region);

	/*
	 * "res" is typically a bridge window resource that's not being
	 * used for a bridge window, so it's just a place to stash this
	 * non-standard resource.  Printing "nr" or pci_resource_name() of
	 * it doesn't really make sense.
	 */
	if (!pci_claim_resource(dev, nr))
		pci_info(dev, "quirk: %pR claimed by %s\n", res, name);
}

/*
 * The ATI Northbridge triggers a Machine Check Exception on the processor
 * if you even read anywhere between 0x3b0 and 0x3bb, or read 0x3d3.
 */
static void quirk_ati_exploding_mce(struct pci_dev *dev)
{
	pci_info(dev, "ATI Northbridge, reserving I/O ports 0x3b0 to 0x3bb\n");
	/* We must not look at these I/O locations */
	request_region(0x3b0, 0x0C, "RadeonIGP");
	request_region(0x3d3, 0x01, "RadeonIGP");
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RS100, quirk_ati_exploding_mce);
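
/*
 * Note (editorial illustration): request_region(0x3b0, 0x0C, ...) above
 * reserves the twelve ports 0x3b0-0x3bb (0x3b0 + 0x0C - 1 = 0x3bb),
 * matching the range called out in the comment.
 */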
/*
 * In the AMD NL platform, this device ([1022:7912]) has a class code of
 * PCI_CLASS_SERIAL_USB_XHCI (0x0c0330), which means the xhci driver will
 * claim it. The same applies on the VanGogh platform device ([1022:163a]).
 *
 * But the dwc3 driver is a more specific driver for this device, and we'd
 * prefer to use it instead of xhci. To prevent xhci from claiming the
 * device, change the class code to 0x0c03fe, which the PCI r3.0 spec
 * defines as "USB device (not host controller)". The dwc3 driver can then
 * claim it based on its Vendor and Device ID.
 */
static void quirk_amd_dwc_class(struct pci_dev *pdev)
{
	u32 class = pdev->class;

	if (class != PCI_CLASS_SERIAL_USB_DEVICE) {
		/* Use "USB Device (not host controller)" class */
		pdev->class = PCI_CLASS_SERIAL_USB_DEVICE;
		pci_info(pdev,
			 "PCI class overridden (%#08x -> %#08x) so dwc3 driver can claim this instead of xhci\n",
			 class, pdev->class);
	}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_NL_USB, quirk_amd_dwc_class);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_VANGOGH_USB, quirk_amd_dwc_class);

/*
 * Synopsys USB 3.x host HAPS platform has a class code of
 * PCI_CLASS_SERIAL_USB_XHCI, and xhci driver can claim it.  However, these
 * devices should use dwc3-haps driver.  Change these devices' class code to
 * PCI_CLASS_SERIAL_USB_DEVICE to prevent the xhci-pci driver from claiming
 * them.
 */
static void quirk_synopsys_haps(struct pci_dev *pdev)
{
	u32 class = pdev->class;

	switch (pdev->device) {
	case PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3:
	case PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3_AXI:
	case PCI_DEVICE_ID_SYNOPSYS_HAPSUSB31:
		pdev->class = PCI_CLASS_SERIAL_USB_DEVICE;
		pci_info(pdev,
			 "PCI class overridden (%#08x -> %#08x) so dwc3 driver can claim this instead of xhci\n",
			 class, pdev->class);
		break;
	}
}
DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_SYNOPSYS, PCI_ANY_ID,
			       PCI_CLASS_SERIAL_USB_XHCI, 0,
			       quirk_synopsys_haps);

/*
 * Let's make the southbridge information explicit instead of having to
 * worry about people probing the ACPI areas, for example.. (Yes, it
 * happens, and if you read the wrong ACPI register it will put the machine
 * to sleep with no way of waking it up again. Bummer).
 *
 * ALI M7101: Two IO regions pointed to by words at
 *	0xE0 (64 bytes of ACPI registers)
 *	0xE2 (32 bytes of SMB registers)
 */
static void quirk_ali7101_acpi(struct pci_dev *dev)
{
	quirk_io_region(dev, 0xE0, 64, PCI_BRIDGE_RESOURCES, "ali7101 ACPI");
	quirk_io_region(dev, 0xE2, 32, PCI_BRIDGE_RESOURCES+1, "ali7101 SMB");
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AL, PCI_DEVICE_ID_AL_M7101, quirk_ali7101_acpi);

static void piix4_io_quirk(struct pci_dev *dev, const char *name,
			   unsigned int port, unsigned int enable)
{
	u32 devres;
	u32 mask, size, base;

	pci_read_config_dword(dev, port, &devres);
	if ((devres & enable) != enable)
		return;
	mask = (devres >> 16) & 15;
	base = devres & 0xffff;
	size = 16;
	for (;;) {
		unsigned int bit = size >> 1;

		if ((bit & mask) == bit)
			break;
		size = bit;
	}
	/*
	 * For now we only print it out. Eventually we'll want to
	 * reserve it (at least if it's in the 0x1000+ range), but
	 * let's get enough confirmation reports first.
	 */
	base &= -size;
	pci_info(dev, "%s PIO at %04x-%04x\n", name, base, base + size - 1);
}
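
/*
 * Worked example (editorial illustration): the loop above halves "size"
 * until the probed bit is present in "mask", so a mask of 0b0100 stops
 * with size 8, while an empty mask decodes as size 1.
 */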
static void piix4_mem_quirk(struct pci_dev *dev, const char *name,
			    unsigned int port, unsigned int enable)
{
	u32 devres;
	u32 mask, size, base;

	pci_read_config_dword(dev, port, &devres);
	if ((devres & enable) != enable)
		return;
	base = devres & 0xffff0000;
	mask = (devres & 0x3f) << 16;
	size = 128 << 16;
	for (;;) {
		unsigned int bit = size >> 1;

		if ((bit & mask) == bit)
			break;
		size = bit;
	}
	/*
	 * For now we only print it out. Eventually we'll want to
	 * reserve it, but let's get enough confirmation reports first.
	 */
	base &= -size;
	pci_info(dev, "%s MMIO at %04x-%04x\n", name, base, base + size - 1);
}

/*
 * PIIX4 ACPI: Two IO regions pointed to by longwords at
 *	0x40 (64 bytes of ACPI registers)
 *	0x90 (16 bytes of SMB registers)
 * and a few strange programmable PIIX4 device resources.
 */
static void quirk_piix4_acpi(struct pci_dev *dev)
{
	u32 res_a;

	quirk_io_region(dev, 0x40, 64, PCI_BRIDGE_RESOURCES, "PIIX4 ACPI");
	quirk_io_region(dev, 0x90, 16, PCI_BRIDGE_RESOURCES+1, "PIIX4 SMB");

	/* Device resource A has enables for some of the other ones */
	pci_read_config_dword(dev, 0x5c, &res_a);

	piix4_io_quirk(dev, "PIIX4 devres B", 0x60, 3 << 21);
	piix4_io_quirk(dev, "PIIX4 devres C", 0x64, 3 << 21);

	/* Device resource D is just bitfields for static resources */

	/* Device 12 enabled? */
	if (res_a & (1 << 29)) {
		piix4_io_quirk(dev, "PIIX4 devres E", 0x68, 1 << 20);
		piix4_mem_quirk(dev, "PIIX4 devres F", 0x6c, 1 << 7);
	}
	/* Device 13 enabled? */
	if (res_a & (1 << 30)) {
		piix4_io_quirk(dev, "PIIX4 devres G", 0x70, 1 << 20);
		piix4_mem_quirk(dev, "PIIX4 devres H", 0x74, 1 << 7);
	}
	piix4_io_quirk(dev, "PIIX4 devres I", 0x78, 1 << 20);
	piix4_io_quirk(dev, "PIIX4 devres J", 0x7c, 1 << 20);
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371AB_3, quirk_piix4_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82443MX_3, quirk_piix4_acpi);

#define ICH_PMBASE	0x40
#define ICH_ACPI_CNTL	0x44
#define  ICH4_ACPI_EN	0x10
#define  ICH6_ACPI_EN	0x80
#define ICH4_GPIOBASE	0x58
#define ICH4_GPIO_CNTL	0x5c
#define  ICH4_GPIO_EN	0x10
#define ICH6_GPIOBASE	0x48
#define ICH6_GPIO_CNTL	0x4c
#define  ICH6_GPIO_EN	0x10
/*
 * ICH4, ICH4-M, ICH5, ICH5-M ACPI: Two IO regions pointed to by longwords at
 *	0x40 (128 bytes of ACPI, GPIO & TCO registers)
 *	0x58 (64 bytes of GPIO I/O space)
 */
static void quirk_ich4_lpc_acpi(struct pci_dev *dev)
{
	u8 enable;

	/*
	 * The check for PCIBIOS_MIN_IO is to ensure we won't create a conflict
	 * with low legacy (and fixed) ports. We don't know the decoding
	 * priority and can't tell whether the legacy device or the one created
	 * here is really at that address.  This happens on boards with broken
	 * BIOSes.
	 */
	pci_read_config_byte(dev, ICH_ACPI_CNTL, &enable);
	if (enable & ICH4_ACPI_EN)
		quirk_io_region(dev, ICH_PMBASE, 128, PCI_BRIDGE_RESOURCES,
				"ICH4 ACPI/GPIO/TCO");

	pci_read_config_byte(dev, ICH4_GPIO_CNTL, &enable);
	if (enable & ICH4_GPIO_EN)
		quirk_io_region(dev, ICH4_GPIOBASE, 64, PCI_BRIDGE_RESOURCES+1,
				"ICH4 GPIO");
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801AA_0, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801AB_0, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801BA_0, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801BA_10, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_0, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_12, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_0, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_12, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801EB_0, quirk_ich4_lpc_acpi);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB_1, quirk_ich4_lpc_acpi);

static void ich6_lpc_acpi_gpio(struct pci_dev *dev)
{
	u8 enable;

	pci_read_config_byte(dev, ICH_ACPI_CNTL, &enable);
	if (enable & ICH6_ACPI_EN)
		quirk_io_region(dev, ICH_PMBASE, 128, PCI_BRIDGE_RESOURCES,
				"ICH6 ACPI/GPIO/TCO");

	pci_read_config_byte(dev, ICH6_GPIO_CNTL, &enable);
	if (enable & ICH6_GPIO_EN)
		quirk_io_region(dev, ICH6_GPIOBASE, 64, PCI_BRIDGE_RESOURCES+1,
				"ICH6 GPIO");
}

static void ich6_lpc_generic_decode(struct pci_dev *dev, unsigned int reg,
				    const char *name, int dynsize)
{
	u32 val;
	u32 size, base;

	pci_read_config_dword(dev, reg, &val);

	/* Enabled? */
	if (!(val & 1))
		return;
	base = val & 0xfffc;
	if (dynsize) {
		/*
		 * This is not correct. It is 16, 32 or 64 bytes depending on
		 * register D31:F0:ADh bits 5:4.
		 *
		 * But this gets us at least _part_ of it.
		 */
		size = 16;
	} else {
		size = 128;
	}
	base &= ~(size-1);

	/*
	 * Just print it out for now. We should reserve it after more
	 * debugging.
	 */
	pci_info(dev, "%s PIO at %04x-%04x\n", name, base, base+size-1);
}

static void quirk_ich6_lpc(struct pci_dev *dev)
{
	/* Shared ACPI/GPIO decode with all ICH6+ */
	ich6_lpc_acpi_gpio(dev);

	/* ICH6-specific generic IO decode */
	ich6_lpc_generic_decode(dev, 0x84, "LPC Generic IO decode 1", 0);
	ich6_lpc_generic_decode(dev, 0x88, "LPC Generic IO decode 2", 1);
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_0, quirk_ich6_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_1, quirk_ich6_lpc);

static void ich7_lpc_generic_decode(struct pci_dev *dev, unsigned int reg,
				    const char *name)
{
	u32 val;
	u32 mask, base;

	pci_read_config_dword(dev, reg, &val);

	/* Enabled? */
	if (!(val & 1))
		return;

	/* IO base in bits 15:2, mask in bits 23:18, both are dword-based */
	base = val & 0xfffc;
	mask = (val >> 16) & 0xfc;
	mask |= 3;

	/*
	 * Just print it out for now. We should reserve it after more
	 * debugging.
	 */
	pci_info(dev, "%s PIO at %04x (mask %04x)\n", name, base, mask);
}
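
/*
 * Worked example (editorial illustration) for the decode above: a register
 * value of 0x00fc2a81 has bit 0 set (enabled), yields base 0x2a80 from
 * bits 15:2, and mask 0x00ff (0xfc from bits 23:18 plus the two low bits
 * OR-ed in).
 */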
/* ICH7-10 has the same common LPC generic IO decode registers */
static void quirk_ich7_lpc(struct pci_dev *dev)
{
	/* We share the common ACPI/GPIO decode with ICH6 */
	ich6_lpc_acpi_gpio(dev);

	/* And have 4 ICH7+ generic decodes */
	ich7_lpc_generic_decode(dev, 0x84, "ICH7 LPC Generic IO decode 1");
	ich7_lpc_generic_decode(dev, 0x88, "ICH7 LPC Generic IO decode 2");
	ich7_lpc_generic_decode(dev, 0x8c, "ICH7 LPC Generic IO decode 3");
	ich7_lpc_generic_decode(dev, 0x90, "ICH7 LPC Generic IO decode 4");
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_0, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_1, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_31, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH8_0, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH8_2, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH8_3, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH8_1, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH8_4, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH9_2, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH9_4, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH9_7, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH9_8, quirk_ich7_lpc);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH10_1, quirk_ich7_lpc);

/*
 * VIA ACPI: One IO region pointed to by longword at
 *	0x48 or 0x20 (256 bytes of ACPI registers)
 */
static void quirk_vt82c586_acpi(struct pci_dev *dev)
{
	if (dev->revision & 0x10)
		quirk_io_region(dev, 0x48, 256, PCI_BRIDGE_RESOURCES,
				"vt82c586 ACPI");
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C586_3, quirk_vt82c586_acpi);

/*
 * VIA VT82C686 ACPI: Three IO regions pointed to by (long)words at
 *	0x48 (256 bytes of ACPI registers)
 *	0x70 (128 bytes of hardware monitoring registers)
 *	0x90 (16 bytes of SMB registers)
 */
static void quirk_vt82c686_acpi(struct pci_dev *dev)
{
	quirk_vt82c586_acpi(dev);

	quirk_io_region(dev, 0x70, 128, PCI_BRIDGE_RESOURCES+1,
			"vt82c686 HW-mon");

	quirk_io_region(dev, 0x90, 16, PCI_BRIDGE_RESOURCES+2, "vt82c686 SMB");
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C686_4, quirk_vt82c686_acpi);

/*
 * VIA VT8235 ISA Bridge: Two IO regions pointed to by words at
 *	0x88 (128 bytes of power management registers)
 *	0xd0 (16 bytes of SMB registers)
 */
static void quirk_vt8235_acpi(struct pci_dev *dev)
{
	quirk_io_region(dev, 0x88, 128, PCI_BRIDGE_RESOURCES, "vt8235 PM");
	quirk_io_region(dev, 0xd0, 16, PCI_BRIDGE_RESOURCES+1, "vt8235 SMB");
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8235, quirk_vt8235_acpi);

/*
 * TI XIO2000a PCIe-PCI Bridge erroneously reports it supports fast
 * back-to-back: Disable fast back-to-back on the secondary bus segment
 */
static void quirk_xio2000a(struct pci_dev *dev)
{
	struct pci_dev *pdev;
	u16 command;

	pci_warn(dev, "TI XIO2000a quirk detected; secondary bus fast back-to-back transfers disabled\n");
	list_for_each_entry(pdev, &dev->subordinate->devices, bus_list) {
		pci_read_config_word(pdev, PCI_COMMAND, &command);
		if (command & PCI_COMMAND_FAST_BACK)
			pci_write_config_word(pdev, PCI_COMMAND,
					      command & ~PCI_COMMAND_FAST_BACK);
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_TI, PCI_DEVICE_ID_TI_XIO2000A,
			quirk_xio2000a);
#ifdef CONFIG_X86_IO_APIC
#include <asm/io_apic.h>

/*
 * VIA 686A/B: If an IO-APIC is active, we need to route all on-chip
 * devices to the external APIC.
 *
 * TODO: When we have device-specific interrupt routers, this code will go
 * away from quirks.
 */
static void quirk_via_ioapic(struct pci_dev *dev)
{
	u8 tmp;

	if (nr_ioapics < 1)
		tmp = 0;	/* nothing routed to external APIC */
	else
		tmp = 0x1f;	/* all known bits (4-0) routed to external APIC */

	pci_info(dev, "%s VIA external APIC routing\n",
		 tmp ? "Enabling" : "Disabling");

	/* Offset 0x58: External APIC IRQ output control */
	pci_write_config_byte(dev, 0x58, tmp);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C686, quirk_via_ioapic);
DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C686, quirk_via_ioapic);

/*
 * VIA 8237: Some BIOSes don't set the 'Bypass APIC De-Assert Message' Bit.
 * This leads to doubled level interrupt rates.
 * Set this bit to get rid of cycle wastage.
 * Otherwise uncritical.
 */
static void quirk_via_vt8237_bypass_apic_deassert(struct pci_dev *dev)
{
	u8 misc_control2;
#define BYPASS_APIC_DEASSERT 8

	pci_read_config_byte(dev, 0x5B, &misc_control2);
	if (!(misc_control2 & BYPASS_APIC_DEASSERT)) {
		pci_info(dev, "Bypassing VIA 8237 APIC De-Assert Message\n");
		pci_write_config_byte(dev, 0x5B, misc_control2|BYPASS_APIC_DEASSERT);
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8237, quirk_via_vt8237_bypass_apic_deassert);
DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8237, quirk_via_vt8237_bypass_apic_deassert);

/*
 * The AMD IO-APIC can hang the box when an APIC IRQ is masked.
 * We check all revs >= B0 (yet not in the pre production!) as the bug
 * is currently marked NoFix
 *
 * We have multiple reports of hangs with this chipset that went away with
 * noapic specified. For the moment we assume it's the erratum. We may be
 * wrong of course. However the advice is demonstrably good even if so.
 */
static void quirk_amd_ioapic(struct pci_dev *dev)
{
	if (dev->revision >= 0x02) {
		pci_warn(dev, "I/O APIC: AMD Erratum #22 may be present. In the event of instability try\n");
		pci_warn(dev, "        : booting with the \"noapic\" option\n");
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_VIPER_7410, quirk_amd_ioapic);
#endif /* CONFIG_X86_IO_APIC */

#if defined(CONFIG_ARM64) && defined(CONFIG_PCI_ATS)

static void quirk_cavium_sriov_rnm_link(struct pci_dev *dev)
{
	/* Fix for improper SR-IOV configuration on Cavium cn88xx RNM device */
	if (dev->subsystem_device == 0xa118)
		dev->sriov->link = dev->devfn;
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CAVIUM, 0xa018, quirk_cavium_sriov_rnm_link);
#endif

/*
 * Some settings of MMRBC can lead to data corruption so block changes.
 * See AMD 8131 HyperTransport PCI-X Tunnel Revision Guide
 */
static void quirk_amd_8131_mmrbc(struct pci_dev *dev)
{
	if (dev->subordinate && dev->revision <= 0x12) {
		pci_info(dev, "AMD8131 rev %x detected; disabling PCI-X MMRBC\n",
			 dev->revision);
		dev->subordinate->bus_flags |= PCI_BUS_FLAGS_NO_MMRBC;
	}
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_amd_8131_mmrbc);
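
/*
 * Note (editorial assumption about code outside this file): the
 * PCI_BUS_FLAGS_NO_MMRBC flag set above is checked by pcix_set_mmrbc(),
 * which refuses to raise the maximum memory read byte count on buses
 * carrying the flag.
 */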
Therefore it seems like setting the pci_dev's IRQ to the value * of the ACPI SCI interrupt is only done for convenience. * -jgarzik */ static void quirk_via_acpi(struct pci_dev *d) { u8 irq; /* VIA ACPI device: SCI IRQ line in PCI config byte 0x42 */ pci_read_config_byte(d, 0x42, &irq); irq &= 0xf; if (irq && (irq != 2)) d->irq = irq; } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C586_3, quirk_via_acpi); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C686_4, quirk_via_acpi); /* VIA bridges which have VLink */ static int via_vlink_dev_lo = -1, via_vlink_dev_hi = 18; static void quirk_via_bridge(struct pci_dev *dev) { /* See what bridge we have and find the device ranges */ switch (dev->device) { case PCI_DEVICE_ID_VIA_82C686: /* * The VT82C686 is special; it attaches to PCI and can have * any device number. All its subdevices are functions of * that single device. */ via_vlink_dev_lo = PCI_SLOT(dev->devfn); via_vlink_dev_hi = PCI_SLOT(dev->devfn); break; case PCI_DEVICE_ID_VIA_8237: case PCI_DEVICE_ID_VIA_8237A: via_vlink_dev_lo = 15; break; case PCI_DEVICE_ID_VIA_8235: via_vlink_dev_lo = 16; break; case PCI_DEVICE_ID_VIA_8231: case PCI_DEVICE_ID_VIA_8233_0: case PCI_DEVICE_ID_VIA_8233A: case PCI_DEVICE_ID_VIA_8233C_0: via_vlink_dev_lo = 17; break; } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C686, quirk_via_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8231, quirk_via_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8233_0, quirk_via_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8233A, quirk_via_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8233C_0, quirk_via_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8235, quirk_via_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8237, quirk_via_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8237A, quirk_via_bridge); /* * quirk_via_vlink - VIA VLink IRQ number update * @dev: PCI device * * If the device we are dealing with is on a PIC IRQ we need to ensure that * the IRQ line register which usually is not relevant for PCI cards, is * actually written so that interrupts get sent to the right place. * * We only do this on systems where a VIA south bridge was detected, and * only for VIA devices on the motherboard (see quirk_via_bridge above). */ static void quirk_via_vlink(struct pci_dev *dev) { u8 irq, new_irq; /* Check if we have VLink at all */ if (via_vlink_dev_lo == -1) return; new_irq = dev->irq; /* Don't quirk interrupts outside the legacy IRQ range */ if (!new_irq || new_irq > 15) return; /* Internal device ? */ if (dev->bus->number != 0 || PCI_SLOT(dev->devfn) > via_vlink_dev_hi || PCI_SLOT(dev->devfn) < via_vlink_dev_lo) return; /* * This is an internal VLink device on a PIC interrupt. The BIOS * ought to have set this but may not have, so we redo it. */ pci_read_config_byte(dev, PCI_INTERRUPT_LINE, &irq); if (new_irq != irq) { pci_info(dev, "VIA VLink IRQ fixup, from %d to %d\n", irq, new_irq); udelay(15); /* unknown if delay really needed */ pci_write_config_byte(dev, PCI_INTERRUPT_LINE, new_irq); } } DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_VIA, PCI_ANY_ID, quirk_via_vlink); /* * VIA VT82C598 has its device ID settable and many BIOSes set it to the ID * of VT82C597 for backward compatibility. We need to switch it off to be * able to recognize the real type of the chip. 
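 * Writing 0 to config register 0xfc switches the compatibility ID off; the
 * real device ID is then re-read into dev->device below.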
*/ static void quirk_vt82c598_id(struct pci_dev *dev) { pci_write_config_byte(dev, 0xfc, 0); pci_read_config_word(dev, PCI_DEVICE_ID, &dev->device); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_82C597_0, quirk_vt82c598_id); /* * CardBus controllers have a legacy base address that enables them to * respond as i82365 pcmcia controllers. We don't want them to do this * even if the Linux CardBus driver is not loaded, because the Linux i82365 * driver does not (and should not) handle CardBus. */ static void quirk_cardbus_legacy(struct pci_dev *dev) { pci_write_config_dword(dev, PCI_CB_LEGACY_MODE_BASE, 0); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_BRIDGE_CARDBUS, 8, quirk_cardbus_legacy); DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_BRIDGE_CARDBUS, 8, quirk_cardbus_legacy); /* * Following the PCI ordering rules is optional on the AMD762. I'm not sure * what the designers were smoking but let's not inhale... * * To be fair to AMD, it follows the spec by default; it's BIOS people who * turn it off! */ static void quirk_amd_ordering(struct pci_dev *dev) { u32 pcic; pci_read_config_dword(dev, 0x4C, &pcic); if ((pcic & 6) != 6) { pcic |= 6; pci_warn(dev, "BIOS failed to enable PCI standards compliance; fixing this error\n"); pci_write_config_dword(dev, 0x4C, pcic); pci_read_config_dword(dev, 0x84, &pcic); pcic |= (1 << 23); /* Required in this mode */ pci_write_config_dword(dev, 0x84, pcic); } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_FE_GATE_700C, quirk_amd_ordering); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_FE_GATE_700C, quirk_amd_ordering); /* * DreamWorks-provided workaround for Dunord I-3000 problem * * This card decodes and responds to addresses not apparently assigned to * it. We force a larger allocation to ensure that nothing gets put too * close to it. */ static void quirk_dunord(struct pci_dev *dev) { struct resource *r = &dev->resource[1]; r->flags |= IORESOURCE_UNSET; resource_set_range(r, 0, SZ_16M); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_DUNORD, PCI_DEVICE_ID_DUNORD_I3000, quirk_dunord); /* * i82380FB mobile docking controller: its PCI-to-PCI bridge is subtractive * decoding (transparent), and does indicate this in the ProgIf. * Unfortunately, the ProgIf value is wrong - 0x80 instead of 0x01. */ static void quirk_transparent_bridge(struct pci_dev *dev) { dev->transparent = 1; } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82380FB, quirk_transparent_bridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA, 0x605, quirk_transparent_bridge); /* * Common misconfiguration of the MediaGX/Geode PCI master that will reduce * PCI bandwidth from 70MB/s to 25MB/s. See the GXM/GXLV/GX1 datasheets * found at http://www.national.com/analog for info on what these bits do. * <christer@weinigel.se> */ static void quirk_mediagx_master(struct pci_dev *dev) { u8 reg; pci_read_config_byte(dev, 0x41, &reg); if (reg & 2) { reg &= ~2; pci_info(dev, "Fixup for MediaGX/Geode Slave Disconnect Boundary (0x41=0x%02x)\n", reg); pci_write_config_byte(dev, 0x41, reg); } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CYRIX, PCI_DEVICE_ID_CYRIX_PCI_MASTER, quirk_mediagx_master); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_CYRIX, PCI_DEVICE_ID_CYRIX_PCI_MASTER, quirk_mediagx_master); /* * Ensure C0 rev restreaming is off. This is normally done by the BIOS but * in the odd case it is not, the result is corruption, hence the presence * of a Linux check.
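 * Restreaming is controlled by bit 6 of the config word at 0x40 on the
 * 450NX PXB; only the C0 stepping (revision 0x04) needs the bit cleared.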
*/ static void quirk_disable_pxb(struct pci_dev *pdev) { u16 config; if (pdev->revision != 0x04) /* Only C0 requires this */ return; pci_read_config_word(pdev, 0x40, &config); if (config & (1<<6)) { config &= ~(1<<6); pci_write_config_word(pdev, 0x40, config); pci_info(pdev, "C0 revision 450NX. Disabling PCI restreaming\n"); } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82454NX, quirk_disable_pxb); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82454NX, quirk_disable_pxb); static void quirk_amd_ide_mode(struct pci_dev *pdev) { /* set SBX00/Hudson-2 SATA in IDE mode to AHCI mode */ u8 tmp; pci_read_config_byte(pdev, PCI_CLASS_DEVICE, &tmp); if (tmp == 0x01) { pci_read_config_byte(pdev, 0x40, &tmp); pci_write_config_byte(pdev, 0x40, tmp|1); pci_write_config_byte(pdev, 0x9, 1); pci_write_config_byte(pdev, 0xa, 6); pci_write_config_byte(pdev, 0x40, tmp); pdev->class = PCI_CLASS_STORAGE_SATA_AHCI; pci_info(pdev, "set SATA to AHCI mode\n"); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP600_SATA, quirk_amd_ide_mode); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP600_SATA, quirk_amd_ide_mode); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP700_SATA, quirk_amd_ide_mode); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP700_SATA, quirk_amd_ide_mode); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_HUDSON2_SATA_IDE, quirk_amd_ide_mode); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_HUDSON2_SATA_IDE, quirk_amd_ide_mode); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x7900, quirk_amd_ide_mode); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_AMD, 0x7900, quirk_amd_ide_mode); /* Serverworks CSB5 IDE does not fully support native mode */ static void quirk_svwks_csb5ide(struct pci_dev *pdev) { u8 prog; pci_read_config_byte(pdev, PCI_CLASS_PROG, &prog); if (prog & 5) { prog &= ~5; pdev->class &= ~5; pci_write_config_byte(pdev, PCI_CLASS_PROG, prog); /* PCI layer will sort out resources */ } } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_CSB5IDE, quirk_svwks_csb5ide); /* Intel 82801CAM ICH3-M datasheet says IDE modes must be the same */ static void quirk_ide_samemode(struct pci_dev *pdev) { u8 prog; pci_read_config_byte(pdev, PCI_CLASS_PROG, &prog); if (((prog & 1) && !(prog & 4)) || ((prog & 4) && !(prog & 1))) { pci_info(pdev, "IDE mode mismatch; forcing legacy mode\n"); prog &= ~5; pdev->class &= ~5; pci_write_config_byte(pdev, PCI_CLASS_PROG, prog); } } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_10, quirk_ide_samemode); /* Some ATA devices break if put into D3 */ static void quirk_no_ata_d3(struct pci_dev *pdev) { pdev->dev_flags |= PCI_DEV_FLAGS_NO_D3; } /* Quirk the legacy ATA devices only. The AHCI ones are ok */ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_ANY_ID, PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_ATI, PCI_ANY_ID, PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); /* ALi loses some register settings that we cannot then restore */ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID, PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); /* VIA comes back fine but we need to keep it alive or ACPI GTM failures occur when mode detecting */ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID, PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); /* * This was originally an Alpha-specific thing, but it really fits here. 
* The i82375 PCI/EISA bridge appears as non-classified. Fix that. */ static void quirk_eisa_bridge(struct pci_dev *dev) { dev->class = PCI_CLASS_BRIDGE_EISA << 8; } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82375, quirk_eisa_bridge); /* * On ASUS P4B boards, the SMBus PCI Device within the ICH2/4 southbridge * is not activated. The myth is that Asus said that they do not want the * users to be irritated by just another PCI Device in the Win98 device * manager. (see the file prog/hotplug/README.p4b in the lm_sensors * package 2.7.0 for details) * * The SMBus PCI Device can be activated by setting a bit in the ICH LPC * bridge. Unfortunately, this device has no subvendor/subdevice ID. So it * becomes necessary to do this tweak in two steps -- the chosen trigger * is either the Host bridge (preferred) or on-board VGA controller. * * Note that we used to unhide the SMBus that way on Toshiba laptops * (Satellite A40 and Tecra M2) but then found that the thermal management * was done by SMM code, which could cause unsynchronized concurrent * accesses to the SMBus registers, with potentially bad effects. Thus you * should be very careful when adding new entries: if SMM is accessing the * Intel SMBus, this is a very good reason to leave it hidden. * * Likewise, many recent laptops use ACPI for thermal management. If the * ACPI DSDT code accesses the SMBus, then Linux should not access it * natively, and keeping the SMBus hidden is the right thing to do. If you * are about to add an entry in the table below, please first disassemble * the DSDT and double-check that there is no code accessing the SMBus. */ static int asus_hides_smbus; static void asus_hides_smbus_hostbridge(struct pci_dev *dev) { if (unlikely(dev->subsystem_vendor == PCI_VENDOR_ID_ASUSTEK)) { if (dev->device == PCI_DEVICE_ID_INTEL_82845_HB) switch (dev->subsystem_device) { case 0x8025: /* P4B-LX */ case 0x8070: /* P4B */ case 0x8088: /* P4B533 */ case 0x1626: /* L3C notebook */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82845G_HB) switch (dev->subsystem_device) { case 0x80b1: /* P4GE-V */ case 0x80b2: /* P4PE */ case 0x8093: /* P4B533-V */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82850_HB) switch (dev->subsystem_device) { case 0x8030: /* P4T533 */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_7205_0) switch (dev->subsystem_device) { case 0x8070: /* P4G8X Deluxe */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_E7501_MCH) switch (dev->subsystem_device) { case 0x80c9: /* PU-DLS */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82855GM_HB) switch (dev->subsystem_device) { case 0x1751: /* M2N notebook */ case 0x1821: /* M5N notebook */ case 0x1897: /* A6L notebook */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82855PM_HB) switch (dev->subsystem_device) { case 0x184b: /* W1N notebook */ case 0x186a: /* M6Ne notebook */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82865_HB) switch (dev->subsystem_device) { case 0x80f2: /* P4P800-X */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82915GM_HB) switch (dev->subsystem_device) { case 0x1882: /* M6V notebook */ case 0x1977: /* A6VA notebook */ asus_hides_smbus = 1; } } else if (unlikely(dev->subsystem_vendor == PCI_VENDOR_ID_HP)) { if (dev->device == PCI_DEVICE_ID_INTEL_82855PM_HB) switch (dev->subsystem_device) { case 0x088C: /* HP Compaq nc8000 */ case 0x0890: /* HP Compaq nc6000 */ 
asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82865_HB) switch (dev->subsystem_device) { case 0x12bc: /* HP D330L */ case 0x12bd: /* HP D530 */ case 0x006a: /* HP Compaq nx9500 */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82875_HB) switch (dev->subsystem_device) { case 0x12bf: /* HP xw4100 */ asus_hides_smbus = 1; } } else if (unlikely(dev->subsystem_vendor == PCI_VENDOR_ID_SAMSUNG)) { if (dev->device == PCI_DEVICE_ID_INTEL_82855PM_HB) switch (dev->subsystem_device) { case 0xC00C: /* Samsung P35 notebook */ asus_hides_smbus = 1; } } else if (unlikely(dev->subsystem_vendor == PCI_VENDOR_ID_COMPAQ)) { if (dev->device == PCI_DEVICE_ID_INTEL_82855PM_HB) switch (dev->subsystem_device) { case 0x0058: /* Compaq Evo N620c */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82810_IG3) switch (dev->subsystem_device) { case 0xB16C: /* Compaq Deskpro EP 401963-001 (PCA# 010174) */ /* Motherboard doesn't have Host bridge * subvendor/subdevice IDs, therefore checking * its on-board VGA controller */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82801DB_2) switch (dev->subsystem_device) { case 0x00b8: /* Compaq Evo D510 CMT */ case 0x00b9: /* Compaq Evo D510 SFF */ case 0x00ba: /* Compaq Evo D510 USDT */ /* Motherboard doesn't have Host bridge * subvendor/subdevice IDs and on-board VGA * controller is disabled if an AGP card is * inserted, therefore checking USB UHCI * Controller #1 */ asus_hides_smbus = 1; } else if (dev->device == PCI_DEVICE_ID_INTEL_82815_CGC) switch (dev->subsystem_device) { case 0x001A: /* Compaq Deskpro EN SSF P667 815E */ /* Motherboard doesn't have host bridge * subvendor/subdevice IDs, therefore checking * its on-board VGA controller */ asus_hides_smbus = 1; } } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82845_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82845G_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82850_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82865_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82875_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_7205_0, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7501_MCH, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82855PM_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82855GM_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82915GM_HB, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82810_IG3, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_2, asus_hides_smbus_hostbridge); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82815_CGC, asus_hides_smbus_hostbridge); static void asus_hides_smbus_lpc(struct pci_dev *dev) { u16 val; if (likely(!asus_hides_smbus)) return; pci_read_config_word(dev, 0xF2, &val); if (val & 0x8) { pci_write_config_word(dev, 0xF2, val & (~0x8)); pci_read_config_word(dev, 0xF2, &val); if (val & 0x8) pci_info(dev, "i801 SMBus device continues to play 'hide and seek'! 
0x%x\n", val); else pci_info(dev, "Enabled i801 SMBus device\n"); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801AA_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801BA_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_12, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_12, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801EB_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801AA_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801BA_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_0, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_12, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801DB_12, asus_hides_smbus_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801EB_0, asus_hides_smbus_lpc); /* It appears we just have one such device. If not, we have a warning */ static void __iomem *asus_rcba_base; static void asus_hides_smbus_lpc_ich6_suspend(struct pci_dev *dev) { u32 rcba; if (likely(!asus_hides_smbus)) return; WARN_ON(asus_rcba_base); pci_read_config_dword(dev, 0xF0, &rcba); /* use bits 31:14, 16 kB aligned */ asus_rcba_base = ioremap(rcba & 0xFFFFC000, 0x4000); if (asus_rcba_base == NULL) return; } static void asus_hides_smbus_lpc_ich6_resume_early(struct pci_dev *dev) { u32 val; if (likely(!asus_hides_smbus || !asus_rcba_base)) return; /* read the Function Disable register, dword mode only */ val = readl(asus_rcba_base + 0x3418); /* enable the SMBus device */ writel(val & 0xFFFFFFF7, asus_rcba_base + 0x3418); } static void asus_hides_smbus_lpc_ich6_resume(struct pci_dev *dev) { if (likely(!asus_hides_smbus || !asus_rcba_base)) return; iounmap(asus_rcba_base); asus_rcba_base = NULL; pci_info(dev, "Enabled ICH6/i801 SMBus device\n"); } static void asus_hides_smbus_lpc_ich6(struct pci_dev *dev) { asus_hides_smbus_lpc_ich6_suspend(dev); asus_hides_smbus_lpc_ich6_resume_early(dev); asus_hides_smbus_lpc_ich6_resume(dev); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_1, asus_hides_smbus_lpc_ich6); DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_1, asus_hides_smbus_lpc_ich6_suspend); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_1, asus_hides_smbus_lpc_ich6_resume); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH6_1, asus_hides_smbus_lpc_ich6_resume_early); /* SiS 96x south bridge: BIOS typically hides SMBus device... 
*/ static void quirk_sis_96x_smbus(struct pci_dev *dev) { u8 val = 0; pci_read_config_byte(dev, 0x77, &val); if (val & 0x10) { pci_info(dev, "Enabling SiS 96x SMBus\n"); pci_write_config_byte(dev, 0x77, val & ~0x10); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_961, quirk_sis_96x_smbus); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_962, quirk_sis_96x_smbus); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_963, quirk_sis_96x_smbus); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_LPC, quirk_sis_96x_smbus); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_961, quirk_sis_96x_smbus); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_962, quirk_sis_96x_smbus); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_963, quirk_sis_96x_smbus); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_LPC, quirk_sis_96x_smbus); /* * ... This is further complicated by the fact that some SiS96x south * bridges pretend to be 85C503/5513 instead. In that case see if we * spotted a compatible north bridge to make sure. * (pci_find_device() doesn't work yet) * * We can also enable the sis96x bit in the discovery register. */ #define SIS_DETECT_REGISTER 0x40 static void quirk_sis_503(struct pci_dev *dev) { u8 reg; u16 devid; pci_read_config_byte(dev, SIS_DETECT_REGISTER, &reg); pci_write_config_byte(dev, SIS_DETECT_REGISTER, reg | (1 << 6)); pci_read_config_word(dev, PCI_DEVICE_ID, &devid); if (((devid & 0xfff0) != 0x0960) && (devid != 0x0018)) { pci_write_config_byte(dev, SIS_DETECT_REGISTER, reg); return; } /* * Ok, it now shows up as a 96x. Run the 96x quirk by hand in case * it has already been processed. (Depends on link order, which is * apparently not guaranteed) */ dev->device = devid; quirk_sis_96x_smbus(dev); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_503, quirk_sis_503); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_503, quirk_sis_503); /* * On ASUS A8V and A8V Deluxe boards, the onboard AC97 audio controller * and MC97 modem controller are disabled when a second PCI soundcard is * present. This patch, tweaking the VT8237 ISA bridge, enables them. * -- bjd */ static void asus_hides_ac97_lpc(struct pci_dev *dev) { u8 val; int asus_hides_ac97 = 0; if (likely(dev->subsystem_vendor == PCI_VENDOR_ID_ASUSTEK)) { if (dev->device == PCI_DEVICE_ID_VIA_8237) asus_hides_ac97 = 1; } if (!asus_hides_ac97) return; pci_read_config_byte(dev, 0x50, &val); if (val & 0xc0) { pci_write_config_byte(dev, 0x50, val & (~0xc0)); pci_read_config_byte(dev, 0x50, &val); if (val & 0xc0) pci_info(dev, "Onboard AC97/MC97 devices continue to play 'hide and seek'! 0x%x\n", val); else pci_info(dev, "Enabled onboard AC97/MC97 devices\n"); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8237, asus_hides_ac97_lpc); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8237, asus_hides_ac97_lpc); #if defined(CONFIG_ATA) || defined(CONFIG_ATA_MODULE) /* * If we are using libata we can drive this chip properly but must do this * early on to make the additional device appear during the PCI scanning.
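 * The JMicron AHCI/IDE functions are selected by two config dwords at 0x40
 * and 0x80; the quirk below rewrites them per device ID and then refreshes
 * the cached header type and class code.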
*/ static void quirk_jmicron_ata(struct pci_dev *pdev) { u32 conf1, conf5, class; u8 hdr; /* Only poke fn 0 */ if (PCI_FUNC(pdev->devfn)) return; pci_read_config_dword(pdev, 0x40, &conf1); pci_read_config_dword(pdev, 0x80, &conf5); conf1 &= ~0x00CFF302; /* Clear bit 1, 8, 9, 12-19, 22, 23 */ conf5 &= ~(1 << 24); /* Clear bit 24 */ switch (pdev->device) { case PCI_DEVICE_ID_JMICRON_JMB360: /* SATA single port */ case PCI_DEVICE_ID_JMICRON_JMB362: /* SATA dual ports */ case PCI_DEVICE_ID_JMICRON_JMB364: /* SATA dual ports */ /* The controller should be in single function ahci mode */ conf1 |= 0x0002A100; /* Set 8, 13, 15, 17 */ break; case PCI_DEVICE_ID_JMICRON_JMB365: case PCI_DEVICE_ID_JMICRON_JMB366: /* Redirect IDE second PATA port to the right spot */ conf5 |= (1 << 24); fallthrough; case PCI_DEVICE_ID_JMICRON_JMB361: case PCI_DEVICE_ID_JMICRON_JMB363: case PCI_DEVICE_ID_JMICRON_JMB369: /* Enable dual function mode, AHCI on fn 0, IDE fn1 */ /* Set the class codes correctly and then direct IDE 0 */ conf1 |= 0x00C2A1B3; /* Set 0, 1, 4, 5, 7, 8, 13, 15, 17, 22, 23 */ break; case PCI_DEVICE_ID_JMICRON_JMB368: /* The controller should be in single function IDE mode */ conf1 |= 0x00C00000; /* Set 22, 23 */ break; } pci_write_config_dword(pdev, 0x40, conf1); pci_write_config_dword(pdev, 0x80, conf5); /* Update pdev accordingly */ pci_read_config_byte(pdev, PCI_HEADER_TYPE, &hdr); pdev->hdr_type = hdr & PCI_HEADER_TYPE_MASK; pdev->multifunction = FIELD_GET(PCI_HEADER_TYPE_MFD, hdr); pci_read_config_dword(pdev, PCI_CLASS_REVISION, &class); pdev->class = class >> 8; } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB360, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB361, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB362, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB363, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB364, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB365, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB366, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB368, quirk_jmicron_ata); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB369, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB360, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB361, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB362, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB363, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB364, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB365, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB366, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB368, quirk_jmicron_ata); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB369, quirk_jmicron_ata); #endif static void quirk_jmicron_async_suspend(struct pci_dev *dev) { if (dev->multifunction) { device_disable_async_suspend(&dev->dev); pci_info(dev, "async 
suspend disabled to avoid multi-function power-on ordering issue\n"); } } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_CLASS_STORAGE_IDE, 8, quirk_jmicron_async_suspend); DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_CLASS_STORAGE_SATA_AHCI, 0, quirk_jmicron_async_suspend); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, 0x2362, quirk_jmicron_async_suspend); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, 0x236f, quirk_jmicron_async_suspend); #ifdef CONFIG_X86_IO_APIC static void quirk_alder_ioapic(struct pci_dev *pdev) { int i; if ((pdev->class >> 8) != 0xff00) return; /* * The first BAR is the location of the IO-APIC... we must * not touch this (and it's already covered by the fixmap), so * forcibly insert it into the resource tree. */ if (pci_resource_start(pdev, 0) && pci_resource_len(pdev, 0)) insert_resource(&iomem_resource, &pdev->resource[0]); /* * The next five BARs all seem to be rubbish, so just clean * them out. */ for (i = 1; i < PCI_STD_NUM_BARS; i++) memset(&pdev->resource[i], 0, sizeof(pdev->resource[i])); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_EESSC, quirk_alder_ioapic); #endif static void quirk_no_msi(struct pci_dev *dev) { pci_info(dev, "avoiding MSI to work around a hardware defect\n"); dev->no_msi = 1; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4386, quirk_no_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4387, quirk_no_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4388, quirk_no_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4389, quirk_no_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x438a, quirk_no_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x438b, quirk_no_msi); static void quirk_pcie_mch(struct pci_dev *pdev) { pdev->no_msi = 1; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7520_MCH, quirk_pcie_mch); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7320_MCH, quirk_pcie_mch); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7525_MCH, quirk_pcie_mch); DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_HUAWEI, 0x1610, PCI_CLASS_BRIDGE_PCI, 8, quirk_pcie_mch); /* * HiSilicon KunPeng920 and KunPeng930 have devices appear as PCI but are * actually on the AMBA bus. These fake PCI devices can support SVA via * SMMU stall feature, by setting dma-can-stall for ACPI platforms. * * Normally stalling must not be enabled for PCI devices, since it would * break the PCI requirement for free-flowing writes and may lead to * deadlock. We expect PCI devices to support ATS and PRI if they want to * be fault-tolerant, so there's no ACPI binding to describe anything else, * even when a "PCI" device turns out to be a regular old SoC device * dressed up as a RCiEP and normal rules don't apply. */ static void quirk_huawei_pcie_sva(struct pci_dev *pdev) { struct property_entry properties[] = { PROPERTY_ENTRY_BOOL("dma-can-stall"), {}, }; if (pdev->revision != 0x21 && pdev->revision != 0x30) return; pdev->pasid_no_tlp = 1; /* * Set the dma-can-stall property on ACPI platforms. Device tree * can set it directly. 
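 * The software node is managed, so its lifetime is tied to the struct
 * device and no explicit cleanup is required here.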
*/ if (!pdev->dev.of_node && device_create_managed_software_node(&pdev->dev, properties, NULL)) pci_warn(pdev, "could not add stall property"); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0xa255, quirk_huawei_pcie_sva); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0xa256, quirk_huawei_pcie_sva); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0xa258, quirk_huawei_pcie_sva); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0xa259, quirk_huawei_pcie_sva); /* * It's possible for the MSI to get corrupted if SHPC and ACPI are used * together on certain PXH-based systems. */ static void quirk_pcie_pxh(struct pci_dev *dev) { dev->no_msi = 1; pci_warn(dev, "PXH quirk detected; SHPC device MSI disabled\n"); } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXHD_0, quirk_pcie_pxh); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXHD_1, quirk_pcie_pxh); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_0, quirk_pcie_pxh); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_1, quirk_pcie_pxh); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXHV, quirk_pcie_pxh); /* * Some Intel PCI Express chipsets have trouble with downstream device * power management. */ static void quirk_intel_pcie_pm(struct pci_dev *dev) { pci_pm_d3hot_delay = 120; dev->no_d1d2 = 1; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25e2, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25e3, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25e4, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25e5, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25e6, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25e7, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25f7, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25f8, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25f9, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x25fa, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2601, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2602, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2603, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2604, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2605, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2606, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2607, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2608, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2609, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x260a, quirk_intel_pcie_pm); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x260b, quirk_intel_pcie_pm); static void quirk_d3hot_delay(struct pci_dev *dev, unsigned int delay) { if (dev->d3hot_delay >= delay) return; dev->d3hot_delay = delay; pci_info(dev, "extending delay after power-on from D3hot to %d msec\n", dev->d3hot_delay); } static void quirk_radeon_pm(struct pci_dev *dev) { if (dev->subsystem_vendor == PCI_VENDOR_ID_APPLE && dev->subsystem_device == 0x00e2) quirk_d3hot_delay(dev, 20); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6741, 
quirk_radeon_pm); /* * NVIDIA Ampere-based HDA controllers can wedge the whole device if a bus * reset is performed too soon after transition to D0, extend d3hot_delay * to previous effective default for all NVIDIA HDA controllers. */ static void quirk_nvidia_hda_pm(struct pci_dev *dev) { quirk_d3hot_delay(dev, 20); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_nvidia_hda_pm); /* * Ryzen5/7 XHCI controllers fail upon resume from runtime suspend or s2idle. * https://bugzilla.kernel.org/show_bug.cgi?id=205587 * * The kernel attempts to transition these devices to D3cold, but that seems * to be ineffective on the platforms in question; the PCI device appears to * remain on in D3hot state. The D3hot-to-D0 transition then requires an * extended delay in order to succeed. */ static void quirk_ryzen_xhci_d3hot(struct pci_dev *dev) { quirk_d3hot_delay(dev, 20); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e0, quirk_ryzen_xhci_d3hot); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e1, quirk_ryzen_xhci_d3hot); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1639, quirk_ryzen_xhci_d3hot); #ifdef CONFIG_X86_IO_APIC static int dmi_disable_ioapicreroute(const struct dmi_system_id *d) { noioapicreroute = 1; pr_info("%s detected: disable boot interrupt reroute\n", d->ident); return 0; } static const struct dmi_system_id boot_interrupt_dmi_table[] = { /* * Systems to exclude from boot interrupt reroute quirks */ { .callback = dmi_disable_ioapicreroute, .ident = "ASUSTek Computer INC. M2N-LR", .matches = { DMI_MATCH(DMI_SYS_VENDOR, "ASUSTek Computer INC."), DMI_MATCH(DMI_PRODUCT_NAME, "M2N-LR"), }, }, {} }; /* * Boot interrupts on some chipsets cannot be turned off. For these chipsets, * remap the original interrupt in the Linux kernel to the boot interrupt, so * that a PCI device's interrupt handler is installed on the boot interrupt * line instead. 
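 * Note that the DMI table above is consulted first, so boards listed there
 * (such as the ASUS M2N-LR) opt out via noioapicreroute before any
 * rerouting happens.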
*/ static void quirk_reroute_to_boot_interrupts_intel(struct pci_dev *dev) { dmi_check_system(boot_interrupt_dmi_table); if (noioapicquirk || noioapicreroute) return; dev->irq_reroute_variant = INTEL_IRQ_REROUTE_VARIANT; pci_info(dev, "rerouting interrupts for [%04x:%04x]\n", dev->vendor, dev->device); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_1, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB2_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_1, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXHV, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_1, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_1, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB2_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_1, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXHV, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_0, quirk_reroute_to_boot_interrupts_intel); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_1, quirk_reroute_to_boot_interrupts_intel); /* * On some chipsets we can disable the generation of legacy INTx boot * interrupts. */ /* * IO-APIC1 on 6300ESB generates boot interrupts, see Intel order no * 300641-004US, section 5.7.3. * * Core IO on Xeon E5 1600/2600/4600, see Intel order no 326509-003. * Core IO on Xeon E5 v2, see Intel order no 329188-003. * Core IO on Xeon E7 v2, see Intel order no 329595-002. * Core IO on Xeon E5 v3, see Intel order no 330784-003. * Core IO on Xeon E7 v3, see Intel order no 332315-001US. * Core IO on Xeon E5 v4, see Intel order no 333810-002US. * Core IO on Xeon E7 v4, see Intel order no 332315-001US. * Core IO on Xeon D-1500, see Intel order no 332051-001. * Core IO on Xeon Scalable, see Intel order no 610950. 
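 *
 * Two register layouts are handled below: the 6300ESB uses a disable bit
 * in the IO-APIC ABAR word at 0x40, while the Xeon Core IO modules use a
 * bit in the CIPINTRC dword at 0x14C.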
*/ #define INTEL_6300_IOAPIC_ABAR 0x40 /* Bus 0, Dev 29, Func 5 */ #define INTEL_6300_DISABLE_BOOT_IRQ (1<<14) #define INTEL_CIPINTRC_CFG_OFFSET 0x14C /* Bus 0, Dev 5, Func 0 */ #define INTEL_CIPINTRC_DIS_INTX_ICH (1<<25) static void quirk_disable_intel_boot_interrupt(struct pci_dev *dev) { u16 pci_config_word; u32 pci_config_dword; if (noioapicquirk) return; switch (dev->device) { case PCI_DEVICE_ID_INTEL_ESB_10: pci_read_config_word(dev, INTEL_6300_IOAPIC_ABAR, &pci_config_word); pci_config_word |= INTEL_6300_DISABLE_BOOT_IRQ; pci_write_config_word(dev, INTEL_6300_IOAPIC_ABAR, pci_config_word); break; case 0x3c28: /* Xeon E5 1600/2600/4600 */ case 0x0e28: /* Xeon E5/E7 V2 */ case 0x2f28: /* Xeon E5/E7 V3,V4 */ case 0x6f28: /* Xeon D-1500 */ case 0x2034: /* Xeon Scalable Family */ pci_read_config_dword(dev, INTEL_CIPINTRC_CFG_OFFSET, &pci_config_dword); pci_config_dword |= INTEL_CIPINTRC_DIS_INTX_ICH; pci_write_config_dword(dev, INTEL_CIPINTRC_CFG_OFFSET, pci_config_dword); break; default: return; } pci_info(dev, "disabled boot interrupts on device [%04x:%04x]\n", dev->vendor, dev->device); } /* * Device 29 Func 5 Device IDs of IO-APIC * containing ABAR—APIC1 Alternate Base Address Register */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB_10, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB_10, quirk_disable_intel_boot_interrupt); /* * Device 5 Func 0 Device IDs of Core IO modules/hubs * containing Coherent Interface Protocol Interrupt Control * * Device IDs obtained from volume 2 datasheets of commented * families above. */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x3c28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0e28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2f28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x6f28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2034, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x3c28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x0e28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x2f28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x6f28, quirk_disable_intel_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x2034, quirk_disable_intel_boot_interrupt); /* Disable boot interrupts on HT-1000 */ #define BC_HT1000_FEATURE_REG 0x64 #define BC_HT1000_PIC_REGS_ENABLE (1<<0) #define BC_HT1000_MAP_IDX 0xC00 #define BC_HT1000_MAP_DATA 0xC01 static void quirk_disable_broadcom_boot_interrupt(struct pci_dev *dev) { u32 pci_config_dword; u8 irq; if (noioapicquirk) return; pci_read_config_dword(dev, BC_HT1000_FEATURE_REG, &pci_config_dword); pci_write_config_dword(dev, BC_HT1000_FEATURE_REG, pci_config_dword | BC_HT1000_PIC_REGS_ENABLE); for (irq = 0x10; irq < 0x10 + 32; irq++) { outb(irq, BC_HT1000_MAP_IDX); outb(0x00, BC_HT1000_MAP_DATA); } pci_write_config_dword(dev, BC_HT1000_FEATURE_REG, pci_config_dword); pci_info(dev, "disabled boot interrupts on device [%04x:%04x]\n", dev->vendor, dev->device); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_HT1000SB, quirk_disable_broadcom_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_HT1000SB, quirk_disable_broadcom_boot_interrupt); /* 
Disable boot interrupts on AMD and ATI chipsets */ /* * NOIOAMODE needs to be disabled to disable "boot interrupts". For AMD 8131 * rev. A0 and B0, NOIOAMODE needs to be disabled anyway to fix IO-APIC mode * (due to an erratum). */ #define AMD_813X_MISC 0x40 #define AMD_813X_NOIOAMODE (1<<0) #define AMD_813X_REV_B1 0x12 #define AMD_813X_REV_B2 0x13 static void quirk_disable_amd_813x_boot_interrupt(struct pci_dev *dev) { u32 pci_config_dword; if (noioapicquirk) return; if ((dev->revision == AMD_813X_REV_B1) || (dev->revision == AMD_813X_REV_B2)) return; pci_read_config_dword(dev, AMD_813X_MISC, &pci_config_dword); pci_config_dword &= ~AMD_813X_NOIOAMODE; pci_write_config_dword(dev, AMD_813X_MISC, pci_config_dword); pci_info(dev, "disabled boot interrupts on device [%04x:%04x]\n", dev->vendor, dev->device); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_disable_amd_813x_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_disable_amd_813x_boot_interrupt); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8132_BRIDGE, quirk_disable_amd_813x_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8132_BRIDGE, quirk_disable_amd_813x_boot_interrupt); #define AMD_8111_PCI_IRQ_ROUTING 0x56 static void quirk_disable_amd_8111_boot_interrupt(struct pci_dev *dev) { u16 pci_config_word; if (noioapicquirk) return; pci_read_config_word(dev, AMD_8111_PCI_IRQ_ROUTING, &pci_config_word); if (!pci_config_word) { pci_info(dev, "boot interrupts on device [%04x:%04x] already disabled\n", dev->vendor, dev->device); return; } pci_write_config_word(dev, AMD_8111_PCI_IRQ_ROUTING, 0); pci_info(dev, "disabled boot interrupts on device [%04x:%04x]\n", dev->vendor, dev->device); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8111_SMBUS, quirk_disable_amd_8111_boot_interrupt); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8111_SMBUS, quirk_disable_amd_8111_boot_interrupt); #endif /* CONFIG_X86_IO_APIC */ /* * Toshiba TC86C001 IDE controller reports the standard 8-byte BAR0 size * but the PIO transfers won't work if BAR0 falls at the odd 8 bytes. * Re-allocate the region if needed... */ static void quirk_tc86c001_ide(struct pci_dev *dev) { struct resource *r = &dev->resource[0]; if (r->start & 0x8) { r->flags |= IORESOURCE_UNSET; resource_set_range(r, 0, SZ_16); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA_2, PCI_DEVICE_ID_TOSHIBA_TC86C001_IDE, quirk_tc86c001_ide); /* * PLX PCI 9050 PCI Target bridge controller has an erratum that prevents the * local configuration registers accessible via BAR0 (memory) or BAR1 (i/o) * being read correctly if bit 7 of the base address is set. * The BAR0 or BAR1 region may be disabled (size 0) or enabled (size 128). * Re-allocate the regions to a 256-byte boundary if necessary. */ static void quirk_plx_pci9050(struct pci_dev *dev) { unsigned int bar; /* Fixed in revision 2 (PCI 9052). 
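 * Parts that already report revision 2 or later are left alone.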
*/ if (dev->revision >= 2) return; for (bar = 0; bar <= 1; bar++) if (pci_resource_len(dev, bar) == 0x80 && (pci_resource_start(dev, bar) & 0x80)) { struct resource *r = &dev->resource[bar]; pci_info(dev, "Re-allocating PLX PCI 9050 BAR %u to length 256 to avoid bit 7 bug\n", bar); r->flags |= IORESOURCE_UNSET; resource_set_range(r, 0, SZ_256); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_9050, quirk_plx_pci9050); /* * The following Meilhaus (vendor ID 0x1402) device IDs (amongst others) * may be using the PLX PCI 9050: 0x0630, 0x0940, 0x0950, 0x0960, 0x100b, * 0x1400, 0x140a, 0x140b, 0x14e0, 0x14ea, 0x14eb, 0x1604, 0x1608, 0x160c, * 0x168f, 0x2000, 0x2600, 0x3000, 0x810a, 0x810b. * * Currently, device IDs 0x2000 and 0x2600 are used by the Comedi "me_daq" * driver. */ DECLARE_PCI_FIXUP_HEADER(0x1402, 0x2000, quirk_plx_pci9050); DECLARE_PCI_FIXUP_HEADER(0x1402, 0x2600, quirk_plx_pci9050); static void quirk_netmos(struct pci_dev *dev) { unsigned int num_parallel = (dev->subsystem_device & 0xf0) >> 4; unsigned int num_serial = dev->subsystem_device & 0xf; /* * These Netmos parts are multiport serial devices with optional * parallel ports. Even when parallel ports are present, they * are identified as class SERIAL, which means the serial driver * will claim them. To prevent this, mark them as class OTHER. * These combo devices should be claimed by parport_serial. * * The subdevice ID is of the form 0x00PS, where <P> is the number * of parallel ports and <S> is the number of serial ports. */ switch (dev->device) { case PCI_DEVICE_ID_NETMOS_9835: /* Well, this rule doesn't hold for the following 9835 device */ if (dev->subsystem_vendor == PCI_VENDOR_ID_IBM && dev->subsystem_device == 0x0299) return; fallthrough; case PCI_DEVICE_ID_NETMOS_9735: case PCI_DEVICE_ID_NETMOS_9745: case PCI_DEVICE_ID_NETMOS_9845: case PCI_DEVICE_ID_NETMOS_9855: if (num_parallel) { pci_info(dev, "Netmos %04x (%u parallel, %u serial); changing class SERIAL to OTHER (use parport_serial)\n", dev->device, num_parallel, num_serial); dev->class = (PCI_CLASS_COMMUNICATION_OTHER << 8) | (dev->class & 0xff); } } } DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_NETMOS, PCI_ANY_ID, PCI_CLASS_COMMUNICATION_SERIAL, 8, quirk_netmos); static void quirk_e100_interrupt(struct pci_dev *dev) { u16 command, pmcsr; u8 __iomem *csr; u8 cmd_hi; switch (dev->device) { /* PCI IDs taken from drivers/net/e100.c */ case 0x1029: case 0x1030 ... 0x1034: case 0x1038 ... 0x103E: case 0x1050 ... 0x1057: case 0x1059: case 0x1064 ... 0x106B: case 0x1091 ... 0x1095: case 0x1209: case 0x1229: case 0x2449: case 0x2459: case 0x245D: case 0x27DC: break; default: return; } /* * Some firmware hands off the e100 with interrupts enabled, * which can cause a flood of interrupts if packets are * received before the driver attaches to the device. So * disable all e100 interrupts here. The driver will * re-enable them when it's ready. */ pci_read_config_word(dev, PCI_COMMAND, &command); if (!(command & PCI_COMMAND_MEMORY) || !pci_resource_start(dev, 0)) return; /* * Check that the device is in the D0 power state. If it's not, * there is no point to look any further. */ if (dev->pm_cap) { pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr); if ((pmcsr & PCI_PM_CTRL_STATE_MASK) != PCI_D0) return; } /* Convert from PCI bus to resource space. 
*/ csr = ioremap(pci_resource_start(dev, 0), 8); if (!csr) { pci_warn(dev, "Can't map e100 registers\n"); return; } cmd_hi = readb(csr + 3); if (cmd_hi == 0) { pci_warn(dev, "Firmware left e100 interrupts enabled; disabling\n"); writeb(1, csr + 3); } iounmap(csr); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, PCI_CLASS_NETWORK_ETHERNET, 8, quirk_e100_interrupt); /* * The 82575 and 82598 may experience data corruption issues when transitioning * out of L0S. To prevent this we need to disable L0S on the PCIe link. */ static void quirk_disable_aspm_l0s(struct pci_dev *dev) { pci_info(dev, "Disabling L0s\n"); pci_disable_link_state(dev, PCIE_LINK_STATE_L0S); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10a7, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10a9, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10b6, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c6, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c7, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c8, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10d6, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10db, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10dd, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10e1, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10ec, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s); static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev) { pci_info(dev, "Disabling ASPM L0s/L1\n"); pci_disable_link_state(dev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1); } /* * ASM1083/1085 PCIe-PCI bridge devices cause AER timeout errors on the * upstream PCIe root port when ASPM is enabled. At least L0s mode is affected; * disable both L0s and L1 for now to be safe. */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1); /* * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain * Link bit cleared after starting the link retrain process to allow this * process to finish. * * Affected devices: PI7C9X110, PI7C9X111SL, PI7C9X130. See also the * Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf. */ static void quirk_enable_clear_retrain_link(struct pci_dev *dev) { dev->clear_retrain_link = 1; pci_info(dev, "Enable PCIe Retrain Link quirk\n"); } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_PERICOM, 0xe110, quirk_enable_clear_retrain_link); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_PERICOM, 0xe111, quirk_enable_clear_retrain_link); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_PERICOM, 0xe130, quirk_enable_clear_retrain_link); static void fixup_rev1_53c810(struct pci_dev *dev) { u32 class = dev->class; /* * rev 1 ncr53c810 chips don't set the class at all which means * they don't get their resources remapped. Fix that here. 
*/ if (class) return; dev->class = PCI_CLASS_STORAGE_SCSI << 8; pci_info(dev, "NCR 53c810 rev 1 PCI class overridden (%#08x -> %#08x)\n", class, dev->class); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NCR, PCI_DEVICE_ID_NCR_53C810, fixup_rev1_53c810); /* Enable 1k I/O space granularity on the Intel P64H2 */ static void quirk_p64h2_1k_io(struct pci_dev *dev) { u16 en1k; pci_read_config_word(dev, 0x40, &en1k); if (en1k & 0x200) { pci_info(dev, "Enable I/O Space to 1KB granularity\n"); dev->io_window_1k = 1; } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1460, quirk_p64h2_1k_io); /* * Under some circumstances, AER is not linked with extended capabilities. * Force it to be linked by setting the corresponding control bit in the * config space. */ static void quirk_nvidia_ck804_pcie_aer_ext_cap(struct pci_dev *dev) { uint8_t b; if (pci_read_config_byte(dev, 0xf41, &b) == 0) { if (!(b & 0x20)) { pci_write_config_byte(dev, 0xf41, b | 0x20); pci_info(dev, "Linking AER extended capability\n"); } } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_CK804_PCIE, quirk_nvidia_ck804_pcie_aer_ext_cap); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_CK804_PCIE, quirk_nvidia_ck804_pcie_aer_ext_cap); static void quirk_via_cx700_pci_parking_caching(struct pci_dev *dev) { /* * Disable PCI Bus Parking and PCI Master read caching on CX700 * which causes unspecified timing errors with a VT6212L on the PCI * bus leading to USB2.0 packet loss. * * This quirk is only enabled if a second (on the external PCI bus) * VT6212L is found -- the CX700 core itself also contains a USB * host controller with the same PCI ID as the VT6212L. */ /* Count VT6212L instances */ struct pci_dev *p = pci_get_device(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8235_USB_2, NULL); uint8_t b; /* * p should contain the first (internal) VT6212L -- see if we have * an external one by searching again. */ p = pci_get_device(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8235_USB_2, p); if (!p) return; pci_dev_put(p); if (pci_read_config_byte(dev, 0x76, &b) == 0) { if (b & 0x40) { /* Turn off PCI Bus Parking */ pci_write_config_byte(dev, 0x76, b ^ 0x40); pci_info(dev, "Disabling VIA CX700 PCI parking\n"); } } if (pci_read_config_byte(dev, 0x72, &b) == 0) { if (b != 0) { /* Turn off PCI Master read caching */ pci_write_config_byte(dev, 0x72, 0x0); /* Set PCI Master Bus time-out to "1x16 PCLK" */ pci_write_config_byte(dev, 0x75, 0x1); /* Disable "Read FIFO Timer" */ pci_write_config_byte(dev, 0x77, 0x0); pci_info(dev, "Disabling VIA CX700 PCI caching\n"); } } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0x324e, quirk_via_cx700_pci_parking_caching); static void quirk_brcm_5719_limit_mrrs(struct pci_dev *dev) { u32 rev; pci_read_config_dword(dev, 0xf4, &rev); /* Only CAP the MRRS if the device is a 5719 A0 */ if (rev == 0x05719000) { int readrq = pcie_get_readrq(dev); if (readrq > 2048) pcie_set_readrq(dev, 2048); } } DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5719, quirk_brcm_5719_limit_mrrs); /* * Originally in EDAC sources for i82875P: Intel tells BIOS developers to * hide device 6 which configures the overflow device access containing the * DRBs - this is where we expose device 6. 
* http://www.x86-secret.com/articles/tweak/pat/patsecrets-2.htm */ static void quirk_unhide_mch_dev6(struct pci_dev *dev) { u8 reg; if (pci_read_config_byte(dev, 0xF4, &reg) == 0 && !(reg & 0x02)) { pci_info(dev, "Enabling MCH 'Overflow' Device\n"); pci_write_config_byte(dev, 0xF4, reg | 0x02); } } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82865_HB, quirk_unhide_mch_dev6); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82875_HB, quirk_unhide_mch_dev6); #ifdef CONFIG_PCI_MSI /* * Some chipsets do not support MSI. We cannot easily rely on setting * PCI_BUS_FLAGS_NO_MSI in the chipset's bus flags because the chipset may * also control other buses that Linux is not aware of. Instead of setting * the flag on all buses in the machine, simply disable MSI globally. */ static void quirk_disable_all_msi(struct pci_dev *dev) { pci_no_msi(); pci_warn(dev, "MSI quirk detected; MSI disabled\n"); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_GCNB_LE, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RS400_200, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RS480, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_VT3336, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_VT3351, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_VT3364, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8380_0, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SI, 0x0761, quirk_disable_all_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SAMSUNG, 0xa5e3, quirk_disable_all_msi); /* Disable MSI on chipsets that are known to not support it */ static void quirk_disable_msi(struct pci_dev *dev) { if (dev->subordinate) { pci_warn(dev, "MSI quirk detected; subordinate MSI disabled\n"); dev->subordinate->bus_flags |= PCI_BUS_FLAGS_NO_MSI; } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_disable_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0xa238, quirk_disable_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x5a3f, quirk_disable_msi); /* * The APC bridge device in AMD 780 family northbridges has some random * OEM subsystem ID in its vendor ID register (erratum 18), so instead * we use the possible vendor/device IDs of the host bridge for the * declared quirk, and search for the APC bridge by slot number. */ static void quirk_amd_780_apc_msi(struct pci_dev *host_bridge) { struct pci_dev *apc_bridge; apc_bridge = pci_get_slot(host_bridge->bus, PCI_DEVFN(1, 0)); if (apc_bridge) { if (apc_bridge->device == 0x9602) quirk_disable_msi(apc_bridge); pci_dev_put(apc_bridge); } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x9600, quirk_amd_780_apc_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x9601, quirk_amd_780_apc_msi); /* * Go through the list of HyperTransport capabilities and return 1 if an HT * MSI capability is found and enabled. */ static int msi_ht_cap_enabled(struct pci_dev *dev) { int pos, ttl = PCI_FIND_CAP_TTL; pos = pci_find_ht_capability(dev, HT_CAPTYPE_MSI_MAPPING); while (pos && ttl--) { u8 flags; if (pci_read_config_byte(dev, pos + HT_MSI_FLAGS, &flags) == 0) { pci_info(dev, "Found %s HT MSI Mapping\n", flags & HT_MSI_FLAGS_ENABLE ?
"enabled" : "disabled"); return (flags & HT_MSI_FLAGS_ENABLE) != 0; } pos = pci_find_next_ht_capability(dev, pos, HT_CAPTYPE_MSI_MAPPING); } return 0; } /* Check the HyperTransport MSI mapping to know whether MSI is enabled or not */ static void quirk_msi_ht_cap(struct pci_dev *dev) { if (!msi_ht_cap_enabled(dev)) quirk_disable_msi(dev); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_HT2000_PCIE, quirk_msi_ht_cap); /* * The nVidia CK804 chipset may have 2 HT MSI mappings. MSI is supported * if the MSI capability is set in any of these mappings. */ static void quirk_nvidia_ck804_msi_ht_cap(struct pci_dev *dev) { struct pci_dev *pdev; /* * Check HT MSI cap on this chipset and the root one. A single one * having MSI is enough to be sure that MSI is supported. */ pdev = pci_get_slot(dev->bus, 0); if (!pdev) return; if (!msi_ht_cap_enabled(pdev)) quirk_msi_ht_cap(dev); pci_dev_put(pdev); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_CK804_PCIE, quirk_nvidia_ck804_msi_ht_cap); /* Force enable MSI mapping capability on HT bridges */ static void ht_enable_msi_mapping(struct pci_dev *dev) { int pos, ttl = PCI_FIND_CAP_TTL; pos = pci_find_ht_capability(dev, HT_CAPTYPE_MSI_MAPPING); while (pos && ttl--) { u8 flags; if (pci_read_config_byte(dev, pos + HT_MSI_FLAGS, &flags) == 0) { pci_info(dev, "Enabling HT MSI Mapping\n"); pci_write_config_byte(dev, pos + HT_MSI_FLAGS, flags | HT_MSI_FLAGS_ENABLE); } pos = pci_find_next_ht_capability(dev, pos, HT_CAPTYPE_MSI_MAPPING); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_HT1000_PXB, ht_enable_msi_mapping); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8132_BRIDGE, ht_enable_msi_mapping); /* * The P5N32-SLI motherboards from Asus have a problem with MSI * for the MCP55 NIC. It is not yet determined whether the MSI problem * also affects other devices. As for now, turn off MSI for this device. */ static void nvenet_msi_disable(struct pci_dev *dev) { const char *board_name = dmi_get_system_info(DMI_BOARD_NAME); if (board_name && (strstr(board_name, "P5N32-SLI PREMIUM") || strstr(board_name, "P5N32-E SLI"))) { pci_info(dev, "Disabling MSI for MCP55 NIC on P5N32-SLI\n"); dev->no_msi = 1; } } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_15, nvenet_msi_disable); /* * PCIe spec r6.0 sec 6.1.4.3 says that if MSI/MSI-X is enabled, the device * can't use INTx interrupts. Tegra's PCIe Root Ports don't generate MSI * interrupts for PME and AER events; instead only INTx interrupts are * generated. Though Tegra's PCIe Root Ports can generate MSI interrupts * for other events, since PCIe specification doesn't support using a mix of * INTx and MSI/MSI-X, it is required to disable MSI interrupts to avoid port * service drivers registering their respective ISRs for MSIs. 
*/ static void pci_quirk_nvidia_tegra_disable_rp_msi(struct pci_dev *dev) { dev->no_msi = 1; } DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x1ad0, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x1ad1, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x1ad2, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0bf0, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0bf1, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e1c, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e1d, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e12, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e13, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0fae, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0faf, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x10e5, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x10e6, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x229a, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x229c, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_NVIDIA, 0x229e, PCI_CLASS_BRIDGE_PCI, 8, pci_quirk_nvidia_tegra_disable_rp_msi); /* * Some versions of the MCP55 bridge from Nvidia have a legacy IRQ routing * config register. This register controls the routing of legacy * interrupts from devices that route through the MCP55. If this register * is misprogrammed, interrupts are sent only to the BSP, unlike * conventional systems where the IRQ is broadcast to all online CPUs. A * misprogrammed register also prevents kdump from booting properly, so * make sure it is set correctly. * Note that this is an undocumented register.
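* * Concretely, the fixup below clears bits 2 and 15 of config dword 0x74 * (cfg &= ~((1 << 2) | (1 << 15)), i.e. ~0x00008004); either bit being set * indicates the BSP-only routing described above.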
*/ static void nvbridge_check_legacy_irq_routing(struct pci_dev *dev) { u32 cfg; if (!pci_find_capability(dev, PCI_CAP_ID_HT)) return; pci_read_config_dword(dev, 0x74, &cfg); if (cfg & ((1 << 2) | (1 << 15))) { pr_info("Rewriting IRQ routing register on MCP55\n"); cfg &= ~((1 << 2) | (1 << 15)); pci_write_config_dword(dev, 0x74, cfg); } } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_MCP55_BRIDGE_V0, nvbridge_check_legacy_irq_routing); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_MCP55_BRIDGE_V4, nvbridge_check_legacy_irq_routing); static int ht_check_msi_mapping(struct pci_dev *dev) { int pos, ttl = PCI_FIND_CAP_TTL; int found = 0; /* Check whether this device has an HT MSI capability, and whether it is enabled */ pos = pci_find_ht_capability(dev, HT_CAPTYPE_MSI_MAPPING); while (pos && ttl--) { u8 flags; if (found < 1) found = 1; if (pci_read_config_byte(dev, pos + HT_MSI_FLAGS, &flags) == 0) { if (flags & HT_MSI_FLAGS_ENABLE) { if (found < 2) { found = 2; break; } } } pos = pci_find_next_ht_capability(dev, pos, HT_CAPTYPE_MSI_MAPPING); } return found; } static int host_bridge_with_leaf(struct pci_dev *host_bridge) { struct pci_dev *dev; int pos; int i, dev_no; int found = 0; dev_no = host_bridge->devfn >> 3; for (i = dev_no + 1; i < 0x20; i++) { dev = pci_get_slot(host_bridge->bus, PCI_DEVFN(i, 0)); if (!dev) continue; /* Found the next host bridge? */ pos = pci_find_ht_capability(dev, HT_CAPTYPE_SLAVE); if (pos != 0) { pci_dev_put(dev); break; } if (ht_check_msi_mapping(dev)) { found = 1; pci_dev_put(dev); break; } pci_dev_put(dev); } return found; } #define PCI_HT_CAP_SLAVE_CTRL0 4 /* link control */ #define PCI_HT_CAP_SLAVE_CTRL1 8 /* link control to */ static int is_end_of_ht_chain(struct pci_dev *dev) { int pos, ctrl_off; int end = 0; u16 flags, ctrl; pos = pci_find_ht_capability(dev, HT_CAPTYPE_SLAVE); if (!pos) goto out; pci_read_config_word(dev, pos + PCI_CAP_FLAGS, &flags); ctrl_off = ((flags >> 10) & 1) ? PCI_HT_CAP_SLAVE_CTRL0 : PCI_HT_CAP_SLAVE_CTRL1; pci_read_config_word(dev, pos + ctrl_off, &ctrl); if (ctrl & (1 << 6)) end = 1; out: return end; } static void nv_ht_enable_msi_mapping(struct pci_dev *dev) { struct pci_dev *host_bridge; int pos; int i, dev_no; int found = 0; dev_no = dev->devfn >> 3; for (i = dev_no; i >= 0; i--) { host_bridge = pci_get_slot(dev->bus, PCI_DEVFN(i, 0)); if (!host_bridge) continue; pos = pci_find_ht_capability(host_bridge, HT_CAPTYPE_SLAVE); if (pos != 0) { found = 1; break; } pci_dev_put(host_bridge); } if (!found) return; /* Don't enable an end device or a host bridge with leaf directly here */ if (host_bridge == dev && is_end_of_ht_chain(host_bridge) && host_bridge_with_leaf(host_bridge)) goto out; /* the root bridge already did that
*/ if (msi_ht_cap_enabled(host_bridge)) goto out; ht_enable_msi_mapping(dev); out: pci_dev_put(host_bridge); } static void ht_disable_msi_mapping(struct pci_dev *dev) { int pos, ttl = PCI_FIND_CAP_TTL; pos = pci_find_ht_capability(dev, HT_CAPTYPE_MSI_MAPPING); while (pos && ttl--) { u8 flags; if (pci_read_config_byte(dev, pos + HT_MSI_FLAGS, &flags) == 0) { pci_info(dev, "Disabling HT MSI Mapping\n"); pci_write_config_byte(dev, pos + HT_MSI_FLAGS, flags & ~HT_MSI_FLAGS_ENABLE); } pos = pci_find_next_ht_capability(dev, pos, HT_CAPTYPE_MSI_MAPPING); } } static void __nv_msi_ht_cap_quirk(struct pci_dev *dev, int all) { struct pci_dev *host_bridge; int pos; int found; if (!pci_msi_enabled()) return; /* Check whether there is an HT MSI cap, and whether it is enabled, on this device */ found = ht_check_msi_mapping(dev); /* no HT MSI CAP */ if (found == 0) return; /* * HT MSI mapping should be disabled on devices that are below * a non-HyperTransport host bridge. Locate the host bridge. */ host_bridge = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus), 0, PCI_DEVFN(0, 0)); if (host_bridge == NULL) { pci_warn(dev, "nv_msi_ht_cap_quirk didn't locate host bridge\n"); return; } pos = pci_find_ht_capability(host_bridge, HT_CAPTYPE_SLAVE); if (pos != 0) { /* The host bridge is an HT bridge */ if (found == 1) { /* it is not enabled, try to enable it */ if (all) ht_enable_msi_mapping(dev); else nv_ht_enable_msi_mapping(dev); } goto out; } /* HT MSI is not enabled */ if (found == 1) goto out; /* The host bridge is not an HT bridge; disable HT MSI mapping on this device */ ht_disable_msi_mapping(dev); out: pci_dev_put(host_bridge); } static void nv_msi_ht_cap_quirk_all(struct pci_dev *dev) { return __nv_msi_ht_cap_quirk(dev, 1); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AL, PCI_ANY_ID, nv_msi_ht_cap_quirk_all); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID, nv_msi_ht_cap_quirk_all); static void nv_msi_ht_cap_quirk_leaf(struct pci_dev *dev) { return __nv_msi_ht_cap_quirk(dev, 0); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, nv_msi_ht_cap_quirk_leaf); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, nv_msi_ht_cap_quirk_leaf); static void quirk_msi_intx_disable_bug(struct pci_dev *dev) { dev->dev_flags |= PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG; } static void quirk_msi_intx_disable_ati_bug(struct pci_dev *dev) { struct pci_dev *p; /* * The SB700 MSI issue is fixed at the HW level from revision A21; * we need to check the PCI REVISION ID of the SMBus controller to * get the SB700 revision.
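* * In the check below, SMBus controller revisions 0x30 through 0x3A * (>= 0x30 and < 0x3B) identify the affected SB700 steppings; revision * 0x3B and later presumably correspond to A21 and newer, where the issue * is fixed in hardware.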
*/ p = pci_get_device(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS, NULL); if (!p) return; if ((p->revision < 0x3B) && (p->revision >= 0x30)) dev->dev_flags |= PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG; pci_dev_put(p); } static void quirk_msi_intx_disable_qca_bug(struct pci_dev *dev) { /* AR816X/AR817X/E210X MSI is fixed at HW level from revision 0x18 */ if (dev->revision < 0x18) { pci_info(dev, "set MSI_INTX_DISABLE_BUG flag\n"); dev->dev_flags |= PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG; } } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5780, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5780S, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5714, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5714S, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5715, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5715S, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4390, quirk_msi_intx_disable_ati_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4391, quirk_msi_intx_disable_ati_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4392, quirk_msi_intx_disable_ati_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4393, quirk_msi_intx_disable_ati_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4394, quirk_msi_intx_disable_ati_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4373, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4374, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4375, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x1062, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x1063, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x2060, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x2062, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x1073, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x1083, quirk_msi_intx_disable_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x1090, quirk_msi_intx_disable_qca_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x1091, quirk_msi_intx_disable_qca_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x10a0, quirk_msi_intx_disable_qca_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x10a1, quirk_msi_intx_disable_qca_bug); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0xe091, quirk_msi_intx_disable_qca_bug); /* * Amazon's Annapurna Labs 1c36:0031 Root Ports don't support MSI-X, so it * should be disabled on platforms where the device (mistakenly) advertises it. * * Notice that this quirk also disables MSI (which may work, but hasn't been * tested), since currently there is no standard way to disable only MSI-X. * * The 0031 device ID is reused for other non-Root Port device types; * therefore the quirk is registered for the PCI_CLASS_BRIDGE_PCI class.
*/ static void quirk_al_msi_disable(struct pci_dev *dev) { dev->no_msi = 1; pci_warn(dev, "Disabling MSI/MSI-X\n"); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, PCI_CLASS_BRIDGE_PCI, 8, quirk_al_msi_disable); #endif /* CONFIG_PCI_MSI */ /* * Allow manual resource allocation for PCI hotplug bridges via * pci=hpmemsize=nnM and pci=hpiosize=nnM parameters. For some PCI-PCI * hotplug bridges, like the PLX 6254 (former HINT HB6), the kernel fails to * allocate resources when a hotplug device is inserted and the PCI bus is * rescanned. */ static void quirk_hotplug_bridge(struct pci_dev *dev) { dev->is_hotplug_bridge = 1; } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge); /* * This is a quirk for the Ricoh MMC controller found as a part of some * multifunction chips. * * This is very similar to, and based on, the ricoh_mmc driver written by * Philip Langdale. Thank you for these magic sequences. * * These chips implement the four main memory card controllers (SD, MMC, * MS, xD) and one or both of CardBus or FireWire. * * It happens that they implement SD and MMC support as separate * controllers (and PCI functions). The Linux SDHCI driver supports MMC * cards but the chip detects MMC cards in hardware and directs them to the * MMC controller - so the SDHCI driver never sees them. * * To get around this, we must disable the useless MMC controller. At that * point, the SDHCI controller will start seeing them. It seems to be the * case that the relevant PCI registers to deactivate the MMC controller * live on PCI function 0, which might be the CardBus controller or the * FireWire controller, depending on the particular chip in question. * * This has to be done early, because as soon as we disable the MMC * controller other PCI functions shift up one level, e.g. function #2 * becomes function #1, and this will confuse the PCI core. */ #ifdef CONFIG_MMC_RICOH_MMC static void ricoh_mmc_fixup_rl5c476(struct pci_dev *dev) { u8 write_enable; u8 write_target; u8 disable; /* * Disable via the CardBus interface * * This must be done via function #0 */ if (PCI_FUNC(dev->devfn)) return; pci_read_config_byte(dev, 0xB7, &disable); if (disable & 0x02) return; pci_read_config_byte(dev, 0x8E, &write_enable); pci_write_config_byte(dev, 0x8E, 0xAA); pci_read_config_byte(dev, 0x8D, &write_target); pci_write_config_byte(dev, 0x8D, 0xB7); pci_write_config_byte(dev, 0xB7, disable | 0x02); pci_write_config_byte(dev, 0x8E, write_enable); pci_write_config_byte(dev, 0x8D, write_target); pci_notice(dev, "proprietary Ricoh MMC controller disabled (via CardBus function)\n"); pci_notice(dev, "MMC cards are now supported by standard SDHCI controller\n"); } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_RL5C476, ricoh_mmc_fixup_rl5c476); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_RL5C476, ricoh_mmc_fixup_rl5c476); static void ricoh_mmc_fixup_r5c832(struct pci_dev *dev) { u8 write_enable; u8 disable; /* * Disable via the FireWire interface * * This must be done via function #0 */ if (PCI_FUNC(dev->devfn)) return; /* * RICOH 0xe822 and 0xe823 SD/MMC card readers fail to recognize * certain types of SD/MMC cards. Lowering the SD base clock * frequency from 200 MHz to 50 MHz fixes this issue.
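* * The registers involved are write-protected: each is unlocked by first * writing its key register (0xf9 for 0x150, 0xfc for 0xe1) and locked * again afterwards, which is the pattern the keyed write sequence below * follows: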
* * 0x150 - SD2.0 mode enable for changing base clock * frequency to 50 MHz * 0xe1 - Base clock frequency * 0x32 - 50 MHz new clock frequency * 0xf9 - Key register for 0x150 * 0xfc - Key register for 0xe1 */ if (dev->device == PCI_DEVICE_ID_RICOH_R5CE822 || dev->device == PCI_DEVICE_ID_RICOH_R5CE823) { pci_write_config_byte(dev, 0xf9, 0xfc); pci_write_config_byte(dev, 0x150, 0x10); pci_write_config_byte(dev, 0xf9, 0x00); pci_write_config_byte(dev, 0xfc, 0x01); pci_write_config_byte(dev, 0xe1, 0x32); pci_write_config_byte(dev, 0xfc, 0x00); pci_notice(dev, "MMC controller base frequency changed to 50 MHz.\n"); } pci_read_config_byte(dev, 0xCB, &disable); if (disable & 0x02) return; pci_read_config_byte(dev, 0xCA, &write_enable); pci_write_config_byte(dev, 0xCA, 0x57); pci_write_config_byte(dev, 0xCB, disable | 0x02); pci_write_config_byte(dev, 0xCA, write_enable); pci_notice(dev, "proprietary Ricoh MMC controller disabled (via FireWire function)\n"); pci_notice(dev, "MMC cards are now supported by standard SDHCI controller\n"); } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_mmc_fixup_r5c832); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_mmc_fixup_r5c832); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5CE822, ricoh_mmc_fixup_r5c832); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5CE822, ricoh_mmc_fixup_r5c832); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5CE823, ricoh_mmc_fixup_r5c832); DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5CE823, ricoh_mmc_fixup_r5c832); #endif /* CONFIG_MMC_RICOH_MMC */ #ifdef CONFIG_DMAR_TABLE #define VTUNCERRMSK_REG 0x1ac #define VTD_MSK_SPEC_ERRORS (1 << 31) /* * This is a quirk for masking VT-d spec-defined errors to platform error * handling logic. Without this, platforms using Intel 7500, 5500 chipsets * (and the derivative chipsets like X58, etc.) seem to generate NMI/SMI * (based on the RAS config settings of the platform) when a VT-d fault * happens. The resulting SMI caused the system to hang. * * VT-d spec-related errors are already handled by the VT-d OS code, so no * need to report the same error through other channels. */ static void vtd_mask_spec_errors(struct pci_dev *dev) { u32 word; pci_read_config_dword(dev, VTUNCERRMSK_REG, &word); pci_write_config_dword(dev, VTUNCERRMSK_REG, word | VTD_MSK_SPEC_ERRORS); } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, vtd_mask_spec_errors); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x3c28, vtd_mask_spec_errors); #endif static void fixup_ti816x_class(struct pci_dev *dev) { u32 class = dev->class; /* TI 816x devices do not have class code set when in PCIe boot mode */ dev->class = PCI_CLASS_MULTIMEDIA_VIDEO << 8; pci_info(dev, "PCI class overridden (%#08x -> %#08x)\n", class, dev->class); } DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_TI, 0xb800, PCI_CLASS_NOT_DEFINED, 8, fixup_ti816x_class); /* * Some PCIe devices do not work reliably with the claimed maximum * payload size supported.
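* (The pcie_mpss encoding used below matches the PCIe DEVCAP * Max_Payload_Size field: payload = 128 << pcie_mpss bytes, so the value 1 * caps these devices at 256 bytes.)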
*/ static void fixup_mpss_256(struct pci_dev *dev) { dev->pcie_mpss = 1; /* 256 bytes */ } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOLARFLARE, PCI_DEVICE_ID_SOLARFLARE_SFC4000A_0, fixup_mpss_256); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOLARFLARE, PCI_DEVICE_ID_SOLARFLARE_SFC4000A_1, fixup_mpss_256); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOLARFLARE, PCI_DEVICE_ID_SOLARFLARE_SFC4000B, fixup_mpss_256); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_ASMEDIA, 0x0612, fixup_mpss_256); /* * Intel 5000 and 5100 Memory controllers have an erratum with read completion * coalescing (which is enabled by default on some BIOSes) and MPS of 256B. * Since there is no way of knowing what the PCIe MPS on each fabric will be * until all of the devices are discovered and buses walked, read completion * coalescing must be disabled. Unfortunately, it cannot be re-enabled because * it is possible to hotplug a device with MPS of 256B. */ static void quirk_intel_mc_errata(struct pci_dev *dev) { int err; u16 rcc; if (pcie_bus_config == PCIE_BUS_TUNE_OFF || pcie_bus_config == PCIE_BUS_DEFAULT) return; /* * Intel erratum specifies bits to change but does not say what * they are. Keeping them magical until such time as the registers * and values can be explained. */ err = pci_read_config_word(dev, 0x48, &rcc); if (err) { pci_err(dev, "Error attempting to read the read completion coalescing register\n"); return; } if (!(rcc & (1 << 10))) return; rcc &= ~(1 << 10); err = pci_write_config_word(dev, 0x48, rcc); if (err) { pci_err(dev, "Error attempting to write the read completion coalescing register\n"); return; } pr_info_once("Read completion coalescing disabled due to hardware erratum relating to 256B MPS\n"); } /* Intel 5000 series memory controllers and ports 2-7 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25c0, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d0, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d4, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d8, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e2, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e3, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e4, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e5, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e6, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e7, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f7, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f8, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f9, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25fa, quirk_intel_mc_errata); /* Intel 5100 series memory controllers and ports 2-7 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65c0, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e2, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e3, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e4, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e5, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e6, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e7, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f7, quirk_intel_mc_errata); 
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f8, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f9, quirk_intel_mc_errata); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65fa, quirk_intel_mc_errata); /* * Ivytown NTB BAR sizes are misreported by the hardware due to an erratum. * To work around this, query the size it should be configured to by the * device and modify the resource end to correspond to this new size. */ static void quirk_intel_ntb(struct pci_dev *dev) { int rc; u8 val; rc = pci_read_config_byte(dev, 0x00D0, &val); if (rc) return; resource_set_size(&dev->resource[2], (resource_size_t)1 << val); rc = pci_read_config_byte(dev, 0x00D1, &val); if (rc) return; resource_set_size(&dev->resource[4], (resource_size_t)1 << val); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0e08, quirk_intel_ntb); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0e0d, quirk_intel_ntb); /* * Some BIOS implementations leave the Intel GPU interrupts enabled, even * though no one is handling them (e.g., if the i915 driver is never * loaded). Additionally the interrupt destination is not set up properly * and the interrupt ends up -somewhere-. * * These spurious interrupts are "sticky" and the kernel disables the * (shared) interrupt line after 100,000+ generated interrupts. * * Fix it by disabling the still enabled interrupts. This resolves crashes * often seen on monitor unplug. */ #define I915_DEIER_REG 0x4400c static void disable_igfx_irq(struct pci_dev *dev) { void __iomem *regs = pci_iomap(dev, 0, 0); if (regs == NULL) { pci_warn(dev, "igfx quirk: Can't iomap PCI device\n"); return; } /* Check if any interrupt line is still enabled */ if (readl(regs + I915_DEIER_REG) != 0) { pci_warn(dev, "BIOS left Intel GPU interrupts enabled; disabling\n"); writel(0, regs + I915_DEIER_REG); } pci_iounmap(dev, regs); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0042, disable_igfx_irq); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0046, disable_igfx_irq); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x004a, disable_igfx_irq); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0106, disable_igfx_irq); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq); /* * PCI devices which are on Intel chips can skip the 10ms delay * before entering D3 mode. 
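* (dev->d3hot_delay defaults to the 10 ms D3hot settle time the PCI PM * spec requires (PCI_PM_D3HOT_WAIT); these parts are known not to need * it, so the quirk below zeroes the delay.)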
*/ static void quirk_remove_d3hot_delay(struct pci_dev *dev) { dev->d3hot_delay = 0; } /* C600 Series devices do not need 10ms d3hot_delay */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0412, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0c00, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0c0c, quirk_remove_d3hot_delay); /* Lynxpoint-H PCH devices do not need 10ms d3hot_delay */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c02, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c18, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c1c, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c20, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c22, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c26, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c2d, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c31, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c3a, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c3d, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x8c4e, quirk_remove_d3hot_delay); /* Intel Cherrytrail devices do not need 10ms d3hot_delay */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2280, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2298, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x229c, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x22b0, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x22b5, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x22b7, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x22b8, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x22d8, quirk_remove_d3hot_delay); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x22dc, quirk_remove_d3hot_delay); /* * Some devices may pass our check in pci_intx_mask_supported() if * PCI_COMMAND_INTX_DISABLE works even though they do not properly * support this feature. */ static void quirk_broken_intx_masking(struct pci_dev *dev) { dev->broken_intx_masking = 1; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x0030, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */ quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(0x1b7c, 0x0004, /* Ceton InfiniTV4 */ quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CREATIVE, PCI_DEVICE_ID_CREATIVE_20K2, quirk_broken_intx_masking); /* * Realtek RTL8169 PCI Gigabit Ethernet Controller (rev 10) * Subsystem: Realtek RTL8169/8110 Family PCI Gigabit Ethernet NIC * * RTL8110SC - Fails under PCI device assignment using DisINTx masking. */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, 0x8169, quirk_broken_intx_masking); /* * Intel i40e (XL710/X710) 10/20/40GbE NICs all have broken INTx masking: * DisINTx can be set, but the interrupt status bit is non-functional.
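* (INTx masking needs both a writable PCI_COMMAND_INTX_DISABLE bit and a * PCI_STATUS_INTERRUPT bit that reflects pending interrupts; on these * devices the command bit sticks but the status bit never asserts, so * pending interrupts cannot be observed.)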
*/ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1572, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1574, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1580, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1581, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1583, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1584, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1585, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1586, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1587, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1588, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1589, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x158a, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x158b, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x37d0, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x37d1, quirk_broken_intx_masking); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x37d2, quirk_broken_intx_masking); static u16 mellanox_broken_intx_devs[] = { PCI_DEVICE_ID_MELLANOX_HERMON_SDR, PCI_DEVICE_ID_MELLANOX_HERMON_DDR, PCI_DEVICE_ID_MELLANOX_HERMON_QDR, PCI_DEVICE_ID_MELLANOX_HERMON_DDR_GEN2, PCI_DEVICE_ID_MELLANOX_HERMON_QDR_GEN2, PCI_DEVICE_ID_MELLANOX_HERMON_EN, PCI_DEVICE_ID_MELLANOX_HERMON_EN_GEN2, PCI_DEVICE_ID_MELLANOX_CONNECTX_EN, PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_T_GEN2, PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_GEN2, PCI_DEVICE_ID_MELLANOX_CONNECTX_EN_5_GEN2, PCI_DEVICE_ID_MELLANOX_CONNECTX2, PCI_DEVICE_ID_MELLANOX_CONNECTX3, PCI_DEVICE_ID_MELLANOX_CONNECTX3_PRO, }; #define CONNECTX_4_CURR_MAX_MINOR 99 #define CONNECTX_4_INTX_SUPPORT_MINOR 14 /* * Check ConnectX-4/LX FW version to see if it supports legacy interrupts. * If so, don't mark it as broken. * FW minor > 99 means older FW version format and no INTx masking support. * FW minor < 14 means new FW version format and no INTx masking support. */ static void mellanox_check_broken_intx_masking(struct pci_dev *pdev) { __be32 __iomem *fw_ver; u16 fw_major; u16 fw_minor; u16 fw_subminor; u32 fw_maj_min; u32 fw_sub_min; int i; for (i = 0; i < ARRAY_SIZE(mellanox_broken_intx_devs); i++) { if (pdev->device == mellanox_broken_intx_devs[i]) { pdev->broken_intx_masking = 1; return; } } /* * Getting here means Connect-IB cards and up. Connect-IB has no INTx * support so shouldn't be checked further */ if (pdev->device == PCI_DEVICE_ID_MELLANOX_CONNECTIB) return; if (pdev->device != PCI_DEVICE_ID_MELLANOX_CONNECTX4 && pdev->device != PCI_DEVICE_ID_MELLANOX_CONNECTX4_LX) return; /* For ConnectX-4 and ConnectX-4LX, need to check FW support */ if (pci_enable_device_mem(pdev)) { pci_warn(pdev, "Can't enable device memory\n"); return; } fw_ver = ioremap(pci_resource_start(pdev, 0), 4); if (!fw_ver) { pci_warn(pdev, "Can't map ConnectX-4 initialization segment\n"); goto out; } /* Reading from resource space should be 32b aligned */ fw_maj_min = ioread32be(fw_ver); fw_sub_min = ioread32be(fw_ver + 1); fw_major = fw_maj_min & 0xffff; fw_minor = fw_maj_min >> 16; fw_subminor = fw_sub_min & 0xffff; if (fw_minor > CONNECTX_4_CURR_MAX_MINOR || fw_minor < CONNECTX_4_INTX_SUPPORT_MINOR) { pci_warn(pdev, "ConnectX-4: FW %u.%u.%u doesn't support INTx masking, disabling. 
Please upgrade FW to %d.14.1100 and up for INTx support\n", fw_major, fw_minor, fw_subminor, pdev->device == PCI_DEVICE_ID_MELLANOX_CONNECTX4 ? 12 : 14); pdev->broken_intx_masking = 1; } iounmap(fw_ver); out: pci_disable_device(pdev); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX, PCI_ANY_ID, mellanox_check_broken_intx_masking); static void quirk_no_bus_reset(struct pci_dev *dev) { dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET; } /* * Some NVIDIA GPU devices do not work with bus reset; SBR must be * prevented for the affected devices. */ static void quirk_nvidia_no_bus_reset(struct pci_dev *dev) { if ((dev->device & 0xffc0) == 0x2340) quirk_no_bus_reset(dev); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, quirk_nvidia_no_bus_reset); /* * Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset. * The device throws a Link Down error on AER-capable systems and, * regardless of AER, the device's config space is never accessible again; * attempts to access it typically cause the system to hang or reset. * https://lore.kernel.org/r/20140923210318.498dacbd@dualc.maya.org/ */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003e, quirk_no_bus_reset); /* * Root Ports on some Cavium CN8xxx chips do not successfully complete a bus * reset when used with certain child devices. After the reset, config * accesses to the child may fail. */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset); /* * Some TI KeyStone C667X devices do not support bus/hot reset. The PCIESS * automatically disables LTSSM when Secondary Bus Reset is received and * the device stops working. Prevent bus reset for these devices. With * this change, the device can be assigned to VMs with VFIO, but it will * leak state between VMs. Reference * https://e2e.ti.com/support/processors/f/791/t/954382 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TI, 0xb005, quirk_no_bus_reset); static void quirk_no_pm_reset(struct pci_dev *dev) { /* * We can't do a bus reset on root bus devices, but an ineffective * PM reset may be better than nothing. */ if (!pci_is_root_bus(dev->bus)) dev->dev_flags |= PCI_DEV_FLAGS_NO_PM_RESET; } /* * Some AMD/ATI GPUs (HD8570 - Oland) report that a D3hot->D0 transition * causes a reset (i.e., they advertise NoSoftRst-). This transition seems * to have no effect on the device: it retains the framebuffer contents and * monitor sync. Advertising this support makes other layers, like VFIO, * assume pci_reset_function() is viable for this device. Mark it as * unavailable to skip it when testing reset methods. */ DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_ATI, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, 8, quirk_no_pm_reset); /* * Spectrum-{1,2,3,4} devices report that a D3hot->D0 transition causes a * reset (i.e., they advertise NoSoftRst-). However, this transition does * not have any effect on the device: it continues to be operational and * network ports remain up. Advertising this support makes it seem as if a * PM reset is viable for these devices. Mark it as unavailable to skip it * when testing reset methods.
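* (NoSoftRst- refers to the No_Soft_Reset bit (PCI_PM_CTRL_NO_SOFT_RESET) * in the PM Control/Status register being clear, i.e. the device claims * that a D3hot->D0 transition performs an internal reset.)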
*/ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0xcb84, quirk_no_pm_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0xcf6c, quirk_no_pm_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0xcf70, quirk_no_pm_reset); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0xcf80, quirk_no_pm_reset); /* * Thunderbolt controllers with broken MSI hotplug signaling: * Entire 1st generation (Light Ridge, Eagle Ridge, Light Peak) and part * of the 2nd generation (Cactus Ridge 4C up to revision 1, Port Ridge). */ static void quirk_thunderbolt_hotplug_msi(struct pci_dev *pdev) { if (pdev->is_hotplug_bridge && (pdev->device != PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C || pdev->revision <= 1)) pdev->no_msi = 1; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_RIDGE, quirk_thunderbolt_hotplug_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_EAGLE_RIDGE, quirk_thunderbolt_hotplug_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_PEAK, quirk_thunderbolt_hotplug_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, quirk_thunderbolt_hotplug_msi); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE, quirk_thunderbolt_hotplug_msi); #ifdef CONFIG_ACPI /* * Apple: Shut down the Cactus Ridge Thunderbolt controller. * * On Apple hardware the Cactus Ridge Thunderbolt controller needs to be * shut down before suspend. Otherwise the native host interface (NHI) will * not be present after resume if a device was plugged in before suspend. * * The Thunderbolt controller consists of a PCIe switch with downstream * bridges leading to the NHI and to the tunnel PCI bridges. * * This quirk cuts power to the whole chip. Therefore we have to apply it * during suspend_noirq of the upstream bridge. * * Power is automagically restored before resume. No action is needed. */ static void quirk_apple_poweroff_thunderbolt(struct pci_dev *dev) { acpi_handle bridge, SXIO, SXFP, SXLV; if (!x86_apple_machine) return; if (pci_pcie_type(dev) != PCI_EXP_TYPE_UPSTREAM) return; /* * SXIO/SXFP/SXLV turn off power to the Thunderbolt controller. * We don't know how to turn it back on again, but firmware does, * so we can only use SXIO/SXFP/SXLV if we're suspending via * firmware. */ if (!pm_suspend_via_firmware()) return; bridge = ACPI_HANDLE(&dev->dev); if (!bridge) return; /* * SXIO and SXLV are present only on machines requiring this quirk. * Thunderbolt bridges in external devices might have the same * device ID as those on the host, but they will not have the * associated ACPI methods. This implicitly checks that we are at * the right bridge. */ if (ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXIO", &SXIO)) || ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXFP", &SXFP)) || ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXLV", &SXLV))) return; pci_info(dev, "quirk: cutting power to Thunderbolt controller...\n"); /* magic sequence */ acpi_execute_simple_method(SXIO, NULL, 1); acpi_execute_simple_method(SXFP, NULL, 0); msleep(300); acpi_execute_simple_method(SXLV, NULL, 0); acpi_execute_simple_method(SXIO, NULL, 0); acpi_execute_simple_method(SXLV, NULL, 0); } DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, quirk_apple_poweroff_thunderbolt); #endif /* * Following are device-specific reset methods which can be used to * reset a single function if other methods (e.g. FLR, PM D0->D3) are * not available.
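* * Each method follows the same convention: called with probe == true it * only reports whether the reset can be used (0 if so, -ENOTTY if not) * and must not disturb the device; called with probe == false it performs * the actual reset.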
*/ static int reset_intel_82599_sfp_virtfn(struct pci_dev *dev, bool probe) { /* * http://www.intel.com/content/dam/doc/datasheet/82599-10-gbe-controller-datasheet.pdf * * The 82599 supports FLR on VFs, but FLR support is reported only * in the PF DEVCAP (sec 9.3.10.4), not in the VF DEVCAP (sec 9.5). * Thus we must call pcie_flr() directly without first checking if it * is supported. */ if (!probe) pcie_flr(dev); return 0; } #define SOUTH_CHICKEN2 0xc2004 #define PCH_PP_STATUS 0xc7200 #define PCH_PP_CONTROL 0xc7204 #define MSG_CTL 0x45010 #define NSDE_PWR_STATE 0xd0100 #define IGD_OPERATION_TIMEOUT 10000 /* set timeout 10 seconds */ static int reset_ivb_igd(struct pci_dev *dev, bool probe) { void __iomem *mmio_base; unsigned long timeout; u32 val; if (probe) return 0; mmio_base = pci_iomap(dev, 0, 0); if (!mmio_base) return -ENOMEM; iowrite32(0x00000002, mmio_base + MSG_CTL); /* * Clobbering the SOUTH_CHICKEN2 register is fine only if the next * driver loaded sets the right bits. However, this is a reset and * the bits were previously set by i915, so we clobber the * SOUTH_CHICKEN2 register directly here. */ iowrite32(0x00000005, mmio_base + SOUTH_CHICKEN2); val = ioread32(mmio_base + PCH_PP_CONTROL) & 0xfffffffe; iowrite32(val, mmio_base + PCH_PP_CONTROL); timeout = jiffies + msecs_to_jiffies(IGD_OPERATION_TIMEOUT); do { val = ioread32(mmio_base + PCH_PP_STATUS); if ((val & 0xb0000000) == 0) goto reset_complete; msleep(10); } while (time_before(jiffies, timeout)); pci_warn(dev, "timeout during reset\n"); reset_complete: iowrite32(0x00000002, mmio_base + NSDE_PWR_STATE); pci_iounmap(dev, mmio_base); return 0; } /* Device-specific reset method for Chelsio T4-based adapters */ static int reset_chelsio_generic_dev(struct pci_dev *dev, bool probe) { u16 old_command; u16 msix_flags; /* * If this isn't a Chelsio T4-based device, return -ENOTTY indicating * that we have no device-specific reset method. */ if ((dev->device & 0xf000) != 0x4000) return -ENOTTY; /* * If this is the "probe" phase, return 0 indicating that we can * reset this device. */ if (probe) return 0; /* * T4 can wedge if there are DMAs in flight within the chip and Bus * Master has been disabled. We need to have it on until the Function * Level Reset completes. (Bus Master is disabled in * pci_reset_function()). */ pci_read_config_word(dev, PCI_COMMAND, &old_command); pci_write_config_word(dev, PCI_COMMAND, old_command | PCI_COMMAND_MASTER); /* * Perform the actual device function reset, saving and restoring * configuration information around the reset. */ pci_save_state(dev); /* * T4 also suffers a Head-Of-Line blocking problem if MSI-X interrupts * are disabled when an MSI-X interrupt message needs to be delivered. * So we briefly re-enable MSI-X interrupts for the duration of the * FLR. The pci_restore_state() below will restore the original * MSI-X state. */ pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &msix_flags); if ((msix_flags & PCI_MSIX_FLAGS_ENABLE) == 0) pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, msix_flags | PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL); pcie_flr(dev); /* * Restore the configuration information (BAR values, etc.) including * the original PCI Configuration Space Command word, and return * success.
*/ pci_restore_state(dev); pci_write_config_word(dev, PCI_COMMAND, old_command); return 0; } #define PCI_DEVICE_ID_INTEL_82599_SFP_VF 0x10ed #define PCI_DEVICE_ID_INTEL_IVB_M_VGA 0x0156 #define PCI_DEVICE_ID_INTEL_IVB_M2_VGA 0x0166 /* * The Samsung SM961/PM961 controller can sometimes enter a fatal state after * FLR where config space reads from the device return -1. We seem to be * able to avoid this condition if we disable the NVMe controller prior to * FLR. This quirk is generic for any NVMe class device requiring similar * assistance to quiesce the device prior to FLR. * * NVMe specification: https://nvmexpress.org/resources/specifications/ * Revision 1.0e: * Chapter 2: Required and optional PCI config registers * Chapter 3: NVMe control registers * Chapter 7.3: Reset behavior */ static int nvme_disable_and_flr(struct pci_dev *dev, bool probe) { void __iomem *bar; u16 cmd; u32 cfg; if (dev->class != PCI_CLASS_STORAGE_EXPRESS || pcie_reset_flr(dev, PCI_RESET_PROBE) || !pci_resource_start(dev, 0)) return -ENOTTY; if (probe) return 0; bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg)); if (!bar) return -ENOTTY; pci_read_config_word(dev, PCI_COMMAND, &cmd); pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY); cfg = readl(bar + NVME_REG_CC); /* Disable controller if enabled */ if (cfg & NVME_CC_ENABLE) { u32 cap = readl(bar + NVME_REG_CAP); unsigned long timeout; /* * Per nvme_disable_ctrl() skip shutdown notification as it * could complete commands to the admin queue. We only intend * to quiesce the device before reset. */ cfg &= ~(NVME_CC_SHN_MASK | NVME_CC_ENABLE); writel(cfg, bar + NVME_REG_CC); /* * Some controllers require an additional delay here, see * NVME_QUIRK_DELAY_BEFORE_CHK_RDY. None of those are yet * supported by this quirk. */ /* Cap register provides max timeout in 500ms increments */ timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies; for (;;) { u32 status = readl(bar + NVME_REG_CSTS); /* Ready status becomes zero on disable complete */ if (!(status & NVME_CSTS_RDY)) break; msleep(100); if (time_after(jiffies, timeout)) { pci_warn(dev, "Timeout waiting for NVMe ready status to clear after disable\n"); break; } } } pci_iounmap(dev, bar); pcie_flr(dev); return 0; } /* * Some NVMe controllers such as Intel DC P3700 and Solidigm P44 Pro will * timeout waiting for ready status to change after NVMe enable if the driver * starts interacting with the device too soon after FLR. A 250ms delay after * FLR has heuristically proven to produce reliably working results for device * assignment cases. 
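* (The delay below is applied after pcie_reset_flr() returns, i.e. after * the FLR completion wait, and before control returns to the caller that * will restore config space and re-enable the device.)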
*/ static int delay_250ms_after_flr(struct pci_dev *dev, bool probe) { if (probe) return pcie_reset_flr(dev, PCI_RESET_PROBE); pcie_reset_flr(dev, PCI_RESET_DO_RESET); msleep(250); return 0; } #define PCI_DEVICE_ID_HINIC_VF 0x375E #define HINIC_VF_FLR_TYPE 0x1000 #define HINIC_VF_FLR_CAP_BIT (1UL << 30) #define HINIC_VF_OP 0xE80 #define HINIC_VF_FLR_PROC_BIT (1UL << 18) #define HINIC_OPERATION_TIMEOUT 15000 /* 15 seconds */ /* Device-specific reset method for Huawei Intelligent NIC virtual functions */ static int reset_hinic_vf_dev(struct pci_dev *pdev, bool probe) { unsigned long timeout; void __iomem *bar; u32 val; if (probe) return 0; bar = pci_iomap(pdev, 0, 0); if (!bar) return -ENOTTY; /* Get and check firmware capabilities */ val = ioread32be(bar + HINIC_VF_FLR_TYPE); if (!(val & HINIC_VF_FLR_CAP_BIT)) { pci_iounmap(pdev, bar); return -ENOTTY; } /* Set HINIC_VF_FLR_PROC_BIT for the start of FLR */ val = ioread32be(bar + HINIC_VF_OP); val = val | HINIC_VF_FLR_PROC_BIT; iowrite32be(val, bar + HINIC_VF_OP); pcie_flr(pdev); /* * The device must recapture its Bus and Device Numbers after FLR * in order to generate Completions. Issue a config write to let the * device capture this information. */ pci_write_config_word(pdev, PCI_VENDOR_ID, 0); /* Firmware clears HINIC_VF_FLR_PROC_BIT when reset is complete */ timeout = jiffies + msecs_to_jiffies(HINIC_OPERATION_TIMEOUT); do { val = ioread32be(bar + HINIC_VF_OP); if (!(val & HINIC_VF_FLR_PROC_BIT)) goto reset_complete; msleep(20); } while (time_before(jiffies, timeout)); val = ioread32be(bar + HINIC_VF_OP); if (!(val & HINIC_VF_FLR_PROC_BIT)) goto reset_complete; pci_warn(pdev, "Reset dev timeout, FLR ack reg: %#010x\n", val); reset_complete: pci_iounmap(pdev, bar); return 0; } static const struct pci_dev_reset_methods pci_dev_reset_methods[] = { { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF, reset_intel_82599_sfp_virtfn }, { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M_VGA, reset_ivb_igd }, { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA, reset_ivb_igd }, { PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr }, { PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr }, { PCI_VENDOR_ID_INTEL, 0x0a54, delay_250ms_after_flr }, { PCI_VENDOR_ID_SOLIDIGM, 0xf1ac, delay_250ms_after_flr }, { PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID, reset_chelsio_generic_dev }, { PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HINIC_VF, reset_hinic_vf_dev }, { 0 } }; /* * These device-specific reset methods are here rather than in a driver * because when a host assigns a device to a guest VM, the host may need * to reset the device but probably doesn't have a driver for it. */ int pci_dev_specific_reset(struct pci_dev *dev, bool probe) { const struct pci_dev_reset_methods *i; for (i = pci_dev_reset_methods; i->reset; i++) { if ((i->vendor == dev->vendor || i->vendor == (u16)PCI_ANY_ID) && (i->device == dev->device || i->device == (u16)PCI_ANY_ID)) return i->reset(dev, probe); } return -ENOTTY; } static void quirk_dma_func0_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 0) pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1); } /* * https://bugzilla.redhat.com/show_bug.cgi?id=605888 * * Some Ricoh devices use function 0 as the PCIe requester ID for DMA. */
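/* For example, a reader at 0000:05:00.1 gets devfn 00.0 on the same bus (0000:05:00.0) added as a DMA alias via PCI_DEVFN(PCI_SLOT(devfn), 0), so the IOMMU also accepts DMA tagged with the function-0 requester ID. */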
*/ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe832, quirk_dma_func0_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias); /* Some Glenfly chips use function 0 as the PCIe Requester ID for DMA */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_GLENFLY, 0x3d40, quirk_dma_func0_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_GLENFLY, 0x3d41, quirk_dma_func0_alias); static void quirk_dma_func1_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 1) pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1); } /* * Marvell 88SE9123 uses function 1 as the requester ID for DMA. In some * SKUs function 1 is present and is a legacy IDE controller, in other * SKUs this function is not present, making this a ghost requester. * https://bugzilla.kernel.org/show_bug.cgi?id=42679 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9120, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9123, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c136 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9125, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9128, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c14 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9130, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9170, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c47 + c57 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9172, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c59 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x917a, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c78 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9182, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c134 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9183, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c46 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x91a0, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c135 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9215, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c127 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9220, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645, quirk_dma_func1_alias); /* https://bugs.gentoo.org/show_bug.cgi?id=497630 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB388_ESD, quirk_dma_func1_alias); /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c117 */ DECLARE_PCI_FIXUP_HEADER(0x1c28, /* Lite-On */ 0x0122, /* Plextor M6E (Marvell 88SS9183)*/ quirk_dma_func1_alias); /* * Some devices DMA with the wrong devfn, not just the wrong function. * quirk_fixed_dma_alias() uses this table to create fixed aliases, where * the alias is "fixed" and independent of the device devfn. * * For example, the Adaptec 3405 is a PCIe card with an Intel 80333 I/O * processor. 
To software, this appears as a PCIe-to-PCI/X bridge with a * single device on the secondary bus. In reality, the single exposed * device at 0e.0 is the Address Translation Unit (ATU) of the controller * that provides a bridge to the internal bus of the I/O processor. The * controller supports private devices, which can be hidden from PCI config * space. In the case of the Adaptec 3405, a private device at 01.0 * appears to be the DMA engine, which therefore needs to become a DMA * alias for the device. */ static const struct pci_device_id fixed_dma_alias_tbl[] = { { PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x0285, PCI_VENDOR_ID_ADAPTEC2, 0x02bb), /* Adaptec 3405 */ .driver_data = PCI_DEVFN(1, 0) }, { PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x0285, PCI_VENDOR_ID_ADAPTEC2, 0x02bc), /* Adaptec 3805 */ .driver_data = PCI_DEVFN(1, 0) }, { 0 } }; static void quirk_fixed_dma_alias(struct pci_dev *dev) { const struct pci_device_id *id; id = pci_match_id(fixed_dma_alias_tbl, dev); if (id) pci_add_dma_alias(dev, id->driver_data, 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias); /* * A few PCIe-to-PCI bridges fail to expose a PCIe capability, resulting in * using the wrong DMA alias for the device. Some of these devices can be * used as either forward or reverse bridges, so we need to test whether the * device is operating in the correct mode. We could probably apply this * quirk to PCI_ANY_ID, but for now we'll just use known offenders. The * test: if @pdev is a non-root, non-PCIe bridge whose upstream device is * PCIe and is not a PCIe-to-PCI bridge, then @pdev is actually operating * as a PCIe-to-PCI bridge. */ static void quirk_use_pcie_bridge_dma_alias(struct pci_dev *pdev) { if (!pci_is_root_bus(pdev->bus) && pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE && !pci_is_pcie(pdev) && pci_is_pcie(pdev->bus->self) && pci_pcie_type(pdev->bus->self) != PCI_EXP_TYPE_PCI_BRIDGE) pdev->dev_flags |= PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS; } /* ASM1083/1085, https://bugzilla.kernel.org/show_bug.cgi?id=44881#c46 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_use_pcie_bridge_dma_alias); /* Tundra 8113, https://bugzilla.kernel.org/show_bug.cgi?id=44881#c43 */ DECLARE_PCI_FIXUP_HEADER(0x10e3, 0x8113, quirk_use_pcie_bridge_dma_alias); /* ITE 8892, https://bugzilla.kernel.org/show_bug.cgi?id=73551 */ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias); /* ITE 8893 has the same problem as the 8892 */ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8893, quirk_use_pcie_bridge_dma_alias); /* Intel 82801, https://bugzilla.kernel.org/show_bug.cgi?id=44881#c49 */ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias); /* * The MIC x200 NTB forwards PCIe traffic using multiple alien RIDs. They * have to be added as aliases to the DMA device in order to allow buffer * access when the IOMMU is enabled. The following devfns have to match the * RIT-LUT table programmed in the EEPROM. */ static void quirk_mic_x200_dma_alias(struct pci_dev *pdev) { pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1); pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1); pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias); /* * Intel Visual Compute Accelerator (VCA) is a family of PCIe add-in devices * exposing computational units via Non-Transparent Bridges (NTB, PEX 87xx).
* * Similarly to the MIC x200, we need to add DMA aliases to allow buffer * access when the IOMMU is enabled. These aliases allow computational unit * access to host memory and effectively place the whole VCA device in one * IOMMU group. * * All possible slot numbers (0x20) are used, since we are unable to tell * what slot is used on the other side. This quirk is intended for both * host and computational unit sides. The VCA devices have up to five * functions (four for DMA channels and one additional). */ static void quirk_pex_vca_alias(struct pci_dev *pdev) { const unsigned int num_pci_slots = 0x20; unsigned int slot; for (slot = 0; slot < num_pci_slots; slot++) pci_add_dma_alias(pdev, PCI_DEVFN(slot, 0x0), 5); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2954, quirk_pex_vca_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2955, quirk_pex_vca_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2956, quirk_pex_vca_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2958, quirk_pex_vca_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2959, quirk_pex_vca_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x295A, quirk_pex_vca_alias); /* * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 * are associated not with the root bus, but with a bridge below it. This * quirk avoids generating invalid DMA aliases. */ static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev) { pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT; } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000, quirk_bridge_cavm_thrx2_pcie_root); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084, quirk_bridge_cavm_thrx2_pcie_root); /* * Intersil/Techwell TW686[4589]-based video capture cards have an empty * (zero) class code. Fix it. */ static void quirk_tw686x_class(struct pci_dev *pdev) { u32 class = pdev->class; /* Use "Multimedia controller" class */ pdev->class = (PCI_CLASS_MULTIMEDIA_OTHER << 8) | 0x01; pci_info(pdev, "TW686x PCI class overridden (%#08x -> %#08x)\n", class, pdev->class); } DECLARE_PCI_FIXUP_CLASS_EARLY(0x1797, 0x6864, PCI_CLASS_NOT_DEFINED, 8, quirk_tw686x_class); DECLARE_PCI_FIXUP_CLASS_EARLY(0x1797, 0x6865, PCI_CLASS_NOT_DEFINED, 8, quirk_tw686x_class); DECLARE_PCI_FIXUP_CLASS_EARLY(0x1797, 0x6868, PCI_CLASS_NOT_DEFINED, 8, quirk_tw686x_class); DECLARE_PCI_FIXUP_CLASS_EARLY(0x1797, 0x6869, PCI_CLASS_NOT_DEFINED, 8, quirk_tw686x_class); /* * Some devices have problems with Transaction Layer Packets with the * Relaxed Ordering Attribute set. Such devices should mark themselves and * other device drivers should check before sending TLPs with RO set. */ static void quirk_relaxedordering_disable(struct pci_dev *dev) { dev->dev_flags |= PCI_DEV_FLAGS_NO_RELAXED_ORDERING; pci_info(dev, "Disable Relaxed Ordering Attributes to avoid PCIe Completion erratum\n"); } /* * Intel Xeon processors based on Broadwell/Haswell microarchitecture Root * Complex have a Flow Control Credit issue which can cause performance * problems with Upstream Transaction Layer Packets with Relaxed Ordering * set. */
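/* Drivers can honor PCI_DEV_FLAGS_NO_RELAXED_ORDERING via pcie_relaxed_ordering_enabled(), which returns false if any device on the path to the Root Port has the flag set; cxgb4, for example, consults it before using Relaxed Ordering for DMA. */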
*/ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f01, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f02, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f03, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f04, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f05, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f06, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f07, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f08, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f09, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0a, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0b, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0c, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0d, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f0e, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f01, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f02, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f03, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f04, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f05, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f06, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f07, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f08, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f09, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0a, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0b, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0c, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0d, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x2f0e, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); /* * The AMD ARM A1100 (aka "SEATTLE") SoC has a bug in its PCIe Root Complex * where Upstream Transaction Layer Packets with the Relaxed Ordering * Attribute clear are allowed to bypass earlier TLPs with Relaxed Ordering * set. 
This is a violation of the PCIe 3.0 Transaction Ordering Rules * outlined in Section 2.4.1 (PCI Express(r) Base Specification Revision 3.0 * November 10, 2010). As a result, on this platform we can't use Relaxed * Ordering for Upstream TLPs. */ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a00, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a01, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AMD, 0x1a02, PCI_CLASS_NOT_DEFINED, 8, quirk_relaxedordering_disable); /* * Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same * values for the Attribute as were supplied in the header of the * corresponding Request, except as explicitly allowed when IDO is used." * * If a non-compliant device generates a completion with a different * attribute than the request, the receiver may accept it (which itself * seems non-compliant based on sec 2.3.2), or it may handle it as a * Malformed TLP or an Unexpected Completion, which will probably lead to a * device access timeout. * * If the non-compliant device generates completions with zero attributes * (instead of copying the attributes from the request), we can work around * this by disabling the "Relaxed Ordering" and "No Snoop" attributes in * upstream devices so they always generate requests with zero attributes. * * This affects other devices under the same Root Port, but since these * attributes are performance hints, there should be no functional problem. * * Note that Configuration Space accesses are never supposed to have TLP * Attributes, so we're safe waiting till after any Configuration Space * accesses to do the Root Port fixup. */ static void quirk_disable_root_port_attributes(struct pci_dev *pdev) { struct pci_dev *root_port = pcie_find_root_port(pdev); if (!root_port) { pci_warn(pdev, "PCIe Completion erratum may cause device errors\n"); return; } pci_info(root_port, "Disabling No Snoop/Relaxed Ordering Attributes to avoid PCIe Completion erratum in %s\n", dev_name(&pdev->dev)); pcie_capability_clear_word(root_port, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN | PCI_EXP_DEVCTL_NOSNOOP_EN); } /* * The Chelsio T5 chip fails to copy TLP Attributes from a Request to the * Completion it generates. */ static void quirk_chelsio_T5_disable_root_port_attributes(struct pci_dev *pdev) { /* * This mask/compare operation selects for Physical Function 4 on a * T5. We only need to fix up the Root Port once for any of the * PFs. PF[0..3] have PCI Device IDs of 0x50xx, but PF4 is uniquely * 0x54xx so we use that one. */ if ((pdev->device & 0xff00) == 0x5400) quirk_disable_root_port_attributes(pdev); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID, quirk_chelsio_T5_disable_root_port_attributes); /* * pci_acs_ctrl_enabled - compare desired ACS controls with those provided * by a device * @acs_ctrl_req: Bitmask of desired ACS controls * @acs_ctrl_ena: Bitmask of ACS controls enabled or provided implicitly by * the hardware design * * Return 1 if all ACS controls in the @acs_ctrl_req bitmask are included * in @acs_ctrl_ena, i.e., the device provides all the access controls the * caller desires. Return 0 otherwise. */ static int pci_acs_ctrl_enabled(u16 acs_ctrl_req, u16 acs_ctrl_ena) { if ((acs_ctrl_req & acs_ctrl_ena) == acs_ctrl_req) return 1; return 0; } /* * AMD has indicated that the devices below do not support peer-to-peer * in any system where they are found in the southbridge with an AMD * IOMMU in the system. 
Multifunction devices that do not support * peer-to-peer between functions can claim to support a subset of ACS. * Such devices effectively enable request redirect (RR) and completion * redirect (CR) since all transactions are redirected to the upstream * root complex. * * https://lore.kernel.org/r/201207111426.q6BEQTbh002928@mail.maya.org/ * https://lore.kernel.org/r/20120711165854.GM25282@amd.com/ * https://lore.kernel.org/r/20121005130857.GX4009@amd.com/ * * 1002:4385 SBx00 SMBus Controller * 1002:439c SB7x0/SB8x0/SB9x0 IDE Controller * 1002:4383 SBx00 Azalia (Intel HDA) * 1002:439d SB7x0/SB8x0/SB9x0 LPC host controller * 1002:4384 SBx00 PCI to PCI Bridge * 1002:4399 SB7x0/SB8x0/SB9x0 USB OHCI2 Controller * * https://bugzilla.kernel.org/show_bug.cgi?id=81841#c15 * * 1022:780f [AMD] FCH PCI Bridge * 1022:7809 [AMD] FCH USB OHCI Controller */ static int pci_quirk_amd_sb_acs(struct pci_dev *dev, u16 acs_flags) { #ifdef CONFIG_ACPI struct acpi_table_header *header = NULL; acpi_status status; /* Targeting multifunction devices on the SB (appears on root bus) */ if (!dev->multifunction || !pci_is_root_bus(dev->bus)) return -ENODEV; /* The IVRS table describes the AMD IOMMU */ status = acpi_get_table("IVRS", 0, &header); if (ACPI_FAILURE(status)) return -ENODEV; acpi_put_table(header); /* Filter out flags not applicable to multifunction */ acs_flags &= (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC | PCI_ACS_DT); return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_RR | PCI_ACS_CR); #else return -ENODEV; #endif } static bool pci_quirk_cavium_acs_match(struct pci_dev *dev) { if (!pci_is_pcie(dev) || pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) return false; switch (dev->device) { /* * Effectively selects all downstream ports for whole ThunderX1 * (which represents 8 SoCs). */ case 0xa000 ... 0xa7ff: /* ThunderX1 */ case 0xaf84: /* ThunderX2 */ case 0xb884: /* ThunderX3 */ return true; default: return false; } } static int pci_quirk_cavium_acs(struct pci_dev *dev, u16 acs_flags) { if (!pci_quirk_cavium_acs_match(dev)) return -ENOTTY; /* * Cavium Root Ports don't advertise an ACS capability. However, * the RTL internally implements similar protection as if ACS had * Source Validation, Request Redirection, Completion Redirection, * and Upstream Forwarding features enabled. Assert that the * hardware implements and enables equivalent ACS functionality for * these flags. */ return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } static int pci_quirk_xgene_acs(struct pci_dev *dev, u16 acs_flags) { /* * X-Gene Root Ports matching this quirk do not allow peer-to-peer * transactions with others, allowing masking out these bits as if they * were unimplemented in the ACS capability. */ return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } /* * Many Zhaoxin Root Ports and Switch Downstream Ports have no ACS capability. * But the implementation could block peer-to-peer transactions between them * and provide ACS-like functionality. */ static int pci_quirk_zhaoxin_pcie_ports_acs(struct pci_dev *dev, u16 acs_flags) { if (!pci_is_pcie(dev) || ((pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && (pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM))) return -ENOTTY; /* * Future Zhaoxin Root Ports and Switch Downstream Ports will * implement ACS capability in accordance with the PCIe Spec. */ switch (dev->device) { case 0x0710 ... 0x071e: case 0x0721: case 0x0723 ... 
0x0752: return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } return false; } /* * Many Intel PCH Root Ports do provide ACS-like features to disable peer * transactions and validate bus numbers in requests, but do not provide an * actual PCIe ACS capability. This is the list of device IDs known to fall * into that category as provided by Intel in Red Hat bugzilla 1037684. */ static const u16 pci_quirk_intel_pch_acs_ids[] = { /* Ibexpeak PCH */ 0x3b42, 0x3b43, 0x3b44, 0x3b45, 0x3b46, 0x3b47, 0x3b48, 0x3b49, 0x3b4a, 0x3b4b, 0x3b4c, 0x3b4d, 0x3b4e, 0x3b4f, 0x3b50, 0x3b51, /* Cougarpoint PCH */ 0x1c10, 0x1c11, 0x1c12, 0x1c13, 0x1c14, 0x1c15, 0x1c16, 0x1c17, 0x1c18, 0x1c19, 0x1c1a, 0x1c1b, 0x1c1c, 0x1c1d, 0x1c1e, 0x1c1f, /* Pantherpoint PCH */ 0x1e10, 0x1e11, 0x1e12, 0x1e13, 0x1e14, 0x1e15, 0x1e16, 0x1e17, 0x1e18, 0x1e19, 0x1e1a, 0x1e1b, 0x1e1c, 0x1e1d, 0x1e1e, 0x1e1f, /* Lynxpoint-H PCH */ 0x8c10, 0x8c11, 0x8c12, 0x8c13, 0x8c14, 0x8c15, 0x8c16, 0x8c17, 0x8c18, 0x8c19, 0x8c1a, 0x8c1b, 0x8c1c, 0x8c1d, 0x8c1e, 0x8c1f, /* Lynxpoint-LP PCH */ 0x9c10, 0x9c11, 0x9c12, 0x9c13, 0x9c14, 0x9c15, 0x9c16, 0x9c17, 0x9c18, 0x9c19, 0x9c1a, 0x9c1b, /* Wildcat PCH */ 0x9c90, 0x9c91, 0x9c92, 0x9c93, 0x9c94, 0x9c95, 0x9c96, 0x9c97, 0x9c98, 0x9c99, 0x9c9a, 0x9c9b, /* Patsburg (X79) PCH */ 0x1d10, 0x1d12, 0x1d14, 0x1d16, 0x1d18, 0x1d1a, 0x1d1c, 0x1d1e, /* Wellsburg (X99) PCH */ 0x8d10, 0x8d11, 0x8d12, 0x8d13, 0x8d14, 0x8d15, 0x8d16, 0x8d17, 0x8d18, 0x8d19, 0x8d1a, 0x8d1b, 0x8d1c, 0x8d1d, 0x8d1e, /* Lynx Point (9 series) PCH */ 0x8c90, 0x8c92, 0x8c94, 0x8c96, 0x8c98, 0x8c9a, 0x8c9c, 0x8c9e, }; static bool pci_quirk_intel_pch_acs_match(struct pci_dev *dev) { int i; /* Filter out a few obvious non-matches first */ if (!pci_is_pcie(dev) || pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) return false; for (i = 0; i < ARRAY_SIZE(pci_quirk_intel_pch_acs_ids); i++) if (pci_quirk_intel_pch_acs_ids[i] == dev->device) return true; return false; } static int pci_quirk_intel_pch_acs(struct pci_dev *dev, u16 acs_flags) { if (!pci_quirk_intel_pch_acs_match(dev)) return -ENOTTY; if (dev->dev_flags & PCI_DEV_FLAGS_ACS_ENABLED_QUIRK) return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); return pci_acs_ctrl_enabled(acs_flags, 0); } /* * These QCOM Root Ports do provide ACS-like features to disable peer * transactions and validate bus numbers in requests, but do not provide an * actual PCIe ACS capability. Hardware supports source validation but it * will report the issue as Completer Abort instead of ACS Violation. * Hardware doesn't support peer-to-peer and each Root Port is a Root * Complex with unique segment numbers. It is not possible for one Root * Port to pass traffic to another Root Port. All PCIe transactions are * terminated inside the Root Port. */ static int pci_quirk_qcom_rp_acs(struct pci_dev *dev, u16 acs_flags) { return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } /* * Each of these NXP Root Ports is in a Root Complex with a unique segment * number and does provide isolation features to disable peer transactions * and validate bus numbers in requests, but does not provide an ACS * capability. 
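 *
 * As with the other Root Port quirks here, the claim reduces to the
 * bitmask test in pci_acs_ctrl_enabled(); e.g. a caller asking for
 * (PCI_ACS_RR | PCI_ACS_CR) against the (PCI_ACS_SV | PCI_ACS_RR |
 * PCI_ACS_CR | PCI_ACS_UF) set provided below gets 1, while asking for
 * PCI_ACS_TB against the same set gets 0.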
*/ static int pci_quirk_nxp_rp_acs(struct pci_dev *dev, u16 acs_flags) { return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } static int pci_quirk_al_acs(struct pci_dev *dev, u16 acs_flags) { if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) return -ENOTTY; /* * Amazon's Annapurna Labs root ports don't include an ACS capability, * but do include ACS-like functionality. The hardware doesn't support * peer-to-peer transactions via the root port and each has a unique * segment number. * * Additionally, the root ports cannot send traffic to each other. */ acs_flags &= ~(PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); return acs_flags ? 0 : 1; } /* * Sunrise Point PCH root ports implement ACS, but unfortunately as shown in * the datasheet (Intel 100 Series Chipset Family PCH Datasheet, Vol. 2, * 12.1.46, 12.1.47)[1] this chipset uses dwords for the ACS capability and * control registers whereas the PCIe spec packs them into words (Rev 3.0, * 7.16 ACS Extended Capability). The bit definitions are correct, but the * control register is at offset 8 instead of 6 and we should probably use * dword accesses to them. This applies to the following PCI Device IDs, as * found in volume 1 of the datasheet[2]: * * 0xa110-0xa11f Sunrise Point-H PCI Express Root Port #{0-16} * 0xa167-0xa16a Sunrise Point-H PCI Express Root Port #{17-20} * * N.B. This doesn't fix what lspci shows. * * The 100 series chipset specification update includes this as errata #23[3]. * * The 200 series chipset (Union Point) has the same bug according to the * specification update (Intel 200 Series Chipset Family Platform Controller * Hub, Specification Update, January 2017, Revision 001, Document# 335194-001, * Errata 22)[4]. Per the datasheet[5], root port PCI Device IDs for this * chipset include: * * 0xa290-0xa29f PCI Express Root port #{0-16} * 0xa2e7-0xa2ee PCI Express Root port #{17-24} * * Mobile chipsets are also affected, 7th & 8th Generation * Specification update confirms ACS errata 22, status no fix: (7th Generation * Intel Processor Family I/O for U/Y Platforms and 8th Generation Intel * Processor Family I/O for U Quad Core Platforms Specification Update, * August 2017, Revision 002, Document#: 334660-002)[6] * Device IDs from I/O datasheet: (7th Generation Intel Processor Family I/O * for U/Y Platforms and 8th Generation Intel ® Processor Family I/O for U * Quad Core Platforms, Vol 1 of 2, August 2017, Document#: 334658-003)[7] * * 0x9d10-0x9d1b PCI Express Root port #{1-12} * * [1] https://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-2.html * [2] https://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-1.html * [3] https://www.intel.com/content/www/us/en/chipsets/100-series-chipset-spec-update.html * [4] https://www.intel.com/content/www/us/en/chipsets/200-series-chipset-pch-spec-update.html * [5] https://www.intel.com/content/www/us/en/chipsets/200-series-chipset-pch-datasheet-vol-1.html * [6] https://www.intel.com/content/www/us/en/processors/core/7th-gen-core-family-mobile-u-y-processor-lines-i-o-spec-update.html * [7] https://www.intel.com/content/www/us/en/processors/core/7th-gen-core-family-mobile-u-y-processor-lines-i-o-datasheet-vol-1.html */ static bool pci_quirk_intel_spt_pch_acs_match(struct pci_dev *dev) { if (!pci_is_pcie(dev) || pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) return false; switch (dev->device) { case 0xa110 ... 0xa11f: case 0xa167 ... 0xa16a: /* Sunrise Point */ case 0xa290 ... 
0xa29f: case 0xa2e7 ... 0xa2ee: /* Union Point */ case 0x9d10 ... 0x9d1b: /* 7th & 8th Gen Mobile */ return true; } return false; } #define INTEL_SPT_ACS_CTRL (PCI_ACS_CAP + 4) static int pci_quirk_intel_spt_pch_acs(struct pci_dev *dev, u16 acs_flags) { int pos; u32 cap, ctrl; if (!pci_quirk_intel_spt_pch_acs_match(dev)) return -ENOTTY; pos = dev->acs_cap; if (!pos) return -ENOTTY; /* see pci_acs_flags_enabled() */ pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap); acs_flags &= (cap | PCI_ACS_EC); pci_read_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, &ctrl); return pci_acs_ctrl_enabled(acs_flags, ctrl); } static int pci_quirk_mf_endpoint_acs(struct pci_dev *dev, u16 acs_flags) { /* * SV, TB, and UF are not relevant to multifunction endpoints. * * Multifunction devices are only required to implement RR, CR, and DT * in their ACS capability if they support peer-to-peer transactions. * Devices matching this quirk have been verified by the vendor to not * perform peer-to-peer with other functions, allowing us to mask out * these bits as if they were unimplemented in the ACS capability. */ return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_DT); } static int pci_quirk_rciep_acs(struct pci_dev *dev, u16 acs_flags) { /* * Intel RCiEPs are required to allow p2p only on translated * addresses. Refer to Intel VT-d specification, r3.1, sec 3.16, * "Root-Complex Peer to Peer Considerations". */ if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_END) return -ENOTTY; return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } static int pci_quirk_brcm_acs(struct pci_dev *dev, u16 acs_flags) { /* * iProc PAXB Root Ports don't advertise an ACS capability, but * they do not allow peer-to-peer transactions between Root Ports. * Allow each Root Port to be in a separate IOMMU group by masking * SV/RR/CR/UF bits. */ return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } /* * Wangxun 40G/25G/10G/1G NICs have no ACS capability, but on * multi-function devices, the hardware isolates the functions by * directing all peer-to-peer traffic upstream as though PCI_ACS_RR and * PCI_ACS_CR were set. * SFxxx 1G NICs (em). * RP1000/RP2000 10G NICs (sp). * FF5xxx 40G/25G/10G NICs (aml). */ static int pci_quirk_wangxun_nic_acs(struct pci_dev *dev, u16 acs_flags) { switch (dev->device) { case 0x0100 ...
0x010F: /* EM */ case 0x1001: case 0x2001: /* SP */ case 0x5010: case 0x5025: case 0x5040: /* AML */ case 0x5110: case 0x5125: case 0x5140: /* AML */ return pci_acs_ctrl_enabled(acs_flags, PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); } return false; } static const struct pci_dev_acs_enabled { u16 vendor; u16 device; int (*acs_enabled)(struct pci_dev *dev, u16 acs_flags); } pci_dev_acs_enabled[] = { { PCI_VENDOR_ID_ATI, 0x4385, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_ATI, 0x439c, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_ATI, 0x4383, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_ATI, 0x439d, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_ATI, 0x4384, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_ATI, 0x4399, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_AMD, 0x780f, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_AMD, 0x7809, pci_quirk_amd_sb_acs }, { PCI_VENDOR_ID_SOLARFLARE, 0x0903, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_SOLARFLARE, 0x0923, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_SOLARFLARE, 0x0A03, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10C6, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10DB, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10DD, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10E1, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10F1, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10F7, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10F8, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10F9, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10FA, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10FB, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10FC, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1507, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1514, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x151C, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1529, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x152A, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x154D, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x154F, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1551, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1558, pci_quirk_mf_endpoint_acs }, /* 82580 */ { PCI_VENDOR_ID_INTEL, 0x1509, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x150E, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x150F, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1510, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1511, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1516, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1527, pci_quirk_mf_endpoint_acs }, /* 82576 */ { PCI_VENDOR_ID_INTEL, 0x10C9, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10E6, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10E7, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10E8, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x150A, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x150D, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1518, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1526, pci_quirk_mf_endpoint_acs }, /* 82575 */ { PCI_VENDOR_ID_INTEL, 0x10A7, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10A9, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10D6, pci_quirk_mf_endpoint_acs }, /* I350 */ { PCI_VENDOR_ID_INTEL, 0x1521, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1522, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1523, pci_quirk_mf_endpoint_acs }, { 
PCI_VENDOR_ID_INTEL, 0x1524, pci_quirk_mf_endpoint_acs }, /* 82571 (Quads omitted due to non-ACS switch) */ { PCI_VENDOR_ID_INTEL, 0x105E, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x105F, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1060, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x10D9, pci_quirk_mf_endpoint_acs }, /* I219 */ { PCI_VENDOR_ID_INTEL, 0x15b7, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x15b8, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_rciep_acs }, /* QCOM QDF2xxx root ports */ { PCI_VENDOR_ID_QCOM, 0x0400, pci_quirk_qcom_rp_acs }, { PCI_VENDOR_ID_QCOM, 0x0401, pci_quirk_qcom_rp_acs }, /* QCOM SA8775P root port */ { PCI_VENDOR_ID_QCOM, 0x0115, pci_quirk_qcom_rp_acs }, /* HXT SD4800 root ports. The ACS design is same as QCOM QDF2xxx */ { PCI_VENDOR_ID_HXT, 0x0401, pci_quirk_qcom_rp_acs }, /* Intel PCH root ports */ { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_pch_acs }, { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_spt_pch_acs }, { 0x19a2, 0x710, pci_quirk_mf_endpoint_acs }, /* Emulex BE3-R */ { 0x10df, 0x720, pci_quirk_mf_endpoint_acs }, /* Emulex Skyhawk-R */ /* Cavium ThunderX */ { PCI_VENDOR_ID_CAVIUM, PCI_ANY_ID, pci_quirk_cavium_acs }, /* Cavium multi-function devices */ { PCI_VENDOR_ID_CAVIUM, 0xA026, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_CAVIUM, 0xA059, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_CAVIUM, 0xA060, pci_quirk_mf_endpoint_acs }, /* APM X-Gene */ { PCI_VENDOR_ID_AMCC, 0xE004, pci_quirk_xgene_acs }, /* Ampere Computing */ { PCI_VENDOR_ID_AMPERE, 0xE005, pci_quirk_xgene_acs }, { PCI_VENDOR_ID_AMPERE, 0xE006, pci_quirk_xgene_acs }, { PCI_VENDOR_ID_AMPERE, 0xE007, pci_quirk_xgene_acs }, { PCI_VENDOR_ID_AMPERE, 0xE008, pci_quirk_xgene_acs }, { PCI_VENDOR_ID_AMPERE, 0xE009, pci_quirk_xgene_acs }, { PCI_VENDOR_ID_AMPERE, 0xE00A, pci_quirk_xgene_acs }, { PCI_VENDOR_ID_AMPERE, 0xE00B, pci_quirk_xgene_acs }, { PCI_VENDOR_ID_AMPERE, 0xE00C, pci_quirk_xgene_acs }, /* Broadcom multi-function device */ { PCI_VENDOR_ID_BROADCOM, 0x16D7, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs }, /* Amazon Annapurna Labs */ { PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, pci_quirk_al_acs }, /* Zhaoxin multi-function devices */ { PCI_VENDOR_ID_ZHAOXIN, 0x3038, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_ZHAOXIN, 0x3104, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_ZHAOXIN, 0x9083, pci_quirk_mf_endpoint_acs }, /* NXP root ports, xx=16, 12, or 08 cores */ /* LX2xx0A : without security features + CAN-FD */ { PCI_VENDOR_ID_NXP, 0x8d81, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8da1, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d83, pci_quirk_nxp_rp_acs }, /* LX2xx0C : security features + CAN-FD */ { PCI_VENDOR_ID_NXP, 0x8d80, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8da0, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d82, pci_quirk_nxp_rp_acs }, /* LX2xx0E : security features + CAN */ { PCI_VENDOR_ID_NXP, 0x8d90, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8db0, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d92, 
pci_quirk_nxp_rp_acs }, /* LX2xx0N : without security features + CAN */ { PCI_VENDOR_ID_NXP, 0x8d91, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8db1, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d93, pci_quirk_nxp_rp_acs }, /* LX2xx2A : without security features + CAN-FD */ { PCI_VENDOR_ID_NXP, 0x8d89, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8da9, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d8b, pci_quirk_nxp_rp_acs }, /* LX2xx2C : security features + CAN-FD */ { PCI_VENDOR_ID_NXP, 0x8d88, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8da8, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d8a, pci_quirk_nxp_rp_acs }, /* LX2xx2E : security features + CAN */ { PCI_VENDOR_ID_NXP, 0x8d98, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8db8, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d9a, pci_quirk_nxp_rp_acs }, /* LX2xx2N : without security features + CAN */ { PCI_VENDOR_ID_NXP, 0x8d99, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8db9, pci_quirk_nxp_rp_acs }, { PCI_VENDOR_ID_NXP, 0x8d9b, pci_quirk_nxp_rp_acs }, /* Zhaoxin Root/Downstream Ports */ { PCI_VENDOR_ID_ZHAOXIN, PCI_ANY_ID, pci_quirk_zhaoxin_pcie_ports_acs }, /* Wangxun nics */ { PCI_VENDOR_ID_WANGXUN, PCI_ANY_ID, pci_quirk_wangxun_nic_acs }, { 0 } }; /* * pci_dev_specific_acs_enabled - check whether device provides ACS controls * @dev: PCI device * @acs_flags: Bitmask of desired ACS controls * * Returns: * -ENOTTY: No quirk applies to this device; we can't tell whether the * device provides the desired controls * 0: Device does not provide all the desired controls * >0: Device provides all the controls in @acs_flags */ int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags) { const struct pci_dev_acs_enabled *i; int ret; /* * Allow devices that do not expose standard PCIe ACS capabilities * or control to indicate their support here. Multi-function express * devices which do not allow internal peer-to-peer between functions, * but do not implement PCIe ACS may wish to return true here. */ for (i = pci_dev_acs_enabled; i->acs_enabled; i++) { if ((i->vendor == dev->vendor || i->vendor == (u16)PCI_ANY_ID) && (i->device == dev->device || i->device == (u16)PCI_ANY_ID)) { ret = i->acs_enabled(dev, acs_flags); if (ret >= 0) return ret; } } return -ENOTTY; } /* Config space offset of Root Complex Base Address register */ #define INTEL_LPC_RCBA_REG 0xf0 /* 31:14 RCBA address */ #define INTEL_LPC_RCBA_MASK 0xffffc000 /* RCBA Enable */ #define INTEL_LPC_RCBA_ENABLE (1 << 0) /* Backbone Scratch Pad Register */ #define INTEL_BSPR_REG 0x1104 /* Backbone Peer Non-Posted Disable */ #define INTEL_BSPR_REG_BPNPD (1 << 8) /* Backbone Peer Posted Disable */ #define INTEL_BSPR_REG_BPPD (1 << 9) /* Upstream Peer Decode Configuration Register */ #define INTEL_UPDCR_REG 0x1014 /* 5:0 Peer Decode Enable bits */ #define INTEL_UPDCR_REG_MASK 0x3f static int pci_quirk_enable_intel_lpc_acs(struct pci_dev *dev) { u32 rcba, bspr, updcr; void __iomem *rcba_mem; /* * Read the RCBA register from the LPC (D31:F0). PCH root ports * are D28:F* and therefore get probed before LPC, thus we can't * use pci_get_slot()/pci_read_config_dword() here. */ pci_bus_read_config_dword(dev->bus, PCI_DEVFN(31, 0), INTEL_LPC_RCBA_REG, &rcba); if (!(rcba & INTEL_LPC_RCBA_ENABLE)) return -EINVAL; rcba_mem = ioremap(rcba & INTEL_LPC_RCBA_MASK, PAGE_ALIGN(INTEL_UPDCR_REG)); if (!rcba_mem) return -ENOMEM; /* * The BSPR can disallow peer cycles, but it's set by soft strap and * therefore read-only. 
If both posted and non-posted peer cycles are * disallowed, we're ok. If either is allowed, then we need to use * the UPDCR to disable peer decodes for each port. This provides the * PCIe ACS equivalent of PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF. */ bspr = readl(rcba_mem + INTEL_BSPR_REG); bspr &= INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD; if (bspr != (INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD)) { updcr = readl(rcba_mem + INTEL_UPDCR_REG); if (updcr & INTEL_UPDCR_REG_MASK) { pci_info(dev, "Disabling UPDCR peer decodes\n"); updcr &= ~INTEL_UPDCR_REG_MASK; writel(updcr, rcba_mem + INTEL_UPDCR_REG); } } iounmap(rcba_mem); return 0; } /* Miscellaneous Port Configuration register */ #define INTEL_MPC_REG 0xd8 /* MPC: Invalid Receive Bus Number Check Enable */ #define INTEL_MPC_REG_IRBNCE (1 << 26) static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev) { u32 mpc; /* * When enabled, the IRBNCE bit of the MPC register enables the * equivalent of PCI ACS Source Validation (PCI_ACS_SV), which * ensures that requester IDs fall within the bus number range * of the bridge. Enable if not already. */ pci_read_config_dword(dev, INTEL_MPC_REG, &mpc); if (!(mpc & INTEL_MPC_REG_IRBNCE)) { pci_info(dev, "Enabling MPC IRBNCE\n"); mpc |= INTEL_MPC_REG_IRBNCE; pci_write_config_dword(dev, INTEL_MPC_REG, mpc); } } /* * Currently this quirk does the equivalent of * PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF * * TODO: This quirk also needs to do the equivalent of PCI_ACS_TB, * if dev->external_facing || dev->untrusted */ static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev) { if (!pci_quirk_intel_pch_acs_match(dev)) return -ENOTTY; if (pci_quirk_enable_intel_lpc_acs(dev)) { pci_warn(dev, "Failed to enable Intel PCH ACS quirk\n"); return 0; } pci_quirk_enable_intel_rp_mpc_acs(dev); dev->dev_flags |= PCI_DEV_FLAGS_ACS_ENABLED_QUIRK; pci_info(dev, "Intel PCH root port ACS workaround enabled\n"); return 0; } static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev) { int pos; u32 cap, ctrl; if (!pci_quirk_intel_spt_pch_acs_match(dev)) return -ENOTTY; pos = dev->acs_cap; if (!pos) return -ENOTTY; pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap); pci_read_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, &ctrl); ctrl |= (cap & PCI_ACS_SV); ctrl |= (cap & PCI_ACS_RR); ctrl |= (cap & PCI_ACS_CR); ctrl |= (cap & PCI_ACS_UF); if (pci_ats_disabled() || dev->external_facing || dev->untrusted) ctrl |= (cap & PCI_ACS_TB); pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl); pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n"); return 0; } static int pci_quirk_disable_intel_spt_pch_acs_redir(struct pci_dev *dev) { int pos; u32 cap, ctrl; if (!pci_quirk_intel_spt_pch_acs_match(dev)) return -ENOTTY; pos = dev->acs_cap; if (!pos) return -ENOTTY; pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap); pci_read_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, &ctrl); ctrl &= ~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC); pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl); pci_info(dev, "Intel SPT PCH root port workaround: disabled ACS redirect\n"); return 0; } static const struct pci_dev_acs_ops { u16 vendor; u16 device; int (*enable_acs)(struct pci_dev *dev); int (*disable_acs_redir)(struct pci_dev *dev); } pci_dev_acs_ops[] = { { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, .enable_acs = pci_quirk_enable_intel_pch_acs, }, { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, .enable_acs = pci_quirk_enable_intel_spt_pch_acs, .disable_acs_redir = pci_quirk_disable_intel_spt_pch_acs_redir, }, }; int 
pci_dev_specific_enable_acs(struct pci_dev *dev) { const struct pci_dev_acs_ops *p; int i, ret; for (i = 0; i < ARRAY_SIZE(pci_dev_acs_ops); i++) { p = &pci_dev_acs_ops[i]; if ((p->vendor == dev->vendor || p->vendor == (u16)PCI_ANY_ID) && (p->device == dev->device || p->device == (u16)PCI_ANY_ID) && p->enable_acs) { ret = p->enable_acs(dev); if (ret >= 0) return ret; } } return -ENOTTY; } int pci_dev_specific_disable_acs_redir(struct pci_dev *dev) { const struct pci_dev_acs_ops *p; int i, ret; for (i = 0; i < ARRAY_SIZE(pci_dev_acs_ops); i++) { p = &pci_dev_acs_ops[i]; if ((p->vendor == dev->vendor || p->vendor == (u16)PCI_ANY_ID) && (p->device == dev->device || p->device == (u16)PCI_ANY_ID) && p->disable_acs_redir) { ret = p->disable_acs_redir(dev); if (ret >= 0) return ret; } } return -ENOTTY; } /* * The PCI capabilities list for Intel DH895xCC VFs (device ID 0x0443) with * QuickAssist Technology (QAT) is prematurely terminated in hardware. The * Next Capability pointer in the MSI Capability Structure should point to * the PCIe Capability Structure but is incorrectly hardwired as 0 terminating * the list. */ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev) { int pos, i = 0, ret; u8 next_cap; u16 reg16, *cap; struct pci_cap_saved_state *state; /* Bail if the hardware bug is fixed */ if (pdev->pcie_cap || pci_find_capability(pdev, PCI_CAP_ID_EXP)) return; /* Bail if MSI Capability Structure is not found for some reason */ pos = pci_find_capability(pdev, PCI_CAP_ID_MSI); if (!pos) return; /* * Bail if Next Capability pointer in the MSI Capability Structure * is not the expected incorrect 0x00. */ pci_read_config_byte(pdev, pos + 1, &next_cap); if (next_cap) return; /* * PCIe Capability Structure is expected to be at 0x50 and should * terminate the list (Next Capability pointer is 0x00). Verify * Capability Id and Next Capability pointer is as expected. * Open-code some of set_pcie_port_type() and pci_cfg_space_size_ext() * to correctly set kernel data structures which have already been * set incorrectly due to the hardware bug. 
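 *
 * As a worked example of the check below: a capability header is one byte
 * of Capability ID at offset 0 and one byte of Next Capability pointer at
 * offset 1, so a 16-bit read at 0x50 returning 0x0010 means Capability ID
 * 0x10 (PCI_CAP_ID_EXP) with a Next pointer of 0x00 -- a PCIe capability
 * that terminates the list, which is exactly the (0x0000 | PCI_CAP_ID_EXP)
 * pattern tested for.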
*/ pos = 0x50; pci_read_config_word(pdev, pos, &reg16); if (reg16 == (0x0000 | PCI_CAP_ID_EXP)) { u32 status; #ifndef PCI_EXP_SAVE_REGS #define PCI_EXP_SAVE_REGS 7 #endif int size = PCI_EXP_SAVE_REGS * sizeof(u16); pdev->pcie_cap = pos; pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16); pdev->pcie_flags_reg = reg16; pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16); pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD; pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE; ret = pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status); if ((ret != PCIBIOS_SUCCESSFUL) || (PCI_POSSIBLE_ERROR(status))) pdev->cfg_size = PCI_CFG_SPACE_SIZE; if (pci_find_saved_cap(pdev, PCI_CAP_ID_EXP)) return; /* Save PCIe cap */ state = kzalloc(sizeof(*state) + size, GFP_KERNEL); if (!state) return; state->cap.cap_nr = PCI_CAP_ID_EXP; state->cap.cap_extended = 0; state->cap.size = size; cap = (u16 *)&state->cap.data[0]; pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap[i++]); pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &cap[i++]); pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &cap[i++]); pcie_capability_read_word(pdev, PCI_EXP_RTCTL, &cap[i++]); pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &cap[i++]); pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &cap[i++]); pcie_capability_read_word(pdev, PCI_EXP_SLTCTL2, &cap[i++]); hlist_add_head(&state->next, &pdev->saved_cap_space); } } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap); /* * FLR may cause the following devices to hang: * * AMD Starship/Matisse HD Audio Controller 0x1487 * AMD Starship USB 3.0 Host Controller 0x148c * AMD Matisse USB 3.0 Host Controller 0x149c * Intel 82579LM Gigabit Ethernet Controller 0x1502 * Intel 82579V Gigabit Ethernet Controller 0x1503 * Mediatek MT7922 802.11ax PCI Express Wireless Network Adapter */ static void quirk_no_flr(struct pci_dev *dev) { dev->dev_flags |= PCI_DEV_FLAGS_NO_FLR_RESET; } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x1487, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x148c, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x149c, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x0616, quirk_no_flr); /* FLR may cause the SolidRun SNET DPU (rev 0x1) to hang */ static void quirk_no_flr_snet(struct pci_dev *dev) { if (dev->revision == 0x1) quirk_no_flr(dev); } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOLIDRUN, 0x1000, quirk_no_flr_snet); static void quirk_no_ext_tags(struct pci_dev *pdev) { struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus); if (!bridge) return; bridge->no_ext_tags = 1; pci_info(pdev, "disabling Extended Tags (this device can't handle them)\n"); pci_walk_bus(bridge->bus, pci_configure_extended_tags, NULL); } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_3WARE, 0x1004, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0132, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0140, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0141, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0142, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0144, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0420, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0422, 
quirk_no_ext_tags); #ifdef CONFIG_PCI_ATS static void quirk_no_ats(struct pci_dev *pdev) { pci_info(pdev, "disabling ATS\n"); pdev->ats_cap = 0; } /* * Some devices require additional driver setup to enable ATS. Don't use * ATS for those devices as ATS will be enabled before the driver has had a * chance to load and configure the device. */ static void quirk_amd_harvest_no_ats(struct pci_dev *pdev) { if (pdev->device == 0x15d8) { if (pdev->revision == 0xcf && pdev->subsystem_vendor == 0xea50 && (pdev->subsystem_device == 0xce19 || pdev->subsystem_device == 0xcc10 || pdev->subsystem_device == 0xcc08)) quirk_no_ats(pdev); } else { quirk_no_ats(pdev); } } /* AMD Stoney platform GPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_amd_harvest_no_ats); /* AMD Iceland dGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_amd_harvest_no_ats); /* AMD Navi10 dGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7310, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7312, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7318, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7319, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731a, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731b, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731e, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x731f, quirk_amd_harvest_no_ats); /* AMD Navi14 dGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7340, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7341, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7347, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x734f, quirk_amd_harvest_no_ats); /* AMD Raven platform iGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x15d8, quirk_amd_harvest_no_ats); /* * Intel IPU E2000 revisions before C0 implement incorrect endianness * in ATS Invalidate Request message body. Disable ATS for those devices. */ static void quirk_intel_e2000_no_ats(struct pci_dev *pdev) { if (pdev->revision < 0x20) quirk_no_ats(pdev); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1451, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1452, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1453, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1454, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1455, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1457, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1459, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x145a, quirk_intel_e2000_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x145c, quirk_intel_e2000_no_ats); #endif /* CONFIG_PCI_ATS */ /* Freescale PCIe doesn't support MSI in RC mode */ static void quirk_fsl_no_msi(struct pci_dev *pdev) { if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) pdev->no_msi = 1; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, quirk_fsl_no_msi); /* * Although not allowed by the spec, some multi-function devices have * dependencies of one function (consumer) on another (supplier). For the * consumer to work in D0, the supplier must also be in D0. Create a * device link from the consumer to the supplier to enforce this * dependency. 
Runtime PM is allowed by default on the consumer to prevent * it from permanently keeping the supplier awake. */ static void pci_create_device_link(struct pci_dev *pdev, unsigned int consumer, unsigned int supplier, unsigned int class, unsigned int class_shift) { struct pci_dev *supplier_pdev; if (PCI_FUNC(pdev->devfn) != consumer) return; supplier_pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), pdev->bus->number, PCI_DEVFN(PCI_SLOT(pdev->devfn), supplier)); if (!supplier_pdev || (supplier_pdev->class >> class_shift) != class) { pci_dev_put(supplier_pdev); return; } if (device_link_add(&pdev->dev, &supplier_pdev->dev, DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME)) pci_info(pdev, "D0 power state depends on %s\n", pci_name(supplier_pdev)); else pci_err(pdev, "Cannot enforce power dependency on %s\n", pci_name(supplier_pdev)); pm_runtime_allow(&pdev->dev); pci_dev_put(supplier_pdev); } /* * Create a device link for GPUs with an integrated HDA controller, which is * used for streaming audio to attached displays. */ static void quirk_gpu_hda(struct pci_dev *hda) { pci_create_device_link(hda, 1, 0, PCI_BASE_CLASS_DISPLAY, 16); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_ATI, PCI_ANY_ID, PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda); DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMD, PCI_ANY_ID, PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda); DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda); /* * Create a device link to VGA for GPUs with an integrated USB xHCI Host * controller. */ static void quirk_gpu_usb(struct pci_dev *usb) { pci_create_device_link(usb, 2, 0, PCI_BASE_CLASS_DISPLAY, 16); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_CLASS_SERIAL_USB, 8, quirk_gpu_usb); DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_ATI, PCI_ANY_ID, PCI_CLASS_SERIAL_USB, 8, quirk_gpu_usb); /* * Create a device link to VGA for GPUs with an integrated Type-C UCSI * controller. Currently there is no class code defined for a UCSI device * over PCI, so we use the UNKNOWN class for now; this will be updated when * UCSI over PCI gets a class code. */ #define PCI_CLASS_SERIAL_UNKNOWN 0x0c80 static void quirk_gpu_usb_typec_ucsi(struct pci_dev *ucsi) { pci_create_device_link(ucsi, 3, 0, PCI_BASE_CLASS_DISPLAY, 16); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_CLASS_SERIAL_UNKNOWN, 8, quirk_gpu_usb_typec_ucsi); DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_ATI, PCI_ANY_ID, PCI_CLASS_SERIAL_UNKNOWN, 8, quirk_gpu_usb_typec_ucsi); /* * Enable the NVIDIA GPU integrated HDA controller if the BIOS left it * disabled.
https://devtalk.nvidia.com/default/topic/1024022 */ static void quirk_nvidia_hda(struct pci_dev *gpu) { u8 hdr_type; u32 val; /* There was no integrated HDA controller before MCP89 */ if (gpu->device < PCI_DEVICE_ID_NVIDIA_GEFORCE_320M) return; /* Bit 25 at offset 0x488 enables the HDA controller */ pci_read_config_dword(gpu, 0x488, &val); if (val & BIT(25)) return; pci_info(gpu, "Enabling HDA controller\n"); pci_write_config_dword(gpu, 0x488, val | BIT(25)); /* The GPU becomes a multi-function device when the HDA is enabled */ pci_read_config_byte(gpu, PCI_HEADER_TYPE, &hdr_type); gpu->multifunction = FIELD_GET(PCI_HEADER_TYPE_MFD, hdr_type); } DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_BASE_CLASS_DISPLAY, 16, quirk_nvidia_hda); DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_BASE_CLASS_DISPLAY, 16, quirk_nvidia_hda); /* * Some IDT switches incorrectly flag an ACS Source Validation error on * completions for config read requests even though PCIe r4.0, sec * 6.12.1.1, says that completions are never affected by ACS Source * Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36: * * Item #36 - Downstream port applies ACS Source Validation to Completions * Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that * completions are never affected by ACS Source Validation. However, * completions received by a downstream port of the PCIe switch from a * device that has not yet captured a PCIe bus number are incorrectly * dropped by ACS Source Validation by the switch downstream port. * * The workaround suggested by IDT is to issue a config write to the * downstream device before issuing the first config read. This allows the * downstream device to capture its bus and device numbers (see PCIe r4.0, * sec 2.2.9), thus avoiding the ACS error on the completion. * * However, we don't know when the device is ready to accept the config * write, so we do config reads until we receive a non-Config Request Retry * Status, then do the config write. * * To avoid hitting the erratum when doing the config reads, we disable ACS * SV around this process. */ int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *l, int timeout) { int pos; u16 ctrl = 0; bool found; struct pci_dev *bridge = bus->self; pos = bridge->acs_cap; /* Disable ACS SV before initial config reads */ if (pos) { pci_read_config_word(bridge, pos + PCI_ACS_CTRL, &ctrl); if (ctrl & PCI_ACS_SV) pci_write_config_word(bridge, pos + PCI_ACS_CTRL, ctrl & ~PCI_ACS_SV); } found = pci_bus_generic_read_dev_vendor_id(bus, devfn, l, timeout); /* Write Vendor ID (read-only) so the endpoint latches its bus/dev */ if (found) pci_bus_write_config_word(bus, devfn, PCI_VENDOR_ID, 0); /* Re-enable ACS_SV if it was previously enabled */ if (ctrl & PCI_ACS_SV) pci_write_config_word(bridge, pos + PCI_ACS_CTRL, ctrl); return found; } /* * Microsemi Switchtec NTB uses devfn proxy IDs to move TLPs between * NT endpoints via the internal switch fabric. These IDs replace the * originating Requester ID TLPs which access host memory on peer NTB * ports. Therefore, all proxy IDs must be aliased to the NTB device * to permit access when the IOMMU is turned on. 
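 *
 * Each Requester ID table entry read below carries a proxy devfn in bits
 * [8:1]; e.g. a (hypothetical) rid_entry of 0x31 decodes as
 * devfn = (0x31 >> 1) & 0xFF = 0x18, i.e. 03.0, which is then handed to
 * pci_add_dma_alias().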
*/ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev) { void __iomem *mmio; struct ntb_info_regs __iomem *mmio_ntb; struct ntb_ctrl_regs __iomem *mmio_ctrl; u64 partition_map; u8 partition; int pp; if (pci_enable_device(pdev)) { pci_err(pdev, "Cannot enable Switchtec device\n"); return; } mmio = pci_iomap(pdev, 0, 0); if (mmio == NULL) { pci_disable_device(pdev); pci_err(pdev, "Cannot iomap Switchtec device\n"); return; } pci_info(pdev, "Setting Switchtec proxy ID aliases\n"); mmio_ntb = mmio + SWITCHTEC_GAS_NTB_OFFSET; mmio_ctrl = (void __iomem *) mmio_ntb + SWITCHTEC_NTB_REG_CTRL_OFFSET; partition = ioread8(&mmio_ntb->partition_id); partition_map = ioread32(&mmio_ntb->ep_map); partition_map |= ((u64) ioread32(&mmio_ntb->ep_map + 4)) << 32; partition_map &= ~(1ULL << partition); for (pp = 0; pp < (sizeof(partition_map) * 8); pp++) { struct ntb_ctrl_regs __iomem *mmio_peer_ctrl; u32 table_sz = 0; int te; if (!(partition_map & (1ULL << pp))) continue; pci_dbg(pdev, "Processing partition %d\n", pp); mmio_peer_ctrl = &mmio_ctrl[pp]; table_sz = ioread16(&mmio_peer_ctrl->req_id_table_size); if (!table_sz) { pci_warn(pdev, "Partition %d table_sz 0\n", pp); continue; } if (table_sz > 512) { pci_warn(pdev, "Invalid Switchtec partition %d table_sz %d\n", pp, table_sz); continue; } for (te = 0; te < table_sz; te++) { u32 rid_entry; u8 devfn; rid_entry = ioread32(&mmio_peer_ctrl->req_id_table[te]); devfn = (rid_entry >> 1) & 0xFF; pci_dbg(pdev, "Aliasing Partition %d Proxy ID %02x.%d\n", pp, PCI_SLOT(devfn), PCI_FUNC(devfn)); pci_add_dma_alias(pdev, devfn, 1); } } pci_iounmap(pdev, mmio); pci_disable_device(pdev); } #define SWITCHTEC_QUIRK(vid) \ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_MICROSEMI, vid, \ PCI_CLASS_BRIDGE_OTHER, 8, quirk_switchtec_ntb_dma_alias) SWITCHTEC_QUIRK(0x8531); /* PFX 24xG3 */ SWITCHTEC_QUIRK(0x8532); /* PFX 32xG3 */ SWITCHTEC_QUIRK(0x8533); /* PFX 48xG3 */ SWITCHTEC_QUIRK(0x8534); /* PFX 64xG3 */ SWITCHTEC_QUIRK(0x8535); /* PFX 80xG3 */ SWITCHTEC_QUIRK(0x8536); /* PFX 96xG3 */ SWITCHTEC_QUIRK(0x8541); /* PSX 24xG3 */ SWITCHTEC_QUIRK(0x8542); /* PSX 32xG3 */ SWITCHTEC_QUIRK(0x8543); /* PSX 48xG3 */ SWITCHTEC_QUIRK(0x8544); /* PSX 64xG3 */ SWITCHTEC_QUIRK(0x8545); /* PSX 80xG3 */ SWITCHTEC_QUIRK(0x8546); /* PSX 96xG3 */ SWITCHTEC_QUIRK(0x8551); /* PAX 24XG3 */ SWITCHTEC_QUIRK(0x8552); /* PAX 32XG3 */ SWITCHTEC_QUIRK(0x8553); /* PAX 48XG3 */ SWITCHTEC_QUIRK(0x8554); /* PAX 64XG3 */ SWITCHTEC_QUIRK(0x8555); /* PAX 80XG3 */ SWITCHTEC_QUIRK(0x8556); /* PAX 96XG3 */ SWITCHTEC_QUIRK(0x8561); /* PFXL 24XG3 */ SWITCHTEC_QUIRK(0x8562); /* PFXL 32XG3 */ SWITCHTEC_QUIRK(0x8563); /* PFXL 48XG3 */ SWITCHTEC_QUIRK(0x8564); /* PFXL 64XG3 */ SWITCHTEC_QUIRK(0x8565); /* PFXL 80XG3 */ SWITCHTEC_QUIRK(0x8566); /* PFXL 96XG3 */ SWITCHTEC_QUIRK(0x8571); /* PFXI 24XG3 */ SWITCHTEC_QUIRK(0x8572); /* PFXI 32XG3 */ SWITCHTEC_QUIRK(0x8573); /* PFXI 48XG3 */ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ SWITCHTEC_QUIRK(0x4000); /* PFX 100XG4 */ SWITCHTEC_QUIRK(0x4084); /* PFX 84XG4 */ SWITCHTEC_QUIRK(0x4068); /* PFX 68XG4 */ SWITCHTEC_QUIRK(0x4052); /* PFX 52XG4 */ SWITCHTEC_QUIRK(0x4036); /* PFX 36XG4 */ SWITCHTEC_QUIRK(0x4028); /* PFX 28XG4 */ SWITCHTEC_QUIRK(0x4100); /* PSX 100XG4 */ SWITCHTEC_QUIRK(0x4184); /* PSX 84XG4 */ SWITCHTEC_QUIRK(0x4168); /* PSX 68XG4 */ SWITCHTEC_QUIRK(0x4152); /* PSX 52XG4 */ SWITCHTEC_QUIRK(0x4136); /* PSX 36XG4 */ SWITCHTEC_QUIRK(0x4128); /* PSX 28XG4 */ 
SWITCHTEC_QUIRK(0x4200); /* PAX 100XG4 */ SWITCHTEC_QUIRK(0x4284); /* PAX 84XG4 */ SWITCHTEC_QUIRK(0x4268); /* PAX 68XG4 */ SWITCHTEC_QUIRK(0x4252); /* PAX 52XG4 */ SWITCHTEC_QUIRK(0x4236); /* PAX 36XG4 */ SWITCHTEC_QUIRK(0x4228); /* PAX 28XG4 */ SWITCHTEC_QUIRK(0x4352); /* PFXA 52XG4 */ SWITCHTEC_QUIRK(0x4336); /* PFXA 36XG4 */ SWITCHTEC_QUIRK(0x4328); /* PFXA 28XG4 */ SWITCHTEC_QUIRK(0x4452); /* PSXA 52XG4 */ SWITCHTEC_QUIRK(0x4436); /* PSXA 36XG4 */ SWITCHTEC_QUIRK(0x4428); /* PSXA 28XG4 */ SWITCHTEC_QUIRK(0x4552); /* PAXA 52XG4 */ SWITCHTEC_QUIRK(0x4536); /* PAXA 36XG4 */ SWITCHTEC_QUIRK(0x4528); /* PAXA 28XG4 */ SWITCHTEC_QUIRK(0x5000); /* PFX 100XG5 */ SWITCHTEC_QUIRK(0x5084); /* PFX 84XG5 */ SWITCHTEC_QUIRK(0x5068); /* PFX 68XG5 */ SWITCHTEC_QUIRK(0x5052); /* PFX 52XG5 */ SWITCHTEC_QUIRK(0x5036); /* PFX 36XG5 */ SWITCHTEC_QUIRK(0x5028); /* PFX 28XG5 */ SWITCHTEC_QUIRK(0x5100); /* PSX 100XG5 */ SWITCHTEC_QUIRK(0x5184); /* PSX 84XG5 */ SWITCHTEC_QUIRK(0x5168); /* PSX 68XG5 */ SWITCHTEC_QUIRK(0x5152); /* PSX 52XG5 */ SWITCHTEC_QUIRK(0x5136); /* PSX 36XG5 */ SWITCHTEC_QUIRK(0x5128); /* PSX 28XG5 */ SWITCHTEC_QUIRK(0x5200); /* PAX 100XG5 */ SWITCHTEC_QUIRK(0x5284); /* PAX 84XG5 */ SWITCHTEC_QUIRK(0x5268); /* PAX 68XG5 */ SWITCHTEC_QUIRK(0x5252); /* PAX 52XG5 */ SWITCHTEC_QUIRK(0x5236); /* PAX 36XG5 */ SWITCHTEC_QUIRK(0x5228); /* PAX 28XG5 */ SWITCHTEC_QUIRK(0x5300); /* PFXA 100XG5 */ SWITCHTEC_QUIRK(0x5384); /* PFXA 84XG5 */ SWITCHTEC_QUIRK(0x5368); /* PFXA 68XG5 */ SWITCHTEC_QUIRK(0x5352); /* PFXA 52XG5 */ SWITCHTEC_QUIRK(0x5336); /* PFXA 36XG5 */ SWITCHTEC_QUIRK(0x5328); /* PFXA 28XG5 */ SWITCHTEC_QUIRK(0x5400); /* PSXA 100XG5 */ SWITCHTEC_QUIRK(0x5484); /* PSXA 84XG5 */ SWITCHTEC_QUIRK(0x5468); /* PSXA 68XG5 */ SWITCHTEC_QUIRK(0x5452); /* PSXA 52XG5 */ SWITCHTEC_QUIRK(0x5436); /* PSXA 36XG5 */ SWITCHTEC_QUIRK(0x5428); /* PSXA 28XG5 */ SWITCHTEC_QUIRK(0x5500); /* PAXA 100XG5 */ SWITCHTEC_QUIRK(0x5584); /* PAXA 84XG5 */ SWITCHTEC_QUIRK(0x5568); /* PAXA 68XG5 */ SWITCHTEC_QUIRK(0x5552); /* PAXA 52XG5 */ SWITCHTEC_QUIRK(0x5536); /* PAXA 36XG5 */ SWITCHTEC_QUIRK(0x5528); /* PAXA 28XG5 */ #define SWITCHTEC_PCI100X_QUIRK(vid) \ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_EFAR, vid, \ PCI_CLASS_BRIDGE_OTHER, 8, quirk_switchtec_ntb_dma_alias) SWITCHTEC_PCI100X_QUIRK(0x1001); /* PCI1001XG4 */ SWITCHTEC_PCI100X_QUIRK(0x1002); /* PCI1002XG4 */ SWITCHTEC_PCI100X_QUIRK(0x1003); /* PCI1003XG4 */ SWITCHTEC_PCI100X_QUIRK(0x1004); /* PCI1004XG4 */ SWITCHTEC_PCI100X_QUIRK(0x1005); /* PCI1005XG4 */ SWITCHTEC_PCI100X_QUIRK(0x1006); /* PCI1006XG4 */ /* * The PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. * These IDs are used to forward responses to the originator on the other * side of the NTB. Alias all possible IDs to the NTB to permit access when * the IOMMU is turned on. */ static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev) { pci_info(pdev, "Setting PLX NTB proxy ID aliases\n"); /* PLX NTB may use all 256 devfns */ pci_add_dma_alias(pdev, 0, 256); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias); /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system * is configured to use Hybrid Graphics mode. 
This results in the GPU * being left in whatever state it was in during the *previous* boot, which * causes spurious interrupts from the GPU, which in turn causes us to * disable the wrong IRQ and end up breaking the touchpad. Unsurprisingly, * this also completely breaks nouveau. * * Luckily, it seems a simple reset of the Nvidia GPU brings it back to a * clean state and fixes all these issues. * * When the machine is configured in Dedicated display mode, the issue * doesn't occur. Fortunately the GPU advertises NoReset+ when in this * mode, so we can detect that and avoid resetting it. */ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev) { void __iomem *map; int ret; if (pdev->subsystem_vendor != PCI_VENDOR_ID_LENOVO || pdev->subsystem_device != 0x222e || !pci_reset_supported(pdev)) return; if (pci_enable_device_mem(pdev)) return; /* * Based on nvkm_device_ctor() in * drivers/gpu/drm/nouveau/nvkm/engine/device/base.c */ map = pci_iomap(pdev, 0, 0x23000); if (!map) { pci_err(pdev, "Can't map MMIO space\n"); goto out_disable; } /* * Make sure the GPU looks like it's been POSTed before resetting * it. */ if (ioread32(map + 0x2240c) & 0x2) { pci_info(pdev, FW_BUG "GPU left initialized by EFI, resetting\n"); ret = pci_reset_bus(pdev); if (ret < 0) pci_err(pdev, "Failed to reset GPU: %d\n", ret); } iounmap(map); out_disable: pci_disable_device(pdev); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1, PCI_CLASS_DISPLAY_VGA, 8, quirk_reset_lenovo_thinkpad_p50_nvgpu); /* * Device [1b21:2142] * When in D0, PME# doesn't get asserted when plugging in a USB 3.0 device. */ static void pci_fixup_no_d0_pme(struct pci_dev *dev) { pci_info(dev, "PME# does not work under D0, disabling it\n"); dev->pme_support &= ~(PCI_PM_CAP_PME_D0 >> PCI_PM_CAP_PME_SHIFT); } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x2142, pci_fixup_no_d0_pme); /* * Devices 12d8:400e [OHCI] and 12d8:400f [EHCI] * * These devices advertise PME# support in all power states but don't * reliably assert it. * * These devices also advertise MSI, but documentation (PI7C9X440SL.pdf) * says "The MSI Function is not implemented on this device" in chapters * 7.3.27, 7.3.29-7.3.31. */ static void pci_fixup_no_msi_no_pme(struct pci_dev *dev) { #ifdef CONFIG_PCI_MSI pci_info(dev, "MSI is not implemented on this device, disabling it\n"); dev->no_msi = 1; #endif pci_info(dev, "PME# is unreliable, disabling it\n"); dev->pme_support = 0; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_PERICOM, 0x400e, pci_fixup_no_msi_no_pme); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_PERICOM, 0x400f, pci_fixup_no_msi_no_pme); static void apex_pci_fixup_class(struct pci_dev *pdev) { pdev->class = (PCI_CLASS_SYSTEM_OTHER << 8) | pdev->class; } DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a, PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class); /* * Pericom PI7C9X2G404/PI7C9X2G304/PI7C9X2G303 switch erratum E5 - * ACS P2P Request Redirect is not functional * * When ACS P2P Request Redirect is enabled and bandwidth is not balanced * between upstream and downstream ports, packets are queued in an internal * buffer until a CplD (Completion with Data) packet is received. The * workaround is to use the switch in store and forward mode.
*/ #define PI7C9X2Gxxx_MODE_REG 0x74 #define PI7C9X2Gxxx_STORE_FORWARD_MODE BIT(0) static void pci_fixup_pericom_acs_store_forward(struct pci_dev *pdev) { struct pci_dev *upstream; u16 val; /* Downstream ports only */ if (pci_pcie_type(pdev) != PCI_EXP_TYPE_DOWNSTREAM) return; /* Check for ACS P2P Request Redirect use */ if (!pdev->acs_cap) return; pci_read_config_word(pdev, pdev->acs_cap + PCI_ACS_CTRL, &val); if (!(val & PCI_ACS_RR)) return; upstream = pci_upstream_bridge(pdev); if (!upstream) return; pci_read_config_word(upstream, PI7C9X2Gxxx_MODE_REG, &val); if (!(val & PI7C9X2Gxxx_STORE_FORWARD_MODE)) { pci_info(upstream, "Setting PI7C9X2Gxxx store-forward mode to avoid ACS erratum\n"); pci_write_config_word(upstream, PI7C9X2Gxxx_MODE_REG, val | PI7C9X2Gxxx_STORE_FORWARD_MODE); } } /* * Apply fixup on enable and on resume, in order to apply the fix up whenever * ACS configuration changes or switch mode is reset */ DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_PERICOM, 0x2404, pci_fixup_pericom_acs_store_forward); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_PERICOM, 0x2404, pci_fixup_pericom_acs_store_forward); DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_PERICOM, 0x2304, pci_fixup_pericom_acs_store_forward); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_PERICOM, 0x2304, pci_fixup_pericom_acs_store_forward); DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_PERICOM, 0x2303, pci_fixup_pericom_acs_store_forward); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_PERICOM, 0x2303, pci_fixup_pericom_acs_store_forward); static void nvidia_ion_ahci_fixup(struct pci_dev *pdev) { pdev->dev_flags |= PCI_DEV_FLAGS_HAS_MSI_MASKING; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 0x0ab8, nvidia_ion_ahci_fixup); static void rom_bar_overlap_defect(struct pci_dev *dev) { pci_info(dev, "working around ROM BAR overlap defect\n"); dev->rom_bar_overlap = 1; } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1533, rom_bar_overlap_defect); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect); #ifdef CONFIG_PCIEASPM /* * Several Intel DG2 graphics devices advertise that they can only tolerate * 1us latency when transitioning from L1 to L0, which may prevent ASPM L1 * from being enabled. But in fact these devices can tolerate unlimited * latency. Override their Device Capabilities value to allow ASPM L1 to * be enabled. 
*/ static void aspm_l1_acceptable_latency(struct pci_dev *dev) { u32 l1_lat = FIELD_GET(PCI_EXP_DEVCAP_L1, dev->devcap); if (l1_lat < 7) { dev->devcap |= FIELD_PREP(PCI_EXP_DEVCAP_L1, 7); pci_info(dev, "ASPM: overriding L1 acceptable latency from %#x to 0x7\n", l1_lat); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f80, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f81, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f82, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f83, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f84, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f85, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f86, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f87, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f88, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5690, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5691, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5692, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5693, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5694, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5695, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a0, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a1, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a2, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a3, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a4, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a5, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a6, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency); #endif #ifdef CONFIG_PCIE_DPC /* * Intel Ice Lake, Tiger Lake and Alder Lake BIOS has a bug that clears * the DPC RP PIO Log Size of the integrated Thunderbolt PCIe Root * Ports. 
 */
static void dpc_log_size(struct pci_dev *dev)
{
	u16 dpc, val;

	dpc = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC);
	if (!dpc)
		return;

	pci_read_config_word(dev, dpc + PCI_EXP_DPC_CAP, &val);
	if (!(val & PCI_EXP_DPC_CAP_RP_EXT))
		return;

	if (FIELD_GET(PCI_EXP_DPC_RP_PIO_LOG_SIZE, val) == 0) {
		pci_info(dev, "Overriding RP PIO Log Size to %d\n",
			 PCIE_STD_NUM_TLP_HEADERLOG);
		dev->dpc_rp_log_size = PCIE_STD_NUM_TLP_HEADERLOG;
	}
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x461f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x462f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x463f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x466e, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a1d, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a1f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a21, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a23, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a23, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a25, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a27, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a29, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2b, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa72f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa73f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa76e, dpc_log_size);
#endif

/*
 * For a PCI device with multiple downstream devices, its driver may use
 * a flattened device tree to describe the downstream devices. To overlay
 * that flattened device tree, the PCI device and all of its ancestor
 * devices need nodes in the system's base device tree; thus, a node may
 * need to be added for the device as a final fixup, before driver
 * probing.
 */
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT, 0x0005, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_EFAR, 0x9660, of_pci_make_dev_node);

/*
 * Devices known to require a longer delay before first config space access
 * after reset recovery or resume from D3cold:
 *
 * VideoPropulsion (aka Genroco) Torrent QN16e MPEG QAM Modulator
 */
static void pci_fixup_d3cold_delay_1sec(struct pci_dev *pdev)
{
	pdev->d3cold_delay = 1000;
}
DECLARE_PCI_FIXUP_FINAL(0x5555, 0x0004, pci_fixup_d3cold_delay_1sec);

#ifdef CONFIG_PCIEAER
static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
{
	struct pci_dev *parent = pci_upstream_bridge(pdev);
	u32 val;

	if (!parent || !parent->aer_cap)
		return;

	pci_info(parent, "mask Replay Timer Timeout Correctable Errors due to %s hardware defect\n",
		 pci_name(pdev));

	pci_read_config_dword(parent, parent->aer_cap + PCI_ERR_COR_MASK,
			      &val);
	val |= PCI_ERR_COR_REP_TIMER;
	pci_write_config_dword(parent, parent->aer_cap + PCI_ERR_COR_MASK,
			       val);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
#endif
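/*
 * Editor's note: a minimal sketch of the fixup pattern used throughout
 * this file, with hypothetical IDs (0x1234/0x5678) and a do-nothing
 * hook; it is not part of the kernel tree. Each DECLARE_PCI_FIXUP_*
 * variant selects when the PCI core invokes the hook: EARLY and HEADER
 * during enumeration, FINAL once resources have been assigned, ENABLE
 * when a driver enables the device, and RESUME on the resume path.
 */
static void quirk_example(struct pci_dev *dev)
{
	/* Adjust whatever per-device state the workaround requires. */
	pci_info(dev, "applying example quirk\n");
}
DECLARE_PCI_FIXUP_HEADER(0x1234, 0x5678, quirk_example); /* hypothetical IDs */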
/* SPDX-License-Identifier: GPL-2.0 */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM ksm

#if !defined(_TRACE_KSM_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_KSM_H

#include <linux/tracepoint.h>

/**
 * ksm_scan_template - called for start / stop scan
 *
 * @seq: sequence number of scan
 * @rmap_entries: actual number of rmap entries
 *
 * Allows tracing the start / stop of a ksm scan.
 */
DECLARE_EVENT_CLASS(ksm_scan_template,

	TP_PROTO(int seq, u32 rmap_entries),

	TP_ARGS(seq, rmap_entries),

	TP_STRUCT__entry(
		__field(int,	seq)
		__field(u32,	rmap_entries)
	),

	TP_fast_assign(
		__entry->seq		= seq;
		__entry->rmap_entries	= rmap_entries;
	),

	TP_printk("seq %d rmap size %d",
			__entry->seq, __entry->rmap_entries)
);

/**
 * ksm_start_scan - called after a new ksm scan is started
 *
 * @seq: sequence number of scan
 * @rmap_entries: actual number of rmap entries
 *
 * Allows tracing the start of a ksm scan.
 */
DEFINE_EVENT(ksm_scan_template, ksm_start_scan,

	TP_PROTO(int seq, u32 rmap_entries),

	TP_ARGS(seq, rmap_entries)
);

/**
 * ksm_stop_scan - called after a ksm scan has completed
 *
 * @seq: sequence number of scan
 * @rmap_entries: actual number of rmap entries
 *
 * Allows tracing the completion of a ksm scan.
 */
DEFINE_EVENT(ksm_scan_template, ksm_stop_scan,

	TP_PROTO(int seq, u32 rmap_entries),

	TP_ARGS(seq, rmap_entries)
);

/**
 * ksm_enter_exit_template - called after a process has been added to or
 * removed from ksm
 *
 * @mm: address of the mm object of the process
 *
 * Allows tracing when a process has been added to or removed from ksm.
 */
DECLARE_EVENT_CLASS(ksm_enter_exit_template,

	TP_PROTO(void *mm),

	TP_ARGS(mm),

	TP_STRUCT__entry(
		__field(void *,		mm)
	),

	TP_fast_assign(
		__entry->mm	= mm;
	),

	TP_printk("mm %p", __entry->mm)
);

/**
 * ksm_enter - called after a new process has been added to ksm
 *
 * @mm: address of the mm object of the process
 *
 * Allows tracing when a process has been added to ksm.
 */
DEFINE_EVENT(ksm_enter_exit_template, ksm_enter,

	TP_PROTO(void *mm),

	TP_ARGS(mm)
);

/**
 * ksm_exit - called after a process has been removed from ksm
 *
 * @mm: address of the mm object of the process
 *
 * Allows tracing when a process has been removed from ksm.
 */
DEFINE_EVENT(ksm_enter_exit_template, ksm_exit,

	TP_PROTO(void *mm),

	TP_ARGS(mm)
);

/**
 * ksm_merge_one_page - called after a page has been merged
 *
 * @pfn: page frame number of ksm page
 * @rmap_item: address of rmap_item object
 * @mm: address of the process mm struct
 * @err: 0 on success, an error code otherwise
 *
 * Allows tracing the ksm merging of individual pages.
 */
TRACE_EVENT(ksm_merge_one_page,

	TP_PROTO(unsigned long pfn, void *rmap_item, void *mm, int err),

	TP_ARGS(pfn, rmap_item, mm, err),

	TP_STRUCT__entry(
		__field(unsigned long,	pfn)
		__field(void *,		rmap_item)
		__field(void *,		mm)
		__field(int,		err)
	),

	TP_fast_assign(
		__entry->pfn		= pfn;
		__entry->rmap_item	= rmap_item;
		__entry->mm		= mm;
		__entry->err		= err;
	),

	TP_printk("ksm pfn %lu rmap_item %p mm %p error %d",
			__entry->pfn, __entry->rmap_item,
			__entry->mm, __entry->err)
);

/**
 * ksm_merge_with_ksm_page - called after a page has been merged with a
 * ksm page
 *
 * @ksm_page: address of the ksm page
 * @pfn: page frame number of ksm page
 * @rmap_item: address of rmap_item object
 * @mm: address of the mm object of the process
 * @err: 0 on success, an error code otherwise
 *
 * Allows tracing the merging of a page with a ksm page.
 */
TRACE_EVENT(ksm_merge_with_ksm_page,

	TP_PROTO(void *ksm_page, unsigned long pfn,
		 void *rmap_item, void *mm, int err),

	TP_ARGS(ksm_page, pfn, rmap_item, mm, err),

	TP_STRUCT__entry(
		__field(void *,		ksm_page)
		__field(unsigned long,	pfn)
		__field(void *,		rmap_item)
		__field(void *,		mm)
		__field(int,		err)
	),

	TP_fast_assign(
		__entry->ksm_page	= ksm_page;
		__entry->pfn		= pfn;
		__entry->rmap_item	= rmap_item;
		__entry->mm		= mm;
		__entry->err		= err;
	),

	TP_printk("%spfn %lu rmap_item %p mm %p error %d",
			(__entry->ksm_page ? "ksm " : ""),
			__entry->pfn, __entry->rmap_item,
			__entry->mm, __entry->err)
);

/**
 * ksm_remove_ksm_page - called after a ksm page has been removed
 *
 * @pfn: page frame number of ksm page
 *
 * Allows tracing the removal of stable ksm pages.
 */
TRACE_EVENT(ksm_remove_ksm_page,

	TP_PROTO(unsigned long pfn),

	TP_ARGS(pfn),

	TP_STRUCT__entry(
		__field(unsigned long, pfn)
	),

	TP_fast_assign(
		__entry->pfn = pfn;
	),

	TP_printk("pfn %lu", __entry->pfn)
);

/**
 * ksm_remove_rmap_item - called after a rmap_item has been removed from
 * the stable tree
 *
 * @pfn: page frame number of ksm page
 * @rmap_item: address of rmap_item object
 * @mm: address of the process mm struct
 *
 * Allows tracing the removal of pages from the stable tree list.
 */
TRACE_EVENT(ksm_remove_rmap_item,

	TP_PROTO(unsigned long pfn, void *rmap_item, void *mm),

	TP_ARGS(pfn, rmap_item, mm),

	TP_STRUCT__entry(
		__field(unsigned long,	pfn)
		__field(void *,		rmap_item)
		__field(void *,		mm)
	),

	TP_fast_assign(
		__entry->pfn		= pfn;
		__entry->rmap_item	= rmap_item;
		__entry->mm		= mm;
	),

	TP_printk("pfn %lu rmap_item %p mm %p",
			__entry->pfn, __entry->rmap_item, __entry->mm)
);

/**
 * ksm_advisor - called after the advisor has run
 *
 * @scan_time: scan time in seconds
 * @pages_to_scan: new pages_to_scan value
 * @cpu_percent: cpu usage in percent
 *
 * Allows tracing the ksm advisor.
 */
TRACE_EVENT(ksm_advisor,

	TP_PROTO(s64 scan_time, unsigned long pages_to_scan,
		 unsigned int cpu_percent),

	TP_ARGS(scan_time, pages_to_scan, cpu_percent),

	TP_STRUCT__entry(
		__field(s64,		scan_time)
		__field(unsigned long,	pages_to_scan)
		__field(unsigned int,	cpu_percent)
	),

	TP_fast_assign(
		__entry->scan_time	= scan_time;
		__entry->pages_to_scan	= pages_to_scan;
		__entry->cpu_percent	= cpu_percent;
	),

	TP_printk("ksm scan time %lld pages_to_scan %lu cpu percent %u",
			__entry->scan_time, __entry->pages_to_scan,
			__entry->cpu_percent)
);

#endif /* _TRACE_KSM_H */

/* This part must be outside protection */
#include <trace/define_trace.h>
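/*
 * Editor's note: usage sketch, not part of this header. Exactly one
 * compilation unit defines the tracepoints by setting CREATE_TRACE_POINTS
 * before the include (mm/ksm.c for this header); every other user
 * includes it normally and calls the generated trace_<event>() inlines.
 * The call site below is hypothetical.
 */
#define CREATE_TRACE_POINTS
#include <trace/events/ksm.h>

static void example_scan_round(int seq, u32 rmap_entries)
{
	trace_ksm_start_scan(seq, rmap_entries);
	/* ... scan the rmap list, merging pages as we go ... */
	trace_ksm_stop_scan(seq, rmap_entries);
}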
/* SPDX-License-Identifier: GPL-2.0 */
#undef TRACE_SYSTEM
#define TRACE_INCLUDE_PATH ../../drivers/dma-buf
#define TRACE_SYSTEM sync_trace

#if !defined(_TRACE_SYNC_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_SYNC_H

#include "sync_debug.h"
#include <linux/tracepoint.h>

TRACE_EVENT(sync_timeline,
	TP_PROTO(struct sync_timeline *timeline),

	TP_ARGS(timeline),

	TP_STRUCT__entry(
			__string(name, timeline->name)
			__field(u32, value)
	),

	TP_fast_assign(
			__assign_str(name);
			__entry->value = timeline->value;
	),

	TP_printk("name=%s value=%d", __get_str(name), __entry->value)
);

#endif /* if !defined(_TRACE_SYNC_H) || defined(TRACE_HEADER_MULTI_READ) */

/* This part must be outside protection */
#include <trace/define_trace.h>
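/*
 * Editor's note: usage sketch, not part of this header. A timeline
 * implementation (sw_sync, for instance) emits the event whenever the
 * timeline advances; the helper below is hypothetical and skips the
 * locking a real implementation needs.
 */
static void example_timeline_signal(struct sync_timeline *obj)
{
	obj->value++;			/* advance the timeline */
	trace_sync_timeline(obj);	/* records name= and value= */
}

/*
 * Userspace enables the event through tracefs, e.g.:
 *	echo 1 > /sys/kernel/tracing/events/sync_trace/sync_timeline/enable
 */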
// SPDX-License-Identifier: GPL-2.0-or-later
/* Copyright (c) 2014 Mahesh Bandewar <maheshb@google.com>
 */

#include <linux/ethtool.h>
#include <net/netdev_lock.h>

#include "ipvlan.h"

static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval,
				struct netlink_ext_ack *extack)
{
	struct ipvl_dev *ipvlan;
	unsigned int flags;
	int err;

	ASSERT_RTNL();
	if (port->mode != nval) {
		list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
			flags = ipvlan->dev->flags;
			if (nval == IPVLAN_MODE_L3 || nval == IPVLAN_MODE_L3S) {
				err = dev_change_flags(ipvlan->dev,
						       flags | IFF_NOARP,
						       extack);
			} else {
				err = dev_change_flags(ipvlan->dev,
						       flags & ~IFF_NOARP,
						       extack);
			}
			if (unlikely(err))
				goto fail;
		}
		if (nval == IPVLAN_MODE_L3S) {
			/* New mode is L3S */
			err = ipvlan_l3s_register(port);
			if (err)
				goto fail;
		} else if (port->mode == IPVLAN_MODE_L3S) {
			/* Old mode was L3S */
			ipvlan_l3s_unregister(port);
		}
		port->mode = nval;
	}
	return 0;

fail:
	/* Undo the flags changes that have been done so far.
*/ list_for_each_entry_continue_reverse(ipvlan, &port->ipvlans, pnode) { flags = ipvlan->dev->flags; if (port->mode == IPVLAN_MODE_L3 || port->mode == IPVLAN_MODE_L3S) dev_change_flags(ipvlan->dev, flags | IFF_NOARP, NULL); else dev_change_flags(ipvlan->dev, flags & ~IFF_NOARP, NULL); } return err; } static int ipvlan_port_create(struct net_device *dev) { struct ipvl_port *port; int err, idx; port = kzalloc(sizeof(struct ipvl_port), GFP_KERNEL); if (!port) return -ENOMEM; write_pnet(&port->pnet, dev_net(dev)); port->dev = dev; port->mode = IPVLAN_MODE_L3; INIT_LIST_HEAD(&port->ipvlans); for (idx = 0; idx < IPVLAN_HASH_SIZE; idx++) INIT_HLIST_HEAD(&port->hlhead[idx]); skb_queue_head_init(&port->backlog); INIT_WORK(&port->wq, ipvlan_process_multicast); ida_init(&port->ida); port->dev_id_start = 1; err = netdev_rx_handler_register(dev, ipvlan_handle_frame, port); if (err) goto err; netdev_hold(dev, &port->dev_tracker, GFP_KERNEL); return 0; err: kfree(port); return err; } static void ipvlan_port_destroy(struct net_device *dev) { struct ipvl_port *port = ipvlan_port_get_rtnl(dev); struct sk_buff *skb; netdev_put(dev, &port->dev_tracker); if (port->mode == IPVLAN_MODE_L3S) ipvlan_l3s_unregister(port); netdev_rx_handler_unregister(dev); cancel_work_sync(&port->wq); while ((skb = __skb_dequeue(&port->backlog)) != NULL) { dev_put(skb->dev); kfree_skb(skb); } ida_destroy(&port->ida); kfree(port); } #define IPVLAN_ALWAYS_ON_OFLOADS \ (NETIF_F_SG | NETIF_F_HW_CSUM | \ NETIF_F_GSO_ROBUST | NETIF_F_GSO_SOFTWARE | NETIF_F_GSO_ENCAP_ALL) #define IPVLAN_ALWAYS_ON \ (IPVLAN_ALWAYS_ON_OFLOADS | NETIF_F_VLAN_CHALLENGED) #define IPVLAN_FEATURES \ (NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | \ NETIF_F_GSO | NETIF_F_ALL_TSO | NETIF_F_GSO_ROBUST | \ NETIF_F_GRO | NETIF_F_RXCSUM | \ NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_STAG_FILTER) /* NETIF_F_GSO_ENCAP_ALL NETIF_F_GSO_SOFTWARE Newly added */ #define IPVLAN_STATE_MASK \ ((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT)) static int ipvlan_init(struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct net_device *phy_dev = ipvlan->phy_dev; struct ipvl_port *port; int err; dev->state = (dev->state & ~IPVLAN_STATE_MASK) | (phy_dev->state & IPVLAN_STATE_MASK); dev->features = phy_dev->features & IPVLAN_FEATURES; dev->features |= IPVLAN_ALWAYS_ON; dev->vlan_features = phy_dev->vlan_features & IPVLAN_FEATURES; dev->vlan_features |= IPVLAN_ALWAYS_ON_OFLOADS; dev->hw_enc_features |= dev->features; dev->lltx = true; netif_inherit_tso_max(dev, phy_dev); dev->hard_header_len = phy_dev->hard_header_len; netdev_lockdep_set_classes(dev); ipvlan->pcpu_stats = netdev_alloc_pcpu_stats(struct ipvl_pcpu_stats); if (!ipvlan->pcpu_stats) return -ENOMEM; if (!netif_is_ipvlan_port(phy_dev)) { err = ipvlan_port_create(phy_dev); if (err < 0) { free_percpu(ipvlan->pcpu_stats); return err; } } port = ipvlan_port_get_rtnl(phy_dev); port->count += 1; return 0; } static void ipvlan_uninit(struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct net_device *phy_dev = ipvlan->phy_dev; struct ipvl_port *port; free_percpu(ipvlan->pcpu_stats); port = ipvlan_port_get_rtnl(phy_dev); port->count -= 1; if (!port->count) ipvlan_port_destroy(port->dev); } static int ipvlan_open(struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct ipvl_addr *addr; if (ipvlan->port->mode == IPVLAN_MODE_L3 || ipvlan->port->mode == IPVLAN_MODE_L3S) dev->flags |= IFF_NOARP; else dev->flags &= ~IFF_NOARP; rcu_read_lock(); 
list_for_each_entry_rcu(addr, &ipvlan->addrs, anode) ipvlan_ht_addr_add(ipvlan, addr); rcu_read_unlock(); return 0; } static int ipvlan_stop(struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct net_device *phy_dev = ipvlan->phy_dev; struct ipvl_addr *addr; dev_uc_unsync(phy_dev, dev); dev_mc_unsync(phy_dev, dev); rcu_read_lock(); list_for_each_entry_rcu(addr, &ipvlan->addrs, anode) ipvlan_ht_addr_del(addr); rcu_read_unlock(); return 0; } static netdev_tx_t ipvlan_start_xmit(struct sk_buff *skb, struct net_device *dev) { const struct ipvl_dev *ipvlan = netdev_priv(dev); int skblen = skb->len; int ret; ret = ipvlan_queue_xmit(skb, dev); if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) { struct ipvl_pcpu_stats *pcptr; pcptr = this_cpu_ptr(ipvlan->pcpu_stats); u64_stats_update_begin(&pcptr->syncp); u64_stats_inc(&pcptr->tx_pkts); u64_stats_add(&pcptr->tx_bytes, skblen); u64_stats_update_end(&pcptr->syncp); } else { this_cpu_inc(ipvlan->pcpu_stats->tx_drps); } return ret; } static netdev_features_t ipvlan_fix_features(struct net_device *dev, netdev_features_t features) { struct ipvl_dev *ipvlan = netdev_priv(dev); features |= NETIF_F_ALL_FOR_ALL; features &= (ipvlan->sfeatures | ~IPVLAN_FEATURES); features = netdev_increment_features(ipvlan->phy_dev->features, features, features); features |= IPVLAN_ALWAYS_ON; features &= (IPVLAN_FEATURES | IPVLAN_ALWAYS_ON); return features; } static void ipvlan_change_rx_flags(struct net_device *dev, int change) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct net_device *phy_dev = ipvlan->phy_dev; if (change & IFF_ALLMULTI) dev_set_allmulti(phy_dev, dev->flags & IFF_ALLMULTI? 1 : -1); } static void ipvlan_set_multicast_mac_filter(struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); if (dev->flags & (IFF_PROMISC | IFF_ALLMULTI)) { bitmap_fill(ipvlan->mac_filters, IPVLAN_MAC_FILTER_SIZE); } else { struct netdev_hw_addr *ha; DECLARE_BITMAP(mc_filters, IPVLAN_MAC_FILTER_SIZE); bitmap_zero(mc_filters, IPVLAN_MAC_FILTER_SIZE); netdev_for_each_mc_addr(ha, dev) __set_bit(ipvlan_mac_hash(ha->addr), mc_filters); /* Turn-on broadcast bit irrespective of address family, * since broadcast is deferred to a work-queue, hence no * impact on fast-path processing. */ __set_bit(ipvlan_mac_hash(dev->broadcast), mc_filters); bitmap_copy(ipvlan->mac_filters, mc_filters, IPVLAN_MAC_FILTER_SIZE); } dev_uc_sync(ipvlan->phy_dev, dev); dev_mc_sync(ipvlan->phy_dev, dev); } static void ipvlan_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *s) { struct ipvl_dev *ipvlan = netdev_priv(dev); if (ipvlan->pcpu_stats) { struct ipvl_pcpu_stats *pcptr; u64 rx_pkts, rx_bytes, rx_mcast, tx_pkts, tx_bytes; u32 rx_errs = 0, tx_drps = 0; u32 strt; int idx; for_each_possible_cpu(idx) { pcptr = per_cpu_ptr(ipvlan->pcpu_stats, idx); do { strt = u64_stats_fetch_begin(&pcptr->syncp); rx_pkts = u64_stats_read(&pcptr->rx_pkts); rx_bytes = u64_stats_read(&pcptr->rx_bytes); rx_mcast = u64_stats_read(&pcptr->rx_mcast); tx_pkts = u64_stats_read(&pcptr->tx_pkts); tx_bytes = u64_stats_read(&pcptr->tx_bytes); } while (u64_stats_fetch_retry(&pcptr->syncp, strt)); s->rx_packets += rx_pkts; s->rx_bytes += rx_bytes; s->multicast += rx_mcast; s->tx_packets += tx_pkts; s->tx_bytes += tx_bytes; /* u32 values are updated without syncp protection. 
*/ rx_errs += READ_ONCE(pcptr->rx_errs); tx_drps += READ_ONCE(pcptr->tx_drps); } s->rx_errors = rx_errs; s->rx_dropped = rx_errs; s->tx_dropped = tx_drps; } s->tx_errors = DEV_STATS_READ(dev, tx_errors); } static int ipvlan_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct net_device *phy_dev = ipvlan->phy_dev; return vlan_vid_add(phy_dev, proto, vid); } static int ipvlan_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct net_device *phy_dev = ipvlan->phy_dev; vlan_vid_del(phy_dev, proto, vid); return 0; } static int ipvlan_get_iflink(const struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); return READ_ONCE(ipvlan->phy_dev->ifindex); } static const struct net_device_ops ipvlan_netdev_ops = { .ndo_init = ipvlan_init, .ndo_uninit = ipvlan_uninit, .ndo_open = ipvlan_open, .ndo_stop = ipvlan_stop, .ndo_start_xmit = ipvlan_start_xmit, .ndo_fix_features = ipvlan_fix_features, .ndo_change_rx_flags = ipvlan_change_rx_flags, .ndo_set_rx_mode = ipvlan_set_multicast_mac_filter, .ndo_get_stats64 = ipvlan_get_stats64, .ndo_vlan_rx_add_vid = ipvlan_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = ipvlan_vlan_rx_kill_vid, .ndo_get_iflink = ipvlan_get_iflink, }; static int ipvlan_hard_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, const void *daddr, const void *saddr, unsigned len) { const struct ipvl_dev *ipvlan = netdev_priv(dev); struct net_device *phy_dev = ipvlan->phy_dev; /* TODO Probably use a different field than dev_addr so that the * mac-address on the virtual device is portable and can be carried * while the packets use the mac-addr on the physical device. */ return dev_hard_header(skb, phy_dev, type, daddr, saddr ? 
: phy_dev->dev_addr, len); } static const struct header_ops ipvlan_header_ops = { .create = ipvlan_hard_header, .parse = eth_header_parse, .cache = eth_header_cache, .cache_update = eth_header_cache_update, .parse_protocol = eth_header_parse_protocol, }; static void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device *dev) { ipvlan->dev->mtu = dev->mtu; } static bool netif_is_ipvlan(const struct net_device *dev) { /* both ipvlan and ipvtap devices use the same netdev_ops */ return dev->netdev_ops == &ipvlan_netdev_ops; } static int ipvlan_ethtool_get_link_ksettings(struct net_device *dev, struct ethtool_link_ksettings *cmd) { const struct ipvl_dev *ipvlan = netdev_priv(dev); return __ethtool_get_link_ksettings(ipvlan->phy_dev, cmd); } static void ipvlan_ethtool_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *drvinfo) { strscpy(drvinfo->driver, IPVLAN_DRV, sizeof(drvinfo->driver)); strscpy(drvinfo->version, IPV_DRV_VER, sizeof(drvinfo->version)); } static u32 ipvlan_ethtool_get_msglevel(struct net_device *dev) { const struct ipvl_dev *ipvlan = netdev_priv(dev); return ipvlan->msg_enable; } static void ipvlan_ethtool_set_msglevel(struct net_device *dev, u32 value) { struct ipvl_dev *ipvlan = netdev_priv(dev); ipvlan->msg_enable = value; } static const struct ethtool_ops ipvlan_ethtool_ops = { .get_link = ethtool_op_get_link, .get_link_ksettings = ipvlan_ethtool_get_link_ksettings, .get_drvinfo = ipvlan_ethtool_get_drvinfo, .get_msglevel = ipvlan_ethtool_get_msglevel, .set_msglevel = ipvlan_ethtool_set_msglevel, }; static int ipvlan_nl_changelink(struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct ipvl_port *port = ipvlan_port_get_rtnl(ipvlan->phy_dev); int err = 0; if (!data) return 0; if (!ns_capable(dev_net(ipvlan->phy_dev)->user_ns, CAP_NET_ADMIN)) return -EPERM; if (data[IFLA_IPVLAN_MODE]) { u16 nmode = nla_get_u16(data[IFLA_IPVLAN_MODE]); err = ipvlan_set_port_mode(port, nmode, extack); } if (!err && data[IFLA_IPVLAN_FLAGS]) { u16 flags = nla_get_u16(data[IFLA_IPVLAN_FLAGS]); if (flags & IPVLAN_F_PRIVATE) ipvlan_mark_private(port); else ipvlan_clear_private(port); if (flags & IPVLAN_F_VEPA) ipvlan_mark_vepa(port); else ipvlan_clear_vepa(port); } return err; } static size_t ipvlan_nl_getsize(const struct net_device *dev) { return (0 + nla_total_size(2) /* IFLA_IPVLAN_MODE */ + nla_total_size(2) /* IFLA_IPVLAN_FLAGS */ ); } static int ipvlan_nl_validate(struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { if (!data) return 0; if (data[IFLA_IPVLAN_MODE]) { u16 mode = nla_get_u16(data[IFLA_IPVLAN_MODE]); if (mode >= IPVLAN_MODE_MAX) return -EINVAL; } if (data[IFLA_IPVLAN_FLAGS]) { u16 flags = nla_get_u16(data[IFLA_IPVLAN_FLAGS]); /* Only two bits are used at this moment. */ if (flags & ~(IPVLAN_F_PRIVATE | IPVLAN_F_VEPA)) return -EINVAL; /* Also both flags can't be active at the same time. 
*/ if ((flags & (IPVLAN_F_PRIVATE | IPVLAN_F_VEPA)) == (IPVLAN_F_PRIVATE | IPVLAN_F_VEPA)) return -EINVAL; } return 0; } static int ipvlan_nl_fillinfo(struct sk_buff *skb, const struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct ipvl_port *port = ipvlan_port_get_rtnl(ipvlan->phy_dev); int ret = -EINVAL; if (!port) goto err; ret = -EMSGSIZE; if (nla_put_u16(skb, IFLA_IPVLAN_MODE, port->mode)) goto err; if (nla_put_u16(skb, IFLA_IPVLAN_FLAGS, port->flags)) goto err; return 0; err: return ret; } int ipvlan_link_new(struct net_device *dev, struct rtnl_newlink_params *params, struct netlink_ext_ack *extack) { struct net *link_net = rtnl_newlink_link_net(params); struct ipvl_dev *ipvlan = netdev_priv(dev); struct nlattr **data = params->data; struct nlattr **tb = params->tb; struct ipvl_port *port; struct net_device *phy_dev; int err; u16 mode = IPVLAN_MODE_L3; if (!tb[IFLA_LINK]) return -EINVAL; phy_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!phy_dev) return -ENODEV; if (netif_is_ipvlan(phy_dev)) { struct ipvl_dev *tmp = netdev_priv(phy_dev); phy_dev = tmp->phy_dev; if (!ns_capable(dev_net(phy_dev)->user_ns, CAP_NET_ADMIN)) return -EPERM; } else if (!netif_is_ipvlan_port(phy_dev)) { /* Exit early if the underlying link is invalid or busy */ if (phy_dev->type != ARPHRD_ETHER || phy_dev->flags & IFF_LOOPBACK) { netdev_err(phy_dev, "Master is either lo or non-ether device\n"); return -EINVAL; } if (netdev_is_rx_handler_busy(phy_dev)) { netdev_err(phy_dev, "Device is already in use.\n"); return -EBUSY; } } ipvlan->phy_dev = phy_dev; ipvlan->dev = dev; ipvlan->sfeatures = IPVLAN_FEATURES; if (!tb[IFLA_MTU]) ipvlan_adjust_mtu(ipvlan, phy_dev); INIT_LIST_HEAD(&ipvlan->addrs); spin_lock_init(&ipvlan->addrs_lock); /* TODO Probably put random address here to be presented to the * world but keep using the physical-dev address for the outgoing * packets. */ eth_hw_addr_set(dev, phy_dev->dev_addr); dev->priv_flags |= IFF_NO_RX_HANDLER; err = register_netdevice(dev); if (err < 0) return err; /* ipvlan_init() would have created the port, if required */ port = ipvlan_port_get_rtnl(phy_dev); ipvlan->port = port; /* If the port-id base is at the MAX value, then wrap it around and * begin from 0x1 again. This may be due to a busy system where lots * of slaves are getting created and deleted. */ if (port->dev_id_start == 0xFFFE) port->dev_id_start = 0x1; /* Since L2 address is shared among all IPvlan slaves including * master, use unique 16 bit dev-ids to differentiate among them. * Assign IDs between 0x1 and 0xFFFE (used by the master) to each * slave link [see addrconf_ifid_eui48()]. */ err = ida_alloc_range(&port->ida, port->dev_id_start, 0xFFFD, GFP_KERNEL); if (err < 0) err = ida_alloc_range(&port->ida, 0x1, port->dev_id_start - 1, GFP_KERNEL); if (err < 0) goto unregister_netdev; dev->dev_id = err; /* Increment id-base to the next slot for the future assignment */ port->dev_id_start = err + 1; err = netdev_upper_dev_link(phy_dev, dev, extack); if (err) goto remove_ida; /* Flags are per port and latest update overrides. User has * to be consistent in setting it just like the mode attribute. 
*/ if (data && data[IFLA_IPVLAN_FLAGS]) port->flags = nla_get_u16(data[IFLA_IPVLAN_FLAGS]); if (data && data[IFLA_IPVLAN_MODE]) mode = nla_get_u16(data[IFLA_IPVLAN_MODE]); err = ipvlan_set_port_mode(port, mode, extack); if (err) goto unlink_netdev; list_add_tail_rcu(&ipvlan->pnode, &port->ipvlans); netif_stacked_transfer_operstate(phy_dev, dev); return 0; unlink_netdev: netdev_upper_dev_unlink(phy_dev, dev); remove_ida: ida_free(&port->ida, dev->dev_id); unregister_netdev: unregister_netdevice(dev); return err; } EXPORT_SYMBOL_GPL(ipvlan_link_new); void ipvlan_link_delete(struct net_device *dev, struct list_head *head) { struct ipvl_dev *ipvlan = netdev_priv(dev); struct ipvl_addr *addr, *next; spin_lock_bh(&ipvlan->addrs_lock); list_for_each_entry_safe(addr, next, &ipvlan->addrs, anode) { ipvlan_ht_addr_del(addr); list_del_rcu(&addr->anode); kfree_rcu(addr, rcu); } spin_unlock_bh(&ipvlan->addrs_lock); ida_free(&ipvlan->port->ida, dev->dev_id); list_del_rcu(&ipvlan->pnode); unregister_netdevice_queue(dev, head); netdev_upper_dev_unlink(ipvlan->phy_dev, dev); } EXPORT_SYMBOL_GPL(ipvlan_link_delete); void ipvlan_link_setup(struct net_device *dev) { ether_setup(dev); dev->max_mtu = ETH_MAX_MTU; dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING); dev->priv_flags |= IFF_UNICAST_FLT | IFF_NO_QUEUE; dev->netdev_ops = &ipvlan_netdev_ops; dev->needs_free_netdev = true; dev->header_ops = &ipvlan_header_ops; dev->ethtool_ops = &ipvlan_ethtool_ops; } EXPORT_SYMBOL_GPL(ipvlan_link_setup); static const struct nla_policy ipvlan_nl_policy[IFLA_IPVLAN_MAX + 1] = { [IFLA_IPVLAN_MODE] = { .type = NLA_U16 }, [IFLA_IPVLAN_FLAGS] = { .type = NLA_U16 }, }; static struct net *ipvlan_get_link_net(const struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); return dev_net(ipvlan->phy_dev); } static struct rtnl_link_ops ipvlan_link_ops = { .kind = "ipvlan", .priv_size = sizeof(struct ipvl_dev), .setup = ipvlan_link_setup, .newlink = ipvlan_link_new, .dellink = ipvlan_link_delete, .get_link_net = ipvlan_get_link_net, }; int ipvlan_link_register(struct rtnl_link_ops *ops) { ops->get_size = ipvlan_nl_getsize; ops->policy = ipvlan_nl_policy; ops->validate = ipvlan_nl_validate; ops->fill_info = ipvlan_nl_fillinfo; ops->changelink = ipvlan_nl_changelink; ops->maxtype = IFLA_IPVLAN_MAX; return rtnl_link_register(ops); } EXPORT_SYMBOL_GPL(ipvlan_link_register); static int ipvlan_device_event(struct notifier_block *unused, unsigned long event, void *ptr) { struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr); struct netdev_notifier_pre_changeaddr_info *prechaddr_info; struct net_device *dev = netdev_notifier_info_to_dev(ptr); struct ipvl_dev *ipvlan, *next; struct ipvl_port *port; LIST_HEAD(lst_kill); int err; if (!netif_is_ipvlan_port(dev)) return NOTIFY_DONE; port = ipvlan_port_get_rtnl(dev); switch (event) { case NETDEV_UP: case NETDEV_DOWN: case NETDEV_CHANGE: list_for_each_entry(ipvlan, &port->ipvlans, pnode) netif_stacked_transfer_operstate(ipvlan->phy_dev, ipvlan->dev); break; case NETDEV_REGISTER: { struct net *oldnet, *newnet = dev_net(dev); oldnet = read_pnet(&port->pnet); if (net_eq(newnet, oldnet)) break; write_pnet(&port->pnet, newnet); if (port->mode == IPVLAN_MODE_L3S) ipvlan_migrate_l3s_hook(oldnet, newnet); break; } case NETDEV_UNREGISTER: if (dev->reg_state != NETREG_UNREGISTERING) break; list_for_each_entry_safe(ipvlan, next, &port->ipvlans, pnode) ipvlan->dev->rtnl_link_ops->dellink(ipvlan->dev, &lst_kill); unregister_netdevice_many(&lst_kill); break; case 
NETDEV_FEAT_CHANGE: list_for_each_entry(ipvlan, &port->ipvlans, pnode) { netif_inherit_tso_max(ipvlan->dev, dev); netdev_update_features(ipvlan->dev); } break; case NETDEV_CHANGEMTU: list_for_each_entry(ipvlan, &port->ipvlans, pnode) ipvlan_adjust_mtu(ipvlan, dev); break; case NETDEV_PRE_CHANGEADDR: prechaddr_info = ptr; list_for_each_entry(ipvlan, &port->ipvlans, pnode) { err = dev_pre_changeaddr_notify(ipvlan->dev, prechaddr_info->dev_addr, extack); if (err) return notifier_from_errno(err); } break; case NETDEV_CHANGEADDR: list_for_each_entry(ipvlan, &port->ipvlans, pnode) { eth_hw_addr_set(ipvlan->dev, dev->dev_addr); call_netdevice_notifiers(NETDEV_CHANGEADDR, ipvlan->dev); } break; case NETDEV_PRE_TYPE_CHANGE: /* Forbid underlying device to change its type. */ return NOTIFY_BAD; case NETDEV_NOTIFY_PEERS: case NETDEV_BONDING_FAILOVER: case NETDEV_RESEND_IGMP: list_for_each_entry(ipvlan, &port->ipvlans, pnode) call_netdevice_notifiers(event, ipvlan->dev); } return NOTIFY_DONE; } /* the caller must held the addrs lock */ static int ipvlan_add_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6) { struct ipvl_addr *addr; addr = kzalloc(sizeof(struct ipvl_addr), GFP_ATOMIC); if (!addr) return -ENOMEM; addr->master = ipvlan; if (!is_v6) { memcpy(&addr->ip4addr, iaddr, sizeof(struct in_addr)); addr->atype = IPVL_IPV4; #if IS_ENABLED(CONFIG_IPV6) } else { memcpy(&addr->ip6addr, iaddr, sizeof(struct in6_addr)); addr->atype = IPVL_IPV6; #endif } list_add_tail_rcu(&addr->anode, &ipvlan->addrs); /* If the interface is not up, the address will be added to the hash * list by ipvlan_open. */ if (netif_running(ipvlan->dev)) ipvlan_ht_addr_add(ipvlan, addr); return 0; } static void ipvlan_del_addr(struct ipvl_dev *ipvlan, void *iaddr, bool is_v6) { struct ipvl_addr *addr; spin_lock_bh(&ipvlan->addrs_lock); addr = ipvlan_find_addr(ipvlan, iaddr, is_v6); if (!addr) { spin_unlock_bh(&ipvlan->addrs_lock); return; } ipvlan_ht_addr_del(addr); list_del_rcu(&addr->anode); spin_unlock_bh(&ipvlan->addrs_lock); kfree_rcu(addr, rcu); } static bool ipvlan_is_valid_dev(const struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); if (!netif_is_ipvlan(dev)) return false; if (!ipvlan || !ipvlan->port) return false; return true; } #if IS_ENABLED(CONFIG_IPV6) static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr) { int ret = -EINVAL; spin_lock_bh(&ipvlan->addrs_lock); if (ipvlan_addr_busy(ipvlan->port, ip6_addr, true)) netif_err(ipvlan, ifup, ipvlan->dev, "Failed to add IPv6=%pI6c addr for %s intf\n", ip6_addr, ipvlan->dev->name); else ret = ipvlan_add_addr(ipvlan, ip6_addr, true); spin_unlock_bh(&ipvlan->addrs_lock); return ret; } static void ipvlan_del_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr) { return ipvlan_del_addr(ipvlan, ip6_addr, true); } static int ipvlan_addr6_event(struct notifier_block *unused, unsigned long event, void *ptr) { struct inet6_ifaddr *if6 = (struct inet6_ifaddr *)ptr; struct net_device *dev = (struct net_device *)if6->idev->dev; struct ipvl_dev *ipvlan = netdev_priv(dev); if (!ipvlan_is_valid_dev(dev)) return NOTIFY_DONE; switch (event) { case NETDEV_UP: if (ipvlan_add_addr6(ipvlan, &if6->addr)) return NOTIFY_BAD; break; case NETDEV_DOWN: ipvlan_del_addr6(ipvlan, &if6->addr); break; } return NOTIFY_OK; } static int ipvlan_addr6_validator_event(struct notifier_block *unused, unsigned long event, void *ptr) { struct in6_validator_info *i6vi = (struct in6_validator_info *)ptr; struct net_device *dev = (struct net_device 
*)i6vi->i6vi_dev->dev; struct ipvl_dev *ipvlan = netdev_priv(dev); if (!ipvlan_is_valid_dev(dev)) return NOTIFY_DONE; switch (event) { case NETDEV_UP: if (ipvlan_addr_busy(ipvlan->port, &i6vi->i6vi_addr, true)) { NL_SET_ERR_MSG(i6vi->extack, "Address already assigned to an ipvlan device"); return notifier_from_errno(-EADDRINUSE); } break; } return NOTIFY_OK; } #endif static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr) { int ret = -EINVAL; spin_lock_bh(&ipvlan->addrs_lock); if (ipvlan_addr_busy(ipvlan->port, ip4_addr, false)) netif_err(ipvlan, ifup, ipvlan->dev, "Failed to add IPv4=%pI4 on %s intf.\n", ip4_addr, ipvlan->dev->name); else ret = ipvlan_add_addr(ipvlan, ip4_addr, false); spin_unlock_bh(&ipvlan->addrs_lock); return ret; } static void ipvlan_del_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr) { return ipvlan_del_addr(ipvlan, ip4_addr, false); } static int ipvlan_addr4_event(struct notifier_block *unused, unsigned long event, void *ptr) { struct in_ifaddr *if4 = (struct in_ifaddr *)ptr; struct net_device *dev = (struct net_device *)if4->ifa_dev->dev; struct ipvl_dev *ipvlan = netdev_priv(dev); struct in_addr ip4_addr; if (!ipvlan_is_valid_dev(dev)) return NOTIFY_DONE; switch (event) { case NETDEV_UP: ip4_addr.s_addr = if4->ifa_address; if (ipvlan_add_addr4(ipvlan, &ip4_addr)) return NOTIFY_BAD; break; case NETDEV_DOWN: ip4_addr.s_addr = if4->ifa_address; ipvlan_del_addr4(ipvlan, &ip4_addr); break; } return NOTIFY_OK; } static int ipvlan_addr4_validator_event(struct notifier_block *unused, unsigned long event, void *ptr) { struct in_validator_info *ivi = (struct in_validator_info *)ptr; struct net_device *dev = (struct net_device *)ivi->ivi_dev->dev; struct ipvl_dev *ipvlan = netdev_priv(dev); if (!ipvlan_is_valid_dev(dev)) return NOTIFY_DONE; switch (event) { case NETDEV_UP: if (ipvlan_addr_busy(ipvlan->port, &ivi->ivi_addr, false)) { NL_SET_ERR_MSG(ivi->extack, "Address already assigned to an ipvlan device"); return notifier_from_errno(-EADDRINUSE); } break; } return NOTIFY_OK; } static struct notifier_block ipvlan_addr4_notifier_block __read_mostly = { .notifier_call = ipvlan_addr4_event, }; static struct notifier_block ipvlan_addr4_vtor_notifier_block __read_mostly = { .notifier_call = ipvlan_addr4_validator_event, }; static struct notifier_block ipvlan_notifier_block __read_mostly = { .notifier_call = ipvlan_device_event, }; #if IS_ENABLED(CONFIG_IPV6) static struct notifier_block ipvlan_addr6_notifier_block __read_mostly = { .notifier_call = ipvlan_addr6_event, }; static struct notifier_block ipvlan_addr6_vtor_notifier_block __read_mostly = { .notifier_call = ipvlan_addr6_validator_event, }; #endif static int __init ipvlan_init_module(void) { int err; ipvlan_init_secret(); register_netdevice_notifier(&ipvlan_notifier_block); #if IS_ENABLED(CONFIG_IPV6) register_inet6addr_notifier(&ipvlan_addr6_notifier_block); register_inet6addr_validator_notifier( &ipvlan_addr6_vtor_notifier_block); #endif register_inetaddr_notifier(&ipvlan_addr4_notifier_block); register_inetaddr_validator_notifier(&ipvlan_addr4_vtor_notifier_block); err = ipvlan_l3s_init(); if (err < 0) goto error; err = ipvlan_link_register(&ipvlan_link_ops); if (err < 0) { ipvlan_l3s_cleanup(); goto error; } return 0; error: unregister_inetaddr_notifier(&ipvlan_addr4_notifier_block); unregister_inetaddr_validator_notifier( &ipvlan_addr4_vtor_notifier_block); #if IS_ENABLED(CONFIG_IPV6) unregister_inet6addr_notifier(&ipvlan_addr6_notifier_block); unregister_inet6addr_validator_notifier( 
		&ipvlan_addr6_vtor_notifier_block);
#endif
	unregister_netdevice_notifier(&ipvlan_notifier_block);
	return err;
}

static void __exit ipvlan_cleanup_module(void)
{
	rtnl_link_unregister(&ipvlan_link_ops);
	ipvlan_l3s_cleanup();
	unregister_netdevice_notifier(&ipvlan_notifier_block);
	unregister_inetaddr_notifier(&ipvlan_addr4_notifier_block);
	unregister_inetaddr_validator_notifier(
		&ipvlan_addr4_vtor_notifier_block);
#if IS_ENABLED(CONFIG_IPV6)
	unregister_inet6addr_notifier(&ipvlan_addr6_notifier_block);
	unregister_inet6addr_validator_notifier(
		&ipvlan_addr6_vtor_notifier_block);
#endif
}

module_init(ipvlan_init_module);
module_exit(ipvlan_cleanup_module);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Mahesh Bandewar <maheshb@google.com>");
MODULE_DESCRIPTION("Driver for L3 (IPv6/IPv4) based VLANs");
MODULE_ALIAS_RTNL_LINK("ipvlan");
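/*
 * Editor's note: administration sketch (iproute2 syntax as described in
 * Documentation/networking/ipvlan.rst), assuming a master device eth0.
 * Creating an L3-mode slave and moving it into a namespace:
 *
 *	ip link add link eth0 name ipvl0 type ipvlan mode l3
 *	ip netns add ns0
 *	ip link set dev ipvl0 netns ns0
 *	ip -n ns0 addr add 192.168.10.2/24 dev ipvl0
 *	ip -n ns0 link set dev ipvl0 up
 *
 * Mode and the private/vepa flags are properties of the port, i.e. of
 * the master: the most recent setting applies to every slave, matching
 * the "latest update overrides" comment in ipvlan_link_new() above.
 */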
// SPDX-License-Identifier: GPL-2.0-only
/* * Media device * * Copyright (C) 2010 Nokia Corporation * * Contacts: Laurent Pinchart <laurent.pinchart@ideasonboard.com> * Sakari Ailus <sakari.ailus@iki.fi> */ #include <linux/compat.h> #include <linux/export.h> #include <linux/idr.h> #include <linux/ioctl.h> #include <linux/media.h> #include <linux/slab.h> #include <linux/types.h> #include <linux/pci.h> #include <linux/usb.h> #include <linux/version.h> #include <media/media-device.h> #include <media/media-devnode.h> #include <media/media-entity.h> #include <media/media-request.h> #ifdef CONFIG_MEDIA_CONTROLLER /* * Legacy defines from linux/media.h. This is the only place we need this * so we just define it here. The media.h header doesn't expose it to the * kernel to prevent it from being used by drivers, but here (and only here!) * we need it to handle the legacy behavior. */ #define MEDIA_ENT_SUBTYPE_MASK 0x0000ffff #define MEDIA_ENT_T_DEVNODE_UNKNOWN (MEDIA_ENT_F_OLD_BASE | \ MEDIA_ENT_SUBTYPE_MASK) /* ----------------------------------------------------------------------------- * Userspace API */ static inline void __user *media_get_uptr(__u64 arg) { return (void __user *)(uintptr_t)arg; } static int media_device_open(struct file *filp) { return 0; } static int media_device_close(struct file *filp) { return 0; } static long media_device_get_info(struct media_device *dev, void *arg) { struct media_device_info *info = arg; memset(info, 0, sizeof(*info)); if (dev->driver_name[0]) strscpy(info->driver, dev->driver_name, sizeof(info->driver)); else strscpy(info->driver, dev->dev->driver->name, sizeof(info->driver)); strscpy(info->model, dev->model, sizeof(info->model)); strscpy(info->serial, dev->serial, sizeof(info->serial)); strscpy(info->bus_info, dev->bus_info, sizeof(info->bus_info)); info->media_version = LINUX_VERSION_CODE; info->driver_version = info->media_version; info->hw_revision = dev->hw_revision; return 0; } static struct media_entity *find_entity(struct media_device *mdev, u32 id) { struct media_entity *entity; int next = id & MEDIA_ENT_ID_FLAG_NEXT; id &= ~MEDIA_ENT_ID_FLAG_NEXT; media_device_for_each_entity(entity, mdev) { if (((media_entity_id(entity) == id) && !next) || ((media_entity_id(entity) > id) && next)) { return entity; } } return NULL; } static long media_device_enum_entities(struct media_device *mdev, void *arg) { struct media_entity_desc *entd = arg; struct media_entity *ent; ent = find_entity(mdev, entd->id); if (ent == NULL) return -EINVAL; memset(entd, 0, sizeof(*entd)); entd->id = media_entity_id(ent); if (ent->name) strscpy(entd->name, ent->name, sizeof(entd->name)); entd->type = ent->function; entd->revision = 0; /* Unused */ entd->flags = ent->flags; entd->group_id = 0; /* Unused */ entd->pads = ent->num_pads; entd->links = ent->num_links - ent->num_backlinks; /* * Workaround for a bug at media-ctl <= v1.10 that makes it to * do the wrong thing if the entity function doesn't belong to * either MEDIA_ENT_F_OLD_BASE or MEDIA_ENT_F_OLD_SUBDEV_BASE * Ranges. * * Non-subdevices are expected to be at the MEDIA_ENT_F_OLD_BASE, * or, otherwise, will be silently ignored by media-ctl when * printing the graphviz diagram. So, map them into the devnode * old range. 
*/ if (ent->function < MEDIA_ENT_F_OLD_BASE || ent->function > MEDIA_ENT_F_TUNER) { if (is_media_entity_v4l2_subdev(ent)) entd->type = MEDIA_ENT_F_V4L2_SUBDEV_UNKNOWN; else if (ent->function != MEDIA_ENT_F_IO_V4L) entd->type = MEDIA_ENT_T_DEVNODE_UNKNOWN; } memcpy(&entd->raw, &ent->info, sizeof(ent->info)); return 0; } static void media_device_kpad_to_upad(const struct media_pad *kpad, struct media_pad_desc *upad) { upad->entity = media_entity_id(kpad->entity); upad->index = kpad->index; upad->flags = kpad->flags; } static long media_device_enum_links(struct media_device *mdev, void *arg) { struct media_links_enum *links = arg; struct media_entity *entity; entity = find_entity(mdev, links->entity); if (entity == NULL) return -EINVAL; if (links->pads) { unsigned int p; for (p = 0; p < entity->num_pads; p++) { struct media_pad_desc pad; memset(&pad, 0, sizeof(pad)); media_device_kpad_to_upad(&entity->pads[p], &pad); if (copy_to_user(&links->pads[p], &pad, sizeof(pad))) return -EFAULT; } } if (links->links) { struct media_link *link; struct media_link_desc __user *ulink_desc = links->links; list_for_each_entry(link, &entity->links, list) { struct media_link_desc klink_desc; /* Ignore backlinks. */ if (link->source->entity != entity) continue; memset(&klink_desc, 0, sizeof(klink_desc)); media_device_kpad_to_upad(link->source, &klink_desc.source); media_device_kpad_to_upad(link->sink, &klink_desc.sink); klink_desc.flags = link->flags; if (copy_to_user(ulink_desc, &klink_desc, sizeof(*ulink_desc))) return -EFAULT; ulink_desc++; } } memset(links->reserved, 0, sizeof(links->reserved)); return 0; } static long media_device_setup_link(struct media_device *mdev, void *arg) { struct media_link_desc *linkd = arg; struct media_link *link = NULL; struct media_entity *source; struct media_entity *sink; /* Find the source and sink entities and link. */ source = find_entity(mdev, linkd->source.entity); sink = find_entity(mdev, linkd->sink.entity); if (source == NULL || sink == NULL) return -EINVAL; if (linkd->source.index >= source->num_pads || linkd->sink.index >= sink->num_pads) return -EINVAL; link = media_entity_find_link(&source->pads[linkd->source.index], &sink->pads[linkd->sink.index]); if (link == NULL) return -EINVAL; memset(linkd->reserved, 0, sizeof(linkd->reserved)); /* Setup the link on both entities. 
	return __media_entity_setup_link(link, linkd->flags);
}

static long media_device_get_topology(struct media_device *mdev, void *arg)
{
	struct media_v2_topology *topo = arg;
	struct media_entity *entity;
	struct media_interface *intf;
	struct media_pad *pad;
	struct media_link *link;
	struct media_v2_entity kentity, __user *uentity;
	struct media_v2_interface kintf, __user *uintf;
	struct media_v2_pad kpad, __user *upad;
	struct media_v2_link klink, __user *ulink;
	unsigned int i;
	int ret = 0;

	topo->topology_version = mdev->topology_version;

	/* Get entities and number of entities */
	i = 0;
	uentity = media_get_uptr(topo->ptr_entities);
	media_device_for_each_entity(entity, mdev) {
		i++;
		if (ret || !uentity)
			continue;

		if (i > topo->num_entities) {
			ret = -ENOSPC;
			continue;
		}

		/* Copy fields to the userspace struct only if no error so far */
		memset(&kentity, 0, sizeof(kentity));
		kentity.id = entity->graph_obj.id;
		kentity.function = entity->function;
		kentity.flags = entity->flags;
		strscpy(kentity.name, entity->name, sizeof(kentity.name));

		if (copy_to_user(uentity, &kentity, sizeof(kentity)))
			ret = -EFAULT;
		uentity++;
	}
	topo->num_entities = i;
	topo->reserved1 = 0;

	/* Get interfaces and number of interfaces */
	i = 0;
	uintf = media_get_uptr(topo->ptr_interfaces);
	media_device_for_each_intf(intf, mdev) {
		i++;
		if (ret || !uintf)
			continue;

		if (i > topo->num_interfaces) {
			ret = -ENOSPC;
			continue;
		}

		memset(&kintf, 0, sizeof(kintf));

		/* Copy intf fields to userspace struct */
		kintf.id = intf->graph_obj.id;
		kintf.intf_type = intf->type;
		kintf.flags = intf->flags;

		if (media_type(&intf->graph_obj) == MEDIA_GRAPH_INTF_DEVNODE) {
			struct media_intf_devnode *devnode;

			devnode = intf_to_devnode(intf);

			kintf.devnode.major = devnode->major;
			kintf.devnode.minor = devnode->minor;
		}

		if (copy_to_user(uintf, &kintf, sizeof(kintf)))
			ret = -EFAULT;
		uintf++;
	}
	topo->num_interfaces = i;
	topo->reserved2 = 0;

	/* Get pads and number of pads */
	i = 0;
	upad = media_get_uptr(topo->ptr_pads);
	media_device_for_each_pad(pad, mdev) {
		i++;
		if (ret || !upad)
			continue;

		if (i > topo->num_pads) {
			ret = -ENOSPC;
			continue;
		}

		memset(&kpad, 0, sizeof(kpad));

		/* Copy pad fields to userspace struct */
		kpad.id = pad->graph_obj.id;
		kpad.entity_id = pad->entity->graph_obj.id;
		kpad.flags = pad->flags;
		kpad.index = pad->index;

		if (copy_to_user(upad, &kpad, sizeof(kpad)))
			ret = -EFAULT;
		upad++;
	}
	topo->num_pads = i;
	topo->reserved3 = 0;

	/* Get links and number of links */
	i = 0;
	ulink = media_get_uptr(topo->ptr_links);
	media_device_for_each_link(link, mdev) {
		if (link->is_backlink)
			continue;

		i++;

		if (ret || !ulink)
			continue;

		if (i > topo->num_links) {
			ret = -ENOSPC;
			continue;
		}

		memset(&klink, 0, sizeof(klink));

		/* Copy link fields to userspace struct */
		klink.id = link->graph_obj.id;
		klink.source_id = link->gobj0->id;
		klink.sink_id = link->gobj1->id;
		klink.flags = link->flags;

		if (copy_to_user(ulink, &klink, sizeof(klink)))
			ret = -EFAULT;
		ulink++;
	}
	topo->num_links = i;
	topo->reserved4 = 0;

	return ret;
}

static long media_device_request_alloc(struct media_device *mdev, void *arg)
{
	int *alloc_fd = arg;

	if (!mdev->ops || !mdev->ops->req_validate || !mdev->ops->req_queue)
		return -ENOTTY;

	return media_request_alloc(mdev, alloc_fd);
}

static long copy_arg_from_user(void *karg, void __user *uarg, unsigned int cmd)
{
	if ((_IOC_DIR(cmd) & _IOC_WRITE) &&
	    copy_from_user(karg, uarg, _IOC_SIZE(cmd)))
		return -EFAULT;

	return 0;
}

static long copy_arg_to_user(void __user *uarg, void *karg, unsigned int cmd)
{
	if ((_IOC_DIR(cmd) & _IOC_READ) &&
	    copy_to_user(uarg, karg, _IOC_SIZE(cmd)))
		return -EFAULT;

	return 0;
}

/* Do acquire the graph mutex */
#define MEDIA_IOC_FL_GRAPH_MUTEX	BIT(0)

#define MEDIA_IOC_ARG(__cmd, func, fl, from_user, to_user)		\
	[_IOC_NR(MEDIA_IOC_##__cmd)] = {				\
		.cmd = MEDIA_IOC_##__cmd,				\
		.fn = func,						\
		.flags = fl,						\
		.arg_from_user = from_user,				\
		.arg_to_user = to_user,					\
	}

#define MEDIA_IOC(__cmd, func, fl)					\
	MEDIA_IOC_ARG(__cmd, func, fl, copy_arg_from_user, copy_arg_to_user)

/* the table is indexed by _IOC_NR(cmd) */
struct media_ioctl_info {
	unsigned int cmd;
	unsigned short flags;
	long (*fn)(struct media_device *dev, void *arg);
	long (*arg_from_user)(void *karg, void __user *uarg, unsigned int cmd);
	long (*arg_to_user)(void __user *uarg, void *karg, unsigned int cmd);
};

static const struct media_ioctl_info ioctl_info[] = {
	MEDIA_IOC(DEVICE_INFO, media_device_get_info, MEDIA_IOC_FL_GRAPH_MUTEX),
	MEDIA_IOC(ENUM_ENTITIES, media_device_enum_entities, MEDIA_IOC_FL_GRAPH_MUTEX),
	MEDIA_IOC(ENUM_LINKS, media_device_enum_links, MEDIA_IOC_FL_GRAPH_MUTEX),
	MEDIA_IOC(SETUP_LINK, media_device_setup_link, MEDIA_IOC_FL_GRAPH_MUTEX),
	MEDIA_IOC(G_TOPOLOGY, media_device_get_topology, MEDIA_IOC_FL_GRAPH_MUTEX),
	MEDIA_IOC(REQUEST_ALLOC, media_device_request_alloc, 0),
};

static long media_device_ioctl(struct file *filp, unsigned int cmd,
			       unsigned long __arg)
{
	struct media_devnode *devnode = media_devnode_data(filp);
	struct media_device *dev = devnode->media_dev;
	const struct media_ioctl_info *info;
	void __user *arg = (void __user *)__arg;
	char __karg[256], *karg = __karg;
	long ret;

	if (_IOC_NR(cmd) >= ARRAY_SIZE(ioctl_info) ||
	    ioctl_info[_IOC_NR(cmd)].cmd != cmd)
		return -ENOIOCTLCMD;

	info = &ioctl_info[_IOC_NR(cmd)];

	if (_IOC_SIZE(info->cmd) > sizeof(__karg)) {
		karg = kmalloc(_IOC_SIZE(info->cmd), GFP_KERNEL);
		if (!karg)
			return -ENOMEM;
	}

	if (info->arg_from_user) {
		ret = info->arg_from_user(karg, arg, cmd);
		if (ret)
			goto out_free;
	}

	if (info->flags & MEDIA_IOC_FL_GRAPH_MUTEX)
		mutex_lock(&dev->graph_mutex);

	ret = info->fn(dev, karg);

	if (info->flags & MEDIA_IOC_FL_GRAPH_MUTEX)
		mutex_unlock(&dev->graph_mutex);

	if (!ret && info->arg_to_user)
		ret = info->arg_to_user(arg, karg, cmd);

out_free:
	if (karg != __karg)
		kfree(karg);

	return ret;
}

#ifdef CONFIG_COMPAT

struct media_links_enum32 {
	__u32 entity;
	compat_uptr_t pads; /* struct media_pad_desc * */
	compat_uptr_t links; /* struct media_link_desc * */
	__u32 reserved[4];
};

static long media_device_enum_links32(struct media_device *mdev,
				      struct media_links_enum32 __user *ulinks)
{
	struct media_links_enum links;
	compat_uptr_t pads_ptr, links_ptr;
	int ret;

	memset(&links, 0, sizeof(links));

	if (get_user(links.entity, &ulinks->entity) ||
	    get_user(pads_ptr, &ulinks->pads) ||
	    get_user(links_ptr, &ulinks->links))
		return -EFAULT;

	links.pads = compat_ptr(pads_ptr);
	links.links = compat_ptr(links_ptr);

	ret = media_device_enum_links(mdev, &links);
	if (ret)
		return ret;

	if (copy_to_user(ulinks->reserved, links.reserved,
			 sizeof(ulinks->reserved)))
		return -EFAULT;

	return 0;
}

#define MEDIA_IOC_ENUM_LINKS32	_IOWR('|', 0x02, struct media_links_enum32)

static long media_device_compat_ioctl(struct file *filp, unsigned int cmd,
				      unsigned long arg)
{
	struct media_devnode *devnode = media_devnode_data(filp);
	struct media_device *dev = devnode->media_dev;
	long ret;

	switch (cmd) {
	case MEDIA_IOC_ENUM_LINKS32:
		mutex_lock(&dev->graph_mutex);
		ret = media_device_enum_links32(dev,
				(struct media_links_enum32 __user *)arg);
		mutex_unlock(&dev->graph_mutex);
		break;

	default:
		return media_device_ioctl(filp, cmd, arg);
	}

	return ret;
}
#endif /* CONFIG_COMPAT */
static const struct media_file_operations media_device_fops = {
	.owner = THIS_MODULE,
	.open = media_device_open,
	.ioctl = media_device_ioctl,
#ifdef CONFIG_COMPAT
	.compat_ioctl = media_device_compat_ioctl,
#endif /* CONFIG_COMPAT */
	.release = media_device_close,
};

/* -----------------------------------------------------------------------------
 * sysfs
 */

static ssize_t model_show(struct device *cd,
			  struct device_attribute *attr, char *buf)
{
	struct media_devnode *devnode = to_media_devnode(cd);
	struct media_device *mdev = devnode->media_dev;

	return sprintf(buf, "%.*s\n", (int)sizeof(mdev->model), mdev->model);
}

static DEVICE_ATTR_RO(model);

/* -----------------------------------------------------------------------------
 * Registration/unregistration
 */

static void media_device_release(struct media_devnode *devnode)
{
	dev_dbg(devnode->parent, "Media device released\n");
}

static void __media_device_unregister_entity(struct media_entity *entity)
{
	struct media_device *mdev = entity->graph_obj.mdev;
	struct media_link *link, *tmp;
	struct media_interface *intf;
	struct media_pad *iter;

	ida_free(&mdev->entity_internal_idx, entity->internal_idx);

	/* Remove all interface links pointing to this entity */
	list_for_each_entry(intf, &mdev->interfaces, graph_obj.list) {
		list_for_each_entry_safe(link, tmp, &intf->links, list) {
			if (link->entity == entity)
				__media_remove_intf_link(link);
		}
	}

	/* Remove all data links that belong to this entity */
	__media_entity_remove_links(entity);

	/* Remove all pads that belong to this entity */
	media_entity_for_each_pad(entity, iter)
		media_gobj_destroy(&iter->graph_obj);

	/* Remove the entity */
	media_gobj_destroy(&entity->graph_obj);

	/* invoke entity_notify callbacks to handle entity removal?? */
}

int __must_check media_device_register_entity(struct media_device *mdev,
					      struct media_entity *entity)
{
	struct media_entity_notify *notify, *next;
	struct media_pad *iter;
	int ret;

	if (entity->function == MEDIA_ENT_F_V4L2_SUBDEV_UNKNOWN ||
	    entity->function == MEDIA_ENT_F_UNKNOWN)
		dev_warn(mdev->dev,
			 "Entity type for entity %s was not initialized!\n",
			 entity->name);

	/* Warn if we apparently re-register an entity */
	WARN_ON(entity->graph_obj.mdev != NULL);
	entity->graph_obj.mdev = mdev;
	INIT_LIST_HEAD(&entity->links);
	entity->num_links = 0;
	entity->num_backlinks = 0;

	ret = ida_alloc_min(&mdev->entity_internal_idx, 1, GFP_KERNEL);
	if (ret < 0)
		return ret;
	entity->internal_idx = ret;

	mutex_lock(&mdev->graph_mutex);
	mdev->entity_internal_idx_max =
		max(mdev->entity_internal_idx_max, entity->internal_idx);

	/* Initialize media_gobj embedded at the entity */
	media_gobj_create(mdev, MEDIA_GRAPH_ENTITY, &entity->graph_obj);

	/* Initialize objects at the pads */
	media_entity_for_each_pad(entity, iter)
		media_gobj_create(mdev, MEDIA_GRAPH_PAD, &iter->graph_obj);

	/* invoke entity_notify callbacks */
	list_for_each_entry_safe(notify, next, &mdev->entity_notify, list)
		notify->notify(entity, notify->notify_data);

	if (mdev->entity_internal_idx_max >=
	    mdev->pm_count_walk.ent_enum.idx_max) {
		struct media_graph new = { .top = 0 };

		/*
		 * Initialise the new graph walk before cleaning up
		 * the old one in order not to spoil the graph walk
		 * object of the media device if graph walk init fails.
		 */
		ret = media_graph_walk_init(&new, mdev);
		if (ret) {
			__media_device_unregister_entity(entity);
			mutex_unlock(&mdev->graph_mutex);
			return ret;
		}
		media_graph_walk_cleanup(&mdev->pm_count_walk);
		mdev->pm_count_walk = new;
	}
	mutex_unlock(&mdev->graph_mutex);

	return 0;
}
EXPORT_SYMBOL_GPL(media_device_register_entity);

void media_device_unregister_entity(struct media_entity *entity)
{
	struct media_device *mdev = entity->graph_obj.mdev;

	if (mdev == NULL)
		return;

	mutex_lock(&mdev->graph_mutex);
	__media_device_unregister_entity(entity);
	mutex_unlock(&mdev->graph_mutex);
}
EXPORT_SYMBOL_GPL(media_device_unregister_entity);

void media_device_init(struct media_device *mdev)
{
	INIT_LIST_HEAD(&mdev->entities);
	INIT_LIST_HEAD(&mdev->interfaces);
	INIT_LIST_HEAD(&mdev->pads);
	INIT_LIST_HEAD(&mdev->links);
	INIT_LIST_HEAD(&mdev->entity_notify);

	mutex_init(&mdev->req_queue_mutex);
	mutex_init(&mdev->graph_mutex);
	ida_init(&mdev->entity_internal_idx);

	atomic_set(&mdev->request_id, 0);

	if (!*mdev->bus_info)
		media_set_bus_info(mdev->bus_info, sizeof(mdev->bus_info),
				   mdev->dev);

	dev_dbg(mdev->dev, "Media device initialized\n");
}
EXPORT_SYMBOL_GPL(media_device_init);

void media_device_cleanup(struct media_device *mdev)
{
	ida_destroy(&mdev->entity_internal_idx);
	mdev->entity_internal_idx_max = 0;
	media_graph_walk_cleanup(&mdev->pm_count_walk);
	mutex_destroy(&mdev->graph_mutex);
	mutex_destroy(&mdev->req_queue_mutex);
}
EXPORT_SYMBOL_GPL(media_device_cleanup);

int __must_check __media_device_register(struct media_device *mdev,
					 struct module *owner)
{
	struct media_devnode *devnode;
	int ret;

	devnode = kzalloc(sizeof(*devnode), GFP_KERNEL);
	if (!devnode)
		return -ENOMEM;

	/* Register the device node. */
	mdev->devnode = devnode;
	devnode->fops = &media_device_fops;
	devnode->parent = mdev->dev;
	devnode->release = media_device_release;

	/* Set version 0 to indicate to user-space that the graph is static */
	mdev->topology_version = 0;

	ret = media_devnode_register(mdev, devnode, owner);
	if (ret < 0) {
		/* devnode free is handled in media_devnode_*() */
		mdev->devnode = NULL;
		return ret;
	}

	ret = device_create_file(&devnode->dev, &dev_attr_model);
	if (ret < 0) {
		/* devnode free is handled in media_devnode_*() */
		mdev->devnode = NULL;
		media_devnode_unregister_prepare(devnode);
		media_devnode_unregister(devnode);
		return ret;
	}

	dev_dbg(mdev->dev, "Media device registered\n");

	return 0;
}
EXPORT_SYMBOL_GPL(__media_device_register);

void media_device_register_entity_notify(struct media_device *mdev,
					 struct media_entity_notify *nptr)
{
	mutex_lock(&mdev->graph_mutex);
	list_add_tail(&nptr->list, &mdev->entity_notify);
	mutex_unlock(&mdev->graph_mutex);
}
EXPORT_SYMBOL_GPL(media_device_register_entity_notify);

/*
 * Note: Should be called with mdev->lock held.
 */
static void __media_device_unregister_entity_notify(struct media_device *mdev,
					struct media_entity_notify *nptr)
{
	list_del(&nptr->list);
}

void media_device_unregister_entity_notify(struct media_device *mdev,
					   struct media_entity_notify *nptr)
{
	mutex_lock(&mdev->graph_mutex);
	__media_device_unregister_entity_notify(mdev, nptr);
	mutex_unlock(&mdev->graph_mutex);
}
EXPORT_SYMBOL_GPL(media_device_unregister_entity_notify);

void media_device_unregister(struct media_device *mdev)
{
	struct media_entity *entity;
	struct media_entity *next;
	struct media_interface *intf, *tmp_intf;
	struct media_entity_notify *notify, *nextp;

	if (mdev == NULL)
		return;

	mutex_lock(&mdev->graph_mutex);

	/* Check if mdev was ever registered at all */
	if (!media_devnode_is_registered(mdev->devnode)) {
		mutex_unlock(&mdev->graph_mutex);
		return;
	}

	/* Clear the devnode register bit to avoid races with media dev open */
	media_devnode_unregister_prepare(mdev->devnode);

	/* Remove all entities from the media device */
	list_for_each_entry_safe(entity, next, &mdev->entities, graph_obj.list)
		__media_device_unregister_entity(entity);

	/* Remove all entity_notify callbacks from the media device */
	list_for_each_entry_safe(notify, nextp, &mdev->entity_notify, list)
		__media_device_unregister_entity_notify(mdev, notify);

	/* Remove all interfaces from the media device */
	list_for_each_entry_safe(intf, tmp_intf, &mdev->interfaces,
				 graph_obj.list) {
		/*
		 * Unlink the interface, but don't free it here; the
		 * module which created it is responsible for freeing
		 * it
		 */
		__media_remove_intf_links(intf);
		media_gobj_destroy(&intf->graph_obj);
	}

	mutex_unlock(&mdev->graph_mutex);

	dev_dbg(mdev->dev, "Media device unregistered\n");

	device_remove_file(&mdev->devnode->dev, &dev_attr_model);
	media_devnode_unregister(mdev->devnode);
	/* devnode free is handled in media_devnode_*() */
	mdev->devnode = NULL;
}
EXPORT_SYMBOL_GPL(media_device_unregister);

#if IS_ENABLED(CONFIG_PCI)
void media_device_pci_init(struct media_device *mdev,
			   struct pci_dev *pci_dev,
			   const char *name)
{
	mdev->dev = &pci_dev->dev;

	if (name)
		strscpy(mdev->model, name, sizeof(mdev->model));
	else
		strscpy(mdev->model, pci_name(pci_dev), sizeof(mdev->model));

	sprintf(mdev->bus_info, "PCI:%s", pci_name(pci_dev));

	mdev->hw_revision = (pci_dev->subsystem_vendor << 16) |
			    pci_dev->subsystem_device;

	media_device_init(mdev);
}
EXPORT_SYMBOL_GPL(media_device_pci_init);
#endif

#if IS_ENABLED(CONFIG_USB)
void __media_device_usb_init(struct media_device *mdev,
			     struct usb_device *udev,
			     const char *board_name,
			     const char *driver_name)
{
	mdev->dev = &udev->dev;

	if (driver_name)
		strscpy(mdev->driver_name, driver_name,
			sizeof(mdev->driver_name));

	if (board_name)
		strscpy(mdev->model, board_name, sizeof(mdev->model));
	else if (udev->product)
		strscpy(mdev->model, udev->product, sizeof(mdev->model));
	else
		strscpy(mdev->model, "unknown model", sizeof(mdev->model));
	if (udev->serial)
		strscpy(mdev->serial, udev->serial, sizeof(mdev->serial));
	usb_make_path(udev, mdev->bus_info, sizeof(mdev->bus_info));
	mdev->hw_revision = le16_to_cpu(udev->descriptor.bcdDevice);

	media_device_init(mdev);
}
EXPORT_SYMBOL_GPL(__media_device_usb_init);
#endif

#endif /* CONFIG_MEDIA_CONTROLLER */
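
A minimal usage sketch (not part of media-device.c): how a driver's probe path might bring a media device up and tear it down on failure. The driver name, model string, and my_probe()/struct my_dev are hypothetical; media_device_register() is the THIS_MODULE wrapper around __media_device_register().

#include <linux/slab.h>
#include <media/media-device.h>

struct my_dev {
	struct media_device mdev;	/* hypothetical driver state */
};

static int my_probe(struct device *dev)
{
	struct my_dev *priv = kzalloc(sizeof(*priv), GFP_KERNEL);
	int ret;

	if (!priv)
		return -ENOMEM;

	priv->mdev.dev = dev;
	strscpy(priv->mdev.model, "my-model", sizeof(priv->mdev.model));

	/* initializes the lists, mutexes and the internal-index IDA */
	media_device_init(&priv->mdev);

	/* creates the /dev/mediaN devnode and the "model" sysfs file */
	ret = media_device_register(&priv->mdev);
	if (ret) {
		media_device_cleanup(&priv->mdev);
		kfree(priv);
	}
	return ret;
}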
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __LINUX_MROUTE_H
#define __LINUX_MROUTE_H

#include <linux/in.h>
#include <linux/pim.h>
#include <net/fib_rules.h>
#include <net/fib_notifier.h>
#include <uapi/linux/mroute.h>
#include <linux/mroute_base.h>
#include <linux/sockptr.h>

#ifdef CONFIG_IP_MROUTE
static inline int ip_mroute_opt(int opt)
{
	return opt >= MRT_BASE && opt <= MRT_MAX;
}

int ip_mroute_setsockopt(struct sock *, int, sockptr_t, unsigned int);
int ip_mroute_getsockopt(struct sock *, int, sockptr_t, sockptr_t);
int ipmr_ioctl(struct sock *sk, int cmd, void *arg);
int ipmr_compat_ioctl(struct sock *sk, unsigned int cmd, void __user *arg);
int ip_mr_init(void);
bool ipmr_rule_default(const struct fib_rule *rule);
int ipmr_sk_ioctl(struct sock *sk, unsigned int cmd, void __user *arg);
#else
static inline int ip_mroute_setsockopt(struct sock *sock, int optname,
				       sockptr_t optval, unsigned int optlen)
{
	return -ENOPROTOOPT;
}

static inline int ip_mroute_getsockopt(struct sock *sk, int optname,
				       sockptr_t optval, sockptr_t optlen)
{
	return -ENOPROTOOPT;
}

static inline int ipmr_ioctl(struct sock *sk, int cmd, void *arg)
{
	return -ENOIOCTLCMD;
}

static inline int ip_mr_init(void)
{
	return 0;
}

static inline int ip_mroute_opt(int opt)
{
	return 0;
}

static inline bool ipmr_rule_default(const struct fib_rule *rule)
{
	return true;
}

static inline int ipmr_sk_ioctl(struct sock *sk, unsigned int cmd,
				void __user *arg)
{
	return 1;
}
#endif

#define VIFF_STATIC 0x8000

struct mfc_cache_cmp_arg {
	__be32 mfc_mcastgrp;
	__be32 mfc_origin;
};

/**
 * struct mfc_cache - multicast routing entries
 * @_c: Common multicast routing information; has to be first [for casting]
 * @mfc_mcastgrp: destination multicast group address
 * @mfc_origin: source address
 * @cmparg: used for rhashtable comparisons
 */
struct mfc_cache {
	struct mr_mfc _c;
	union {
		struct {
			__be32 mfc_mcastgrp;
			__be32 mfc_origin;
		};
		struct mfc_cache_cmp_arg cmparg;
	};
};

struct rtmsg;
int ipmr_get_route(struct net *net, struct sk_buff *skb,
		   __be32 saddr, __be32 daddr,
		   struct rtmsg *rtm, u32 portid);
#endif
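
An illustrative sketch (not part of mroute.h): how a setsockopt path can use ip_mroute_opt() to gate multicast-routing options. my_setsockopt() is a hypothetical handler, assuming sk is a raw IP socket.

static int my_setsockopt(struct sock *sk, int optname, sockptr_t optval,
			 unsigned int optlen)
{
	/* MRT_BASE..MRT_MAX is the multicast-routing option range */
	if (ip_mroute_opt(optname))
		return ip_mroute_setsockopt(sk, optname, optval, optlen);

	return -ENOPROTOOPT;	/* not a multicast-routing option */
}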
/*
 * Copyright (c) 2006 Oracle.  All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */
#include <linux/percpu.h>
#include <linux/seq_file.h>
#include <linux/proc_fs.h>

#include "rds.h"
#include "tcp.h"

DEFINE_PER_CPU(struct rds_tcp_statistics, rds_tcp_stats)
	____cacheline_aligned;

static const char * const rds_tcp_stat_names[] = {
	"tcp_data_ready_calls",
	"tcp_write_space_calls",
	"tcp_sndbuf_full",
	"tcp_connect_raced",
	"tcp_listen_closed_stale",
};

unsigned int rds_tcp_stats_info_copy(struct rds_info_iterator *iter,
				     unsigned int avail)
{
	struct rds_tcp_statistics stats = {0, };
	uint64_t *src;
	uint64_t *sum;
	size_t i;
	int cpu;

	if (avail < ARRAY_SIZE(rds_tcp_stat_names))
		goto out;

	for_each_online_cpu(cpu) {
		src = (uint64_t *)&(per_cpu(rds_tcp_stats, cpu));
		sum = (uint64_t *)&stats;
		for (i = 0; i < sizeof(stats) / sizeof(uint64_t); i++)
			*(sum++) += *(src++);
	}

	rds_stats_info_copy(iter, (uint64_t *)&stats, rds_tcp_stat_names,
			    ARRAY_SIZE(rds_tcp_stat_names));
out:
	return ARRAY_SIZE(rds_tcp_stat_names);
}
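
A sketch of the same per-CPU counter pattern in isolation, using a hypothetical two-counter struct (demo_stats is not an RDS name). Writers bump their own CPU's copy lock-free; the reader sums all per-CPU copies, which is why the summation above can treat the whole struct as an array of uint64_t.

#include <linux/percpu.h>

struct demo_stats {
	uint64_t s_events;
	uint64_t s_errors;
};

DEFINE_PER_CPU(struct demo_stats, demo_stats);

static void demo_note_event(void)
{
	this_cpu_inc(demo_stats.s_events);	/* no locking needed */
}

static uint64_t demo_total_events(void)
{
	uint64_t total = 0;
	int cpu;

	for_each_online_cpu(cpu)
		total += per_cpu(demo_stats, cpu).s_events;

	return total;
}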
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
 *  pm_wakeup.h - Power management wakeup interface
 *
 *  Copyright (C) 2008 Alan Stern
 *  Copyright (C) 2010 Rafael J. Wysocki, Novell Inc.
 */

#ifndef _LINUX_PM_WAKEUP_H
#define _LINUX_PM_WAKEUP_H

#ifndef _DEVICE_H_
# error "Please do not include this file directly."
#endif

#include <linux/types.h>

struct wake_irq;

/**
 * struct wakeup_source - Representation of wakeup sources
 *
 * @name: Name of the wakeup source
 * @id: Wakeup source id
 * @entry: Wakeup source list entry
 * @lock: Wakeup source lock
 * @wakeirq: Optional device specific wakeirq
 * @timer: Wakeup timer list
 * @timer_expires: Wakeup timer expiration
 * @total_time: Total time this wakeup source has been active.
 * @max_time: Maximum time this wakeup source has been continuously active.
 * @last_time: Monotonic clock when the wakeup source was last touched.
 * @start_prevent_time: Point in time when this source started preventing autosleep.
 * @prevent_sleep_time: Total time this source has been preventing autosleep.
 * @event_count: Number of signaled wakeup events.
 * @active_count: Number of times the wakeup source was activated.
 * @relax_count: Number of times the wakeup source was deactivated.
 * @expire_count: Number of times the wakeup source's timeout has expired.
 * @wakeup_count: Number of times the wakeup source might abort suspend.
 * @dev: Struct device for sysfs statistics about the wakeup source.
 * @active: Status of the wakeup source.
 * @autosleep_enabled: Autosleep is active, so update @prevent_sleep_time.
 */
struct wakeup_source {
	const char		*name;
	int			id;
	struct list_head	entry;
	spinlock_t		lock;
	struct wake_irq		*wakeirq;
	struct timer_list	timer;
	unsigned long		timer_expires;
	ktime_t			total_time;
	ktime_t			max_time;
	ktime_t			last_time;
	ktime_t			start_prevent_time;
	ktime_t			prevent_sleep_time;
	unsigned long		event_count;
	unsigned long		active_count;
	unsigned long		relax_count;
	unsigned long		expire_count;
	unsigned long		wakeup_count;
	struct device		*dev;
	bool			active:1;
	bool			autosleep_enabled:1;
};

#define for_each_wakeup_source(ws) \
	for ((ws) = wakeup_sources_walk_start();	\
	     (ws);					\
	     (ws) = wakeup_sources_walk_next((ws)))

#ifdef CONFIG_PM_SLEEP

/*
 * Changes to device_may_wakeup take effect on the next pm state change.
 */
static inline bool device_can_wakeup(struct device *dev)
{
	return dev->power.can_wakeup;
}

static inline bool device_may_wakeup(struct device *dev)
{
	return dev->power.can_wakeup && !!dev->power.wakeup;
}

static inline bool device_wakeup_path(struct device *dev)
{
	return dev->power.wakeup_path;
}

static inline void device_set_wakeup_path(struct device *dev)
{
	dev->power.wakeup_path = true;
}

/* drivers/base/power/wakeup.c */
extern struct wakeup_source *wakeup_source_create(const char *name);
extern void wakeup_source_destroy(struct wakeup_source *ws);
extern void wakeup_source_add(struct wakeup_source *ws);
extern void wakeup_source_remove(struct wakeup_source *ws);
extern struct wakeup_source *wakeup_source_register(struct device *dev,
						    const char *name);
extern void wakeup_source_unregister(struct wakeup_source *ws);
extern int wakeup_sources_read_lock(void);
extern void wakeup_sources_read_unlock(int idx);
extern struct wakeup_source *wakeup_sources_walk_start(void);
extern struct wakeup_source *wakeup_sources_walk_next(struct wakeup_source *ws);
extern int device_wakeup_enable(struct device *dev);
extern void device_wakeup_disable(struct device *dev);
extern void device_set_wakeup_capable(struct device *dev, bool capable);
extern int device_set_wakeup_enable(struct device *dev, bool enable);
extern void __pm_stay_awake(struct wakeup_source *ws);
extern void pm_stay_awake(struct device *dev);
extern void __pm_relax(struct wakeup_source *ws);
extern void pm_relax(struct device *dev);
extern void pm_wakeup_ws_event(struct wakeup_source *ws, unsigned int msec,
			       bool hard);
extern void pm_wakeup_dev_event(struct device *dev, unsigned int msec,
				bool hard);

#else /* !CONFIG_PM_SLEEP */

static inline void device_set_wakeup_capable(struct device *dev, bool capable)
{
	dev->power.can_wakeup = capable;
}

static inline bool device_can_wakeup(struct device *dev)
{
	return dev->power.can_wakeup;
}

static inline struct wakeup_source *wakeup_source_create(const char *name)
{
	return NULL;
}

static inline void wakeup_source_destroy(struct wakeup_source *ws) {}

static inline void wakeup_source_add(struct wakeup_source *ws) {}

static inline void wakeup_source_remove(struct wakeup_source *ws) {}

static inline struct wakeup_source *wakeup_source_register(struct device *dev,
							   const char *name)
{
	return NULL;
}

static inline void wakeup_source_unregister(struct wakeup_source *ws) {}

static inline int device_wakeup_enable(struct device *dev)
{
	dev->power.should_wakeup = true;
	return 0;
}

static inline void device_wakeup_disable(struct device *dev)
{
	dev->power.should_wakeup = false;
}

static inline int device_set_wakeup_enable(struct device *dev, bool enable)
{
	dev->power.should_wakeup = enable;
	return 0;
}

static inline bool device_may_wakeup(struct device *dev)
{
	return dev->power.can_wakeup && dev->power.should_wakeup;
}

static inline bool device_wakeup_path(struct device *dev)
{
	return false;
}

static inline void device_set_wakeup_path(struct device *dev) {}

static inline void __pm_stay_awake(struct wakeup_source *ws) {}

static inline void pm_stay_awake(struct device *dev) {}

static inline void __pm_relax(struct wakeup_source *ws) {}

static inline void pm_relax(struct device *dev) {}

static inline void pm_wakeup_ws_event(struct wakeup_source *ws,
				      unsigned int msec, bool hard) {}

static inline void pm_wakeup_dev_event(struct device *dev, unsigned int msec,
				       bool hard) {}

#endif /* !CONFIG_PM_SLEEP */

static inline bool device_awake_path(struct device *dev)
{
	return device_wakeup_path(dev);
}

static inline void device_set_awake_path(struct device *dev)
{
	device_set_wakeup_path(dev);
}

static inline void __pm_wakeup_event(struct wakeup_source *ws,
				     unsigned int msec)
{
	pm_wakeup_ws_event(ws, msec, false);
}

static inline void pm_wakeup_event(struct device *dev, unsigned int msec)
{
	pm_wakeup_dev_event(dev, msec, false);
}

static inline void pm_wakeup_hard_event(struct device *dev)
{
	pm_wakeup_dev_event(dev, 0, true);
}

/**
 * device_init_wakeup - Device wakeup initialization.
 * @dev: Device to handle.
 * @enable: Whether or not to enable @dev as a wakeup device.
 *
 * By default, most devices should leave wakeup disabled.  The exceptions are
 * devices that everyone expects to be wakeup sources: keyboards, power buttons,
 * possibly network interfaces, etc.  Also, devices that don't generate their
 * own wakeup requests but merely forward requests from one bus to another
 * (like PCI bridges) should have wakeup enabled by default.
 */
static inline int device_init_wakeup(struct device *dev, bool enable)
{
	if (enable) {
		device_set_wakeup_capable(dev, true);
		return device_wakeup_enable(dev);
	}

	device_wakeup_disable(dev);
	device_set_wakeup_capable(dev, false);
	return 0;
}

static void device_disable_wakeup(void *dev)
{
	device_init_wakeup(dev, false);
}

/**
 * devm_device_init_wakeup - Resource managed device wakeup initialization.
 * @dev: Device to handle.
 *
 * This function is the devm managed version of device_init_wakeup(dev, true).
 */
static inline int devm_device_init_wakeup(struct device *dev)
{
	device_init_wakeup(dev, true);
	return devm_add_action_or_reset(dev, device_disable_wakeup, dev);
}

#endif /* _LINUX_PM_WAKEUP_H */
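
An illustrative sketch (not part of pm_wakeup.h): a hypothetical driver marking its device wakeup-capable at probe time and reporting wakeup events from its interrupt handler. my_probe() and my_irq_handler() are assumptions for illustration; the wakeup declarations come in via <linux/device.h>, since this header must not be included directly.

#include <linux/device.h>
#include <linux/interrupt.h>

static int my_probe(struct device *dev)
{
	/* capable of waking the system, and enabled by default */
	return device_init_wakeup(dev, true);
}

static irqreturn_t my_irq_handler(int irq, void *data)
{
	struct device *dev = data;

	/* report the event so an in-progress suspend can be aborted */
	if (device_may_wakeup(dev))
		pm_wakeup_event(dev, 0);	/* 0: no timeout */

	return IRQ_HANDLED;
}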
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _NET_FLOW_DISSECTOR_H
#define _NET_FLOW_DISSECTOR_H

#include <linux/types.h>
#include <linux/in6.h>
#include <linux/siphash.h>
#include <linux/string.h>
#include <uapi/linux/if_ether.h>
#include <uapi/linux/pkt_cls.h>

struct bpf_prog;
struct net;
struct sk_buff;

/**
 * struct flow_dissector_key_control:
 * @thoff:     Transport header offset
 * @addr_type: Type of key. One of FLOW_DISSECTOR_KEY_*
 * @flags:     Key flags.
 *             Any of FLOW_DIS_(IS_FRAGMENT|FIRST_FRAG|ENCAPSULATION|F_*)
 */
struct flow_dissector_key_control {
	u16	thoff;
	u16	addr_type;
	u32	flags;
};

/* The control flags are kept in sync with TCA_FLOWER_KEY_FLAGS_*, as those
 * flags are exposed to userspace in some error paths, ie. unsupported flags.
 */
enum flow_dissector_ctrl_flags {
	FLOW_DIS_IS_FRAGMENT		= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT,
	FLOW_DIS_FIRST_FRAG		= TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST,
	FLOW_DIS_F_TUNNEL_CSUM		= TCA_FLOWER_KEY_FLAGS_TUNNEL_CSUM,
	FLOW_DIS_F_TUNNEL_DONT_FRAGMENT	= TCA_FLOWER_KEY_FLAGS_TUNNEL_DONT_FRAGMENT,
	FLOW_DIS_F_TUNNEL_OAM		= TCA_FLOWER_KEY_FLAGS_TUNNEL_OAM,
	FLOW_DIS_F_TUNNEL_CRIT_OPT	= TCA_FLOWER_KEY_FLAGS_TUNNEL_CRIT_OPT,

	/* These flags are internal to the kernel */
	FLOW_DIS_ENCAPSULATION		= (TCA_FLOWER_KEY_FLAGS_MAX << 1),
};

enum flow_dissect_ret {
	FLOW_DISSECT_RET_OUT_GOOD,
	FLOW_DISSECT_RET_OUT_BAD,
	FLOW_DISSECT_RET_PROTO_AGAIN,
	FLOW_DISSECT_RET_IPPROTO_AGAIN,
	FLOW_DISSECT_RET_CONTINUE,
};

/**
 * struct flow_dissector_key_basic:
 * @n_proto:  Network header protocol (eg. IPv4/IPv6)
 * @ip_proto: Transport header protocol (eg. TCP/UDP)
 * @padding:  Unused
 */
struct flow_dissector_key_basic {
	__be16	n_proto;
	u8	ip_proto;
	u8	padding;
};

struct flow_dissector_key_tags {
	u32	flow_label;
};

struct flow_dissector_key_vlan {
	union {
		struct {
			u16	vlan_id:12,
				vlan_dei:1,
				vlan_priority:3;
		};
		__be16	vlan_tci;
	};
	__be16	vlan_tpid;
	__be16	vlan_eth_type;
	u16	padding;
};

struct flow_dissector_mpls_lse {
	u32	mpls_ttl:8,
		mpls_bos:1,
		mpls_tc:3,
		mpls_label:20;
};

#define FLOW_DIS_MPLS_MAX 7
struct flow_dissector_key_mpls {
	struct flow_dissector_mpls_lse	ls[FLOW_DIS_MPLS_MAX]; /* Label Stack */
	u8				used_lses; /* One bit set for each Label Stack Entry in use */
};

static inline void dissector_set_mpls_lse(struct flow_dissector_key_mpls *mpls,
					  int lse_index)
{
	mpls->used_lses |= 1 << lse_index;
}

#define FLOW_DIS_TUN_OPTS_MAX 255
/**
 * struct flow_dissector_key_enc_opts:
 * @data: tunnel option data
 * @len: length of tunnel option data
 * @dst_opt_type: tunnel option type
 */
struct flow_dissector_key_enc_opts {
	u8 data[FLOW_DIS_TUN_OPTS_MAX];	/* Using IP_TUNNEL_OPTS_MAX is desired
					 * here but seems difficult to #include
					 */
	u8 len;
	u32 dst_opt_type;
};

struct flow_dissector_key_keyid {
	__be32	keyid;
};

/**
 * struct flow_dissector_key_ipv4_addrs:
 * @src: source ip address
 * @dst: destination ip address
 */
struct flow_dissector_key_ipv4_addrs {
	/* (src, dst) must be grouped, in the same way as in the IP header */
	__be32 src;
	__be32 dst;
};

/**
 * struct flow_dissector_key_ipv6_addrs:
 * @src: source ip address
 * @dst: destination ip address
 */
struct flow_dissector_key_ipv6_addrs {
	/* (src, dst) must be grouped, in the same way as in the IP header */
	struct in6_addr src;
	struct in6_addr dst;
};

/**
 * struct flow_dissector_key_tipc:
 * @key: source node address combined with selector
 */
struct flow_dissector_key_tipc {
	__be32 key;
};

/**
 * struct flow_dissector_key_addrs:
 * @v4addrs: IPv4 addresses
 * @v6addrs: IPv6 addresses
 * @tipckey: TIPC key
 */
struct flow_dissector_key_addrs {
	union {
		struct flow_dissector_key_ipv4_addrs v4addrs;
		struct flow_dissector_key_ipv6_addrs v6addrs;
		struct flow_dissector_key_tipc tipckey;
	};
};

/**
 * struct flow_dissector_key_arp:
 * @sip: Sender IP address
 * @tip: Target IP address
 * @op:  Operation
 * @sha: Sender hardware address
 * @tha: Target hardware address
 */
struct flow_dissector_key_arp {
	__u32 sip;
	__u32 tip;
	__u8 op;
	unsigned char sha[ETH_ALEN];
	unsigned char tha[ETH_ALEN];
};

/**
 * struct flow_dissector_key_ports:
 * @ports: port numbers of Transport header
 * @src: source port number
 * @dst: destination port number
 */
struct flow_dissector_key_ports {
	union {
		__be32 ports;
		struct {
			__be16 src;
			__be16 dst;
		};
	};
};

/**
 * struct flow_dissector_key_ports_range:
 * @tp: port number from packet
 * @tp_min: min port number in range
 * @tp_max: max port number in range
 */
struct flow_dissector_key_ports_range {
	union {
		struct flow_dissector_key_ports tp;
		struct {
			struct flow_dissector_key_ports tp_min;
			struct flow_dissector_key_ports tp_max;
		};
	};
};

/**
 * struct flow_dissector_key_icmp:
 * @type: ICMP type
 * @code: ICMP code
 * @id:   Session identifier
 */
struct flow_dissector_key_icmp {
	struct {
		u8 type;
		u8 code;
	};
	u16 id;
};

/**
 * struct flow_dissector_key_eth_addrs:
 * @src: source Ethernet address
 * @dst: destination Ethernet address
 */
struct flow_dissector_key_eth_addrs {
	/* (dst, src) must be grouped, in the same way as in the ETH header */
	unsigned char dst[ETH_ALEN];
	unsigned char src[ETH_ALEN];
};

/**
 * struct flow_dissector_key_tcp:
 * @flags: flags
 */
struct flow_dissector_key_tcp {
	__be16 flags;
};

/**
 * struct flow_dissector_key_ip:
 * @tos: tos
 * @ttl: ttl
 */
struct flow_dissector_key_ip {
	__u8	tos;
	__u8	ttl;
};

/**
 * struct flow_dissector_key_meta:
 * @ingress_ifindex: ingress ifindex
 * @ingress_iftype: ingress interface type
 * @l2_miss: packet did not match an L2 entry during forwarding
 */
struct flow_dissector_key_meta {
	int ingress_ifindex;
	u16 ingress_iftype;
	u8 l2_miss;
};

/**
 * struct flow_dissector_key_ct:
 * @ct_state: conntrack state after converting with map
 * @ct_mark: conntrack mark
 * @ct_zone: conntrack zone
 * @ct_labels: conntrack labels
 */
struct flow_dissector_key_ct {
	u16	ct_state;
	u16	ct_zone;
	u32	ct_mark;
	u32	ct_labels[4];
};

/**
 * struct flow_dissector_key_hash:
 * @hash: hash value
 */
struct flow_dissector_key_hash {
	u32 hash;
};

/**
 * struct flow_dissector_key_num_of_vlans:
 * @num_of_vlans: num_of_vlans value
 */
struct flow_dissector_key_num_of_vlans {
	u8 num_of_vlans;
};

/**
 * struct flow_dissector_key_pppoe:
 * @session_id: pppoe session id
 * @ppp_proto: ppp protocol
 * @type: pppoe eth type
 */
struct flow_dissector_key_pppoe {
	__be16 session_id;
	__be16 ppp_proto;
	__be16 type;
};

/**
 * struct flow_dissector_key_l2tpv3:
 * @session_id: identifier for a l2tp session
 */
struct flow_dissector_key_l2tpv3 {
	__be32 session_id;
};

/**
 * struct flow_dissector_key_ipsec:
 * @spi: identifier for a ipsec connection
 */
struct flow_dissector_key_ipsec {
	__be32 spi;
};

/**
 * struct flow_dissector_key_cfm:
 * @mdl_ver: maintenance domain level (mdl) and cfm protocol version
 * @opcode: code specifying a type of cfm protocol packet
 *
 * See 802.1ag, ITU-T G.8013/Y.1731
 *         1               2
 * |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 * |  mdl  | version |   opcode    |
 * +-----+---------+-+-+-+-+-+-+-+-+
 */
struct flow_dissector_key_cfm {
	u8	mdl_ver;
	u8	opcode;
};

#define FLOW_DIS_CFM_MDL_MASK		GENMASK(7, 5)
#define FLOW_DIS_CFM_MDL_MAX		7

enum flow_dissector_key_id {
	FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
	FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
	FLOW_DISSECTOR_KEY_IPV4_ADDRS, /* struct flow_dissector_key_ipv4_addrs */
	FLOW_DISSECTOR_KEY_IPV6_ADDRS, /* struct flow_dissector_key_ipv6_addrs */
	FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */
	FLOW_DISSECTOR_KEY_PORTS_RANGE, /* struct flow_dissector_key_ports */
	FLOW_DISSECTOR_KEY_ICMP, /* struct flow_dissector_key_icmp */
	FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
	FLOW_DISSECTOR_KEY_TIPC, /* struct flow_dissector_key_tipc */
	FLOW_DISSECTOR_KEY_ARP, /* struct flow_dissector_key_arp */
	FLOW_DISSECTOR_KEY_VLAN, /* struct flow_dissector_key_vlan */
	FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_tags */
	FLOW_DISSECTOR_KEY_GRE_KEYID, /* struct flow_dissector_key_keyid */
	FLOW_DISSECTOR_KEY_MPLS_ENTROPY, /* struct flow_dissector_key_keyid */
	FLOW_DISSECTOR_KEY_ENC_KEYID, /* struct flow_dissector_key_keyid */
	FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS, /* struct flow_dissector_key_ipv4_addrs */
	FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS, /* struct flow_dissector_key_ipv6_addrs */
	FLOW_DISSECTOR_KEY_ENC_CONTROL, /* struct flow_dissector_key_control */
	FLOW_DISSECTOR_KEY_ENC_PORTS, /* struct flow_dissector_key_ports */
	FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */
	FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */
	FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */
	FLOW_DISSECTOR_KEY_CVLAN, /* struct flow_dissector_key_vlan */
	FLOW_DISSECTOR_KEY_ENC_IP, /* struct flow_dissector_key_ip */
	FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */
	FLOW_DISSECTOR_KEY_META, /* struct flow_dissector_key_meta */
	FLOW_DISSECTOR_KEY_CT, /* struct flow_dissector_key_ct */
	FLOW_DISSECTOR_KEY_HASH, /* struct flow_dissector_key_hash */
	FLOW_DISSECTOR_KEY_NUM_OF_VLANS, /* struct flow_dissector_key_num_of_vlans */
	FLOW_DISSECTOR_KEY_PPPOE, /* struct flow_dissector_key_pppoe */
	FLOW_DISSECTOR_KEY_L2TPV3, /* struct flow_dissector_key_l2tpv3 */
	FLOW_DISSECTOR_KEY_CFM, /* struct flow_dissector_key_cfm */
	FLOW_DISSECTOR_KEY_IPSEC, /* struct flow_dissector_key_ipsec */

	FLOW_DISSECTOR_KEY_MAX,
};

#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		BIT(0)
#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	BIT(1)
#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		BIT(2)
#define FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP	BIT(3)

struct flow_dissector_key {
	enum flow_dissector_key_id key_id;
	size_t offset; /* offset of struct flow_dissector_key_* in the target struct */
};

struct flow_dissector {
	unsigned long long used_keys;
		/* each bit represents presence of one key id */
	unsigned short int offset[FLOW_DISSECTOR_KEY_MAX];
};

struct flow_keys_basic {
	struct flow_dissector_key_control control;
	struct flow_dissector_key_basic basic;
};

struct flow_keys {
	struct flow_dissector_key_control control;
#define FLOW_KEYS_HASH_START_FIELD basic
	struct flow_dissector_key_basic basic __aligned(SIPHASH_ALIGNMENT);
	struct flow_dissector_key_tags tags;
	struct flow_dissector_key_vlan vlan;
	struct flow_dissector_key_vlan cvlan;
	struct flow_dissector_key_keyid keyid;
	struct flow_dissector_key_ports ports;
	struct flow_dissector_key_icmp icmp;
	/* 'addrs' must be the last member */
	struct flow_dissector_key_addrs addrs;
};

#define FLOW_KEYS_HASH_OFFSET		\
	offsetof(struct flow_keys, FLOW_KEYS_HASH_START_FIELD)

__be32 flow_get_u32_src(const struct flow_keys *flow);
__be32 flow_get_u32_dst(const struct flow_keys *flow);

extern struct flow_dissector flow_keys_dissector;
extern struct flow_dissector flow_keys_basic_dissector;

/* struct flow_keys_digest:
 *
 * This structure is used to hold a digest of the full flow keys. This is a
 * larger "hash" of a flow to allow definitively matching specific flows where
 * the 32 bit skb->hash is not large enough. The size is limited to 16 bytes so
 * that it can be used in CB of skb (see sch_choke for an example).
 */
#define FLOW_KEYS_DIGEST_LEN	16
struct flow_keys_digest {
	u8	data[FLOW_KEYS_DIGEST_LEN];
};

void make_flow_keys_digest(struct flow_keys_digest *digest,
			   const struct flow_keys *flow);

static inline bool flow_keys_have_l4(const struct flow_keys *keys)
{
	return (keys->ports.ports || keys->tags.flow_label);
}

u32 flow_hash_from_keys(struct flow_keys *keys);
u32 flow_hash_from_keys_seed(struct flow_keys *keys,
			     const siphash_key_t *keyval);
void skb_flow_get_icmp_tci(const struct sk_buff *skb,
			   struct flow_dissector_key_icmp *key_icmp,
			   const void *data, int thoff, int hlen);

static inline bool dissector_uses_key(const struct flow_dissector *flow_dissector,
				      enum flow_dissector_key_id key_id)
{
	return flow_dissector->used_keys & (1ULL << key_id);
}

static inline void *skb_flow_dissector_target(struct flow_dissector *flow_dissector,
					      enum flow_dissector_key_id key_id,
					      void *target_container)
{
	return ((char *)target_container) + flow_dissector->offset[key_id];
}

struct bpf_flow_dissector {
	struct bpf_flow_keys	*flow_keys;
	const struct sk_buff	*skb;
	const void		*data;
	const void		*data_end;
};

static inline void
flow_dissector_init_keys(struct flow_dissector_key_control *key_control,
			 struct flow_dissector_key_basic *key_basic)
{
	memset(key_control, 0, sizeof(*key_control));
	memset(key_basic, 0, sizeof(*key_basic));
}

#ifdef CONFIG_BPF_SYSCALL
int flow_dissector_bpf_prog_attach_check(struct net *net,
					 struct bpf_prog *prog);
#endif /* CONFIG_BPF_SYSCALL */

#endif
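
An illustrative sketch (not part of flow_dissector.h): dissecting an skb into struct flow_keys and folding it into a 32-bit flow hash, as flow-steering code typically does. my_flow_hash() is a hypothetical name; skb_flow_dissect_flow_keys() is declared in <linux/skbuff.h>.

#include <linux/skbuff.h>
#include <net/flow_dissector.h>

static u32 my_flow_hash(const struct sk_buff *skb)
{
	struct flow_keys keys;

	/* fills control/basic/ports/addrs using flow_keys_dissector */
	if (!skb_flow_dissect_flow_keys(skb, &keys, 0))
		return 0;	/* dissection failed */

	/* addresses, ports and protocol folded into one 32-bit value */
	return flow_hash_from_keys(&keys);
}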
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * linux/mii.h: definitions for MII-compatible transceivers
 * Originally drivers/net/sunhme.h.
 *
 * Copyright (C) 1996, 1999, 2001 David S. Miller (davem@redhat.com)
 */
#ifndef __LINUX_MII_H__
#define __LINUX_MII_H__

#include <linux/if.h>
#include <linux/linkmode.h>
#include <uapi/linux/mii.h>

struct ethtool_cmd;

struct mii_if_info {
	int phy_id;
	int advertising;
	int phy_id_mask;
	int reg_num_mask;

	unsigned int full_duplex : 1;	/* is full duplex? */
	unsigned int force_media : 1;	/* is autoneg. disabled? */
	unsigned int supports_gmii : 1; /* are GMII registers supported? */
	struct net_device *dev;
	int (*mdio_read) (struct net_device *dev, int phy_id, int location);
	void (*mdio_write) (struct net_device *dev, int phy_id, int location,
			    int val);
};

extern int mii_link_ok (struct mii_if_info *mii);
extern int mii_nway_restart (struct mii_if_info *mii);
extern void mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd);
extern void mii_ethtool_get_link_ksettings(
	struct mii_if_info *mii, struct ethtool_link_ksettings *cmd);
extern int mii_ethtool_sset(struct mii_if_info *mii, struct ethtool_cmd *ecmd);
extern int mii_ethtool_set_link_ksettings(
	struct mii_if_info *mii, const struct ethtool_link_ksettings *cmd);
extern int mii_check_gmii_support(struct mii_if_info *mii);
extern void mii_check_link (struct mii_if_info *mii);
extern unsigned int mii_check_media (struct mii_if_info *mii,
				     unsigned int ok_to_print,
				     unsigned int init_media);
extern int generic_mii_ioctl(struct mii_if_info *mii_if,
			     struct mii_ioctl_data *mii_data, int cmd,
			     unsigned int *duplex_changed);

static inline struct mii_ioctl_data *if_mii(struct ifreq *rq)
{
	return (struct mii_ioctl_data *) &rq->ifr_ifru;
}

/**
 * mii_nway_result
 * @negotiated: value of MII ANAR and'd with ANLPAR
 *
 * Given a set of MII abilities, checks each bit and returns the
 * currently supported media, in the priority order defined by
 * IEEE 802.3u.  We use LPA_xxx constants but note this is not the
 * value of LPA solely, as described above.
 *
 * The one exception to IEEE 802.3u is that 100baseT4 is placed
 * between 100T-full and 100T-half.  If your phy does not support
 * 100T4 this is fine.  If your phy places 100T4 elsewhere in the
 * priority order, you will need to roll your own function.
 */
static inline unsigned int mii_nway_result (unsigned int negotiated)
{
	unsigned int ret;

	if (negotiated & LPA_100FULL)
		ret = LPA_100FULL;
	else if (negotiated & LPA_100BASE4)
		ret = LPA_100BASE4;
	else if (negotiated & LPA_100HALF)
		ret = LPA_100HALF;
	else if (negotiated & LPA_10FULL)
		ret = LPA_10FULL;
	else
		ret = LPA_10HALF;

	return ret;
}

/**
 * mii_duplex
 * @duplex_lock: Non-zero if duplex is locked at full
 * @negotiated: value of MII ANAR and'd with ANLPAR
 *
 * A small helper function for a common case.  Returns one
 * if the media is operating or locked at full duplex, and
 * returns zero otherwise.
 */
static inline unsigned int mii_duplex (unsigned int duplex_lock,
				       unsigned int negotiated)
{
	if (duplex_lock)
		return 1;
	if (mii_nway_result(negotiated) & LPA_DUPLEX)
		return 1;
	return 0;
}

/**
 * ethtool_adv_to_mii_adv_t
 * @ethadv: the ethtool advertisement settings
 *
 * A small helper function that translates ethtool advertisement
 * settings to phy autonegotiation advertisements for the
 * MII_ADVERTISE register.
 */
static inline u32 ethtool_adv_to_mii_adv_t(u32 ethadv)
{
	u32 result = 0;

	if (ethadv & ADVERTISED_10baseT_Half)
		result |= ADVERTISE_10HALF;
	if (ethadv & ADVERTISED_10baseT_Full)
		result |= ADVERTISE_10FULL;
	if (ethadv & ADVERTISED_100baseT_Half)
		result |= ADVERTISE_100HALF;
	if (ethadv & ADVERTISED_100baseT_Full)
		result |= ADVERTISE_100FULL;
	if (ethadv & ADVERTISED_Pause)
		result |= ADVERTISE_PAUSE_CAP;
	if (ethadv & ADVERTISED_Asym_Pause)
		result |= ADVERTISE_PAUSE_ASYM;

	return result;
}
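
An illustrative sketch (not part of mii.h): resolving the negotiated mode from ANAR and ANLPAR with mii_nway_result(), assuming register reads go through the mii_if_info callbacks. my_update_duplex() is a hypothetical helper.

static void my_update_duplex(struct mii_if_info *mii)
{
	unsigned int anar, anlpar, media;

	anar = mii->mdio_read(mii->dev, mii->phy_id, MII_ADVERTISE);
	anlpar = mii->mdio_read(mii->dev, mii->phy_id, MII_LPA);

	/* highest common denominator, per the 802.3u priority order */
	media = mii_nway_result(anar & anlpar);
	mii->full_duplex = (media & LPA_DUPLEX) ? 1 : 0;
}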
/**
 * linkmode_adv_to_mii_adv_t
 * @advertising: the linkmode advertisement settings
 *
 * A small helper function that translates linkmode advertisement
 * settings to phy autonegotiation advertisements for the
 * MII_ADVERTISE register.
 */
static inline u32 linkmode_adv_to_mii_adv_t(const unsigned long *advertising)
{
	u32 result = 0;

	if (linkmode_test_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT, advertising))
		result |= ADVERTISE_10HALF;
	if (linkmode_test_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT, advertising))
		result |= ADVERTISE_10FULL;
	if (linkmode_test_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT, advertising))
		result |= ADVERTISE_100HALF;
	if (linkmode_test_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT, advertising))
		result |= ADVERTISE_100FULL;
	if (linkmode_test_bit(ETHTOOL_LINK_MODE_Pause_BIT, advertising))
		result |= ADVERTISE_PAUSE_CAP;
	if (linkmode_test_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, advertising))
		result |= ADVERTISE_PAUSE_ASYM;

	return result;
}

/**
 * mii_adv_to_ethtool_adv_t
 * @adv: value of the MII_ADVERTISE register
 *
 * A small helper function that translates MII_ADVERTISE bits
 * to ethtool advertisement settings.
 */
static inline u32 mii_adv_to_ethtool_adv_t(u32 adv)
{
	u32 result = 0;

	if (adv & ADVERTISE_10HALF)
		result |= ADVERTISED_10baseT_Half;
	if (adv & ADVERTISE_10FULL)
		result |= ADVERTISED_10baseT_Full;
	if (adv & ADVERTISE_100HALF)
		result |= ADVERTISED_100baseT_Half;
	if (adv & ADVERTISE_100FULL)
		result |= ADVERTISED_100baseT_Full;
	if (adv & ADVERTISE_PAUSE_CAP)
		result |= ADVERTISED_Pause;
	if (adv & ADVERTISE_PAUSE_ASYM)
		result |= ADVERTISED_Asym_Pause;

	return result;
}

/**
 * ethtool_adv_to_mii_ctrl1000_t
 * @ethadv: the ethtool advertisement settings
 *
 * A small helper function that translates ethtool advertisement
 * settings to phy autonegotiation advertisements for the
 * MII_CTRL1000 register when in 1000T mode.
 */
static inline u32 ethtool_adv_to_mii_ctrl1000_t(u32 ethadv)
{
	u32 result = 0;

	if (ethadv & ADVERTISED_1000baseT_Half)
		result |= ADVERTISE_1000HALF;
	if (ethadv & ADVERTISED_1000baseT_Full)
		result |= ADVERTISE_1000FULL;

	return result;
}

/**
 * linkmode_adv_to_mii_ctrl1000_t
 * @advertising: the linkmode advertisement settings
 *
 * A small helper function that translates linkmode advertisement
 * settings to phy autonegotiation advertisements for the
 * MII_CTRL1000 register when in 1000T mode.
 */
static inline u32
linkmode_adv_to_mii_ctrl1000_t(const unsigned long *advertising)
{
	u32 result = 0;

	if (linkmode_test_bit(ETHTOOL_LINK_MODE_1000baseT_Half_BIT,
			      advertising))
		result |= ADVERTISE_1000HALF;
	if (linkmode_test_bit(ETHTOOL_LINK_MODE_1000baseT_Full_BIT,
			      advertising))
		result |= ADVERTISE_1000FULL;

	return result;
}

/**
 * mii_ctrl1000_to_ethtool_adv_t
 * @adv: value of the MII_CTRL1000 register
 *
 * A small helper function that translates MII_CTRL1000
 * bits, when in 1000Base-T mode, to ethtool
 * advertisement settings.
 */
static inline u32 mii_ctrl1000_to_ethtool_adv_t(u32 adv)
{
	u32 result = 0;

	if (adv & ADVERTISE_1000HALF)
		result |= ADVERTISED_1000baseT_Half;
	if (adv & ADVERTISE_1000FULL)
		result |= ADVERTISED_1000baseT_Full;

	return result;
}

/**
 * mii_lpa_to_ethtool_lpa_t
 * @adv: value of the MII_LPA register
 *
 * A small helper function that translates MII_LPA
 * bits, when in 1000Base-T mode, to ethtool
 * LP advertisement settings.
 */
static inline u32 mii_lpa_to_ethtool_lpa_t(u32 lpa)
{
	u32 result = 0;

	if (lpa & LPA_LPACK)
		result |= ADVERTISED_Autoneg;

	return result | mii_adv_to_ethtool_adv_t(lpa);
}

/**
 * mii_stat1000_to_ethtool_lpa_t
 * @adv: value of the MII_STAT1000 register
 *
 * A small helper function that translates MII_STAT1000
 * bits, when in 1000Base-T mode, to ethtool
 * advertisement settings.
 */
static inline u32 mii_stat1000_to_ethtool_lpa_t(u32 lpa)
{
	u32 result = 0;

	if (lpa & LPA_1000HALF)
		result |= ADVERTISED_1000baseT_Half;
	if (lpa & LPA_1000FULL)
		result |= ADVERTISED_1000baseT_Full;

	return result;
}

/**
 * mii_stat1000_mod_linkmode_lpa_t
 * @advertising: target linkmode advertisement settings
 * @adv: value of the MII_STAT1000 register
 *
 * A small helper function that translates MII_STAT1000 bits, when in
 * 1000Base-T mode, to linkmode advertisement settings. Other bits in
 * advertising are not changed.
 */
static inline void mii_stat1000_mod_linkmode_lpa_t(unsigned long *advertising,
						   u32 lpa)
{
	linkmode_mod_bit(ETHTOOL_LINK_MODE_1000baseT_Half_BIT,
			 advertising, lpa & LPA_1000HALF);

	linkmode_mod_bit(ETHTOOL_LINK_MODE_1000baseT_Full_BIT,
			 advertising, lpa & LPA_1000FULL);
}

/**
 * ethtool_adv_to_mii_adv_x
 * @ethadv: the ethtool advertisement settings
 *
 * A small helper function that translates ethtool advertisement
 * settings to phy autonegotiation advertisements for the
 * MII_CTRL1000 register when in 1000Base-X mode.
 */
static inline u32 ethtool_adv_to_mii_adv_x(u32 ethadv)
{
	u32 result = 0;

	if (ethadv & ADVERTISED_1000baseT_Half)
		result |= ADVERTISE_1000XHALF;
	if (ethadv & ADVERTISED_1000baseT_Full)
		result |= ADVERTISE_1000XFULL;
	if (ethadv & ADVERTISED_Pause)
		result |= ADVERTISE_1000XPAUSE;
	if (ethadv & ADVERTISED_Asym_Pause)
		result |= ADVERTISE_1000XPSE_ASYM;

	return result;
}

/**
 * mii_adv_to_ethtool_adv_x
 * @adv: value of the MII_CTRL1000 register
 *
 * A small helper function that translates MII_CTRL1000
 * bits, when in 1000Base-X mode, to ethtool
 * advertisement settings.
 */
static inline u32 mii_adv_to_ethtool_adv_x(u32 adv)
{
	u32 result = 0;

	if (adv & ADVERTISE_1000XHALF)
		result |= ADVERTISED_1000baseT_Half;
	if (adv & ADVERTISE_1000XFULL)
		result |= ADVERTISED_1000baseT_Full;
	if (adv & ADVERTISE_1000XPAUSE)
		result |= ADVERTISED_Pause;
	if (adv & ADVERTISE_1000XPSE_ASYM)
		result |= ADVERTISED_Asym_Pause;

	return result;
}

/**
 * mii_adv_mod_linkmode_adv_t
 * @advertising: pointer to destination link mode.
 * @adv: value of the MII_ADVERTISE register
 *
 * A small helper function that translates MII_ADVERTISE bits to
 * linkmode advertisement settings. Leaves other bits unchanged.
 */
static inline void mii_adv_mod_linkmode_adv_t(unsigned long *advertising,
					      u32 adv)
{
	linkmode_mod_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT,
			 advertising, adv & ADVERTISE_10HALF);

	linkmode_mod_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT,
			 advertising, adv & ADVERTISE_10FULL);

	linkmode_mod_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT,
			 advertising, adv & ADVERTISE_100HALF);

	linkmode_mod_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT,
			 advertising, adv & ADVERTISE_100FULL);

	linkmode_mod_bit(ETHTOOL_LINK_MODE_Pause_BIT, advertising,
			 adv & ADVERTISE_PAUSE_CAP);

	linkmode_mod_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, advertising,
			 adv & ADVERTISE_PAUSE_ASYM);
}

/**
 * mii_adv_to_linkmode_adv_t
 * @advertising: pointer to destination link mode.
 * @adv: value of the MII_ADVERTISE register
 *
 * A small helper function that translates MII_ADVERTISE bits
 * to linkmode advertisement settings. Clears the old value
 * of advertising.
 */
static inline void mii_adv_to_linkmode_adv_t(unsigned long *advertising,
					     u32 adv)
{
	linkmode_zero(advertising);
	mii_adv_mod_linkmode_adv_t(advertising, adv);
}

/**
 * mii_lpa_to_linkmode_lpa_t
 * @adv: value of the MII_LPA register
 *
 * A small helper function that translates MII_LPA bits, when in
 * 1000Base-T mode, to linkmode LP advertisement settings. Clears the
 * old value of advertising.
 */
Clears the * old value of advertising. */ static inline void mii_lpa_to_linkmode_lpa_t(unsigned long *lp_advertising, u32 lpa) { mii_adv_to_linkmode_adv_t(lp_advertising, lpa); if (lpa & LPA_LPACK) linkmode_set_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, lp_advertising); } /** * mii_lpa_mod_linkmode_lpa_t * @lp_advertising: pointer to destination link mode. * @lpa: value of the MII_LPA register * * A small helper function that translates MII_LPA bits to linkmode * LP advertisement settings. Leaves other bits unchanged. */ static inline void mii_lpa_mod_linkmode_lpa_t(unsigned long *lp_advertising, u32 lpa) { mii_adv_mod_linkmode_adv_t(lp_advertising, lpa); linkmode_mod_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, lp_advertising, lpa & LPA_LPACK); } /** * mii_ctrl1000_mod_linkmode_adv_t * @advertising: pointer to destination link mode. * @ctrl1000: value of the MII_CTRL1000 register * * A small helper function that translates MII_CTRL1000 bits to * linkmode advertisement settings. Leaves other bits unchanged. */ static inline void mii_ctrl1000_mod_linkmode_adv_t(unsigned long *advertising, u32 ctrl1000) { linkmode_mod_bit(ETHTOOL_LINK_MODE_1000baseT_Half_BIT, advertising, ctrl1000 & ADVERTISE_1000HALF); linkmode_mod_bit(ETHTOOL_LINK_MODE_1000baseT_Full_BIT, advertising, ctrl1000 & ADVERTISE_1000FULL); } /** * linkmode_adv_to_lcl_adv_t * @advertising: pointer to linkmode advertising * * A small helper function that translates linkmode advertising to the * local pause advertisement bits (ADVERTISE_PAUSE_CAP and * ADVERTISE_PAUSE_ASYM). */ static inline u32 linkmode_adv_to_lcl_adv_t(const unsigned long *advertising) { u32 lcl_adv = 0; if (linkmode_test_bit(ETHTOOL_LINK_MODE_Pause_BIT, advertising)) lcl_adv |= ADVERTISE_PAUSE_CAP; if (linkmode_test_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, advertising)) lcl_adv |= ADVERTISE_PAUSE_ASYM; return lcl_adv; } /** * mii_lpa_mod_linkmode_x - decode the link partner's config_reg to linkmodes * @linkmodes: link modes array * @lpa: config_reg word from link partner * @fd_bit: link mode for 1000XFULL bit */ static inline void mii_lpa_mod_linkmode_x(unsigned long *linkmodes, u16 lpa, int fd_bit) { linkmode_mod_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, linkmodes, lpa & LPA_LPACK); linkmode_mod_bit(ETHTOOL_LINK_MODE_Pause_BIT, linkmodes, lpa & LPA_1000XPAUSE); linkmode_mod_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, linkmodes, lpa & LPA_1000XPAUSE_ASYM); linkmode_mod_bit(fd_bit, linkmodes, lpa & LPA_1000XFULL); } /** * linkmode_adv_to_mii_adv_x - encode a linkmode to config_reg * @linkmodes: linkmode advertisement settings * @fd_bit: link mode bit to encode as ADVERTISE_1000XFULL */ static inline u16 linkmode_adv_to_mii_adv_x(const unsigned long *linkmodes, int fd_bit) { u16 adv = 0; if (linkmode_test_bit(fd_bit, linkmodes)) adv |= ADVERTISE_1000XFULL; if (linkmode_test_bit(ETHTOOL_LINK_MODE_Pause_BIT, linkmodes)) adv |= ADVERTISE_1000XPAUSE; if (linkmode_test_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, linkmodes)) adv |= ADVERTISE_1000XPSE_ASYM; return adv; } /** * mii_advertise_flowctrl - get flow control advertisement flags * @cap: Flow control capabilities (FLOW_CTRL_RX, FLOW_CTRL_TX or both) */ static inline u16 mii_advertise_flowctrl(int cap) { u16 adv = 0; if (cap & FLOW_CTRL_RX) adv = ADVERTISE_PAUSE_CAP | ADVERTISE_PAUSE_ASYM; if (cap & FLOW_CTRL_TX) adv ^= ADVERTISE_PAUSE_ASYM; return adv; } /** * mii_resolve_flowctrl_fdx * @lcladv: value of MII ADVERTISE register * @rmtadv: value of MII LPA register * * Resolve full duplex flow control as per IEEE 802.3-2005 table 28B-3 */ static inline u8 mii_resolve_flowctrl_fdx(u16 lcladv, u16 rmtadv) { u8 cap = 0; if (lcladv & rmtadv & ADVERTISE_PAUSE_CAP) { cap = FLOW_CTRL_TX | FLOW_CTRL_RX; } else if (lcladv & rmtadv & ADVERTISE_PAUSE_ASYM) { if (lcladv & ADVERTISE_PAUSE_CAP) cap = FLOW_CTRL_RX; else if (rmtadv & ADVERTISE_PAUSE_CAP) cap = FLOW_CTRL_TX; } return cap; } /** * mii_bmcr_encode_fixed - encode fixed speed/duplex settings to a BMCR value * 
@speed: a SPEED_* value * @duplex: a DUPLEX_* value * * Encode the speed and duplex to a BMCR value. 2500, 1000, 100 and 10 Mbps are * supported. 2500 Mbps is encoded as 1000 Mbps; other speeds are encoded as 10 * Mbps. Unknown duplex values are encoded as half-duplex. */ static inline u16 mii_bmcr_encode_fixed(int speed, int duplex) { u16 bmcr; switch (speed) { case SPEED_2500: case SPEED_1000: bmcr = BMCR_SPEED1000; break; case SPEED_100: bmcr = BMCR_SPEED100; break; case SPEED_10: default: bmcr = BMCR_SPEED10; break; } if (duplex == DUPLEX_FULL) bmcr |= BMCR_FULLDPLX; return bmcr; } #endif /* __LINUX_MII_H__ */
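/*
 * Illustrative sketch (not part of the original header): how a MAC or PHY
 * driver might combine the flow-control helpers above. foo_resolve_pause()
 * and its enable_tx/enable_rx outputs are hypothetical names;
 * mii_advertise_flowctrl() and mii_resolve_flowctrl_fdx() are the helpers
 * defined in this header.
 */
static inline void foo_resolve_pause(u16 lpa, bool *enable_tx, bool *enable_rx)
{
	/* Advertise symmetric pause: ADVERTISE_PAUSE_CAP set, ASYM cleared. */
	u16 lcladv = mii_advertise_flowctrl(FLOW_CTRL_RX | FLOW_CTRL_TX);
	/* Resolve against the partner's MII_LPA value (802.3 table 28B-3). */
	u8 cap = mii_resolve_flowctrl_fdx(lcladv, lpa);

	*enable_tx = cap & FLOW_CTRL_TX;
	*enable_rx = cap & FLOW_CTRL_RX;
}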
// SPDX-License-Identifier: GPL-2.0 /* * linux/fs/read_write.c * * Copyright (C) 1991, 1992 Linus Torvalds */ #include <linux/slab.h> #include <linux/stat.h> #include <linux/sched/xacct.h> #include <linux/fcntl.h> #include <linux/file.h> #include <linux/uio.h> #include <linux/fsnotify.h> #include <linux/security.h> #include <linux/export.h> #include <linux/syscalls.h> #include <linux/pagemap.h> #include <linux/splice.h> #include <linux/compat.h> #include <linux/mount.h> #include <linux/fs.h> #include "internal.h" #include <linux/uaccess.h> #include <asm/unistd.h> const struct file_operations generic_ro_fops = { .llseek = generic_file_llseek, .read_iter = generic_file_read_iter, .mmap = generic_file_readonly_mmap, .splice_read = filemap_splice_read, }; EXPORT_SYMBOL(generic_ro_fops); static inline bool unsigned_offsets(struct file *file) { return file->f_op->fop_flags & FOP_UNSIGNED_OFFSET; } /** * vfs_setpos_cookie - update the file offset for lseek and reset cookie * @file: file structure in question * @offset: file offset to seek to * @maxsize: maximum file size * @cookie: cookie to reset * * Update the file offset to the value specified by @offset if the given * offset is valid and it is not equal to the current file offset and * reset the specified cookie to indicate that a seek happened. * * Return the specified offset on success and -EINVAL on invalid offset. 
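 * * For example (derived from the code below): seeking to a negative offset fails with -EINVAL unless the file's fops set FOP_UNSIGNED_OFFSET (see unsigned_offsets() above), and any offset beyond @maxsize is likewise rejected with -EINVAL.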
*/ static loff_t vfs_setpos_cookie(struct file *file, loff_t offset, loff_t maxsize, u64 *cookie) { if (offset < 0 && !unsigned_offsets(file)) return -EINVAL; if (offset > maxsize) return -EINVAL; if (offset != file->f_pos) { file->f_pos = offset; if (cookie) *cookie = 0; } return offset; } /** * vfs_setpos - update the file offset for lseek * @file: file structure in question * @offset: file offset to seek to * @maxsize: maximum file size * * This is a low-level filesystem helper for updating the file offset to * the value specified by @offset if the given offset is valid and it is * not equal to the current file offset. * * Return the specified offset on success and -EINVAL on invalid offset. */ loff_t vfs_setpos(struct file *file, loff_t offset, loff_t maxsize) { return vfs_setpos_cookie(file, offset, maxsize, NULL); } EXPORT_SYMBOL(vfs_setpos); /** * must_set_pos - check whether f_pos has to be updated * @file: file to seek on * @offset: offset to use * @whence: type of seek operation * @eof: end of file * * Check whether f_pos needs to be updated and update @offset according * to @whence. * * Return: 0 if f_pos doesn't need to be updated, 1 if f_pos has to be * updated, and negative error code on failure. */ static int must_set_pos(struct file *file, loff_t *offset, int whence, loff_t eof) { switch (whence) { case SEEK_END: *offset += eof; break; case SEEK_CUR: /* * Here we special-case the lseek(fd, 0, SEEK_CUR) * position-querying operation. Avoid rewriting the "same" * f_pos value back to the file because a concurrent read(), * write() or lseek() might have altered it */ if (*offset == 0) { *offset = file->f_pos; return 0; } break; case SEEK_DATA: /* * In the generic case the entire file is data, so as long as * offset isn't at the end of the file then the offset is data. */ if ((unsigned long long)*offset >= eof) return -ENXIO; break; case SEEK_HOLE: /* * There is a virtual hole at the end of the file, so as long as * offset isn't i_size or larger, return i_size. */ if ((unsigned long long)*offset >= eof) return -ENXIO; *offset = eof; break; } return 1; } /** * generic_file_llseek_size - generic llseek implementation for regular files * @file: file structure to seek on * @offset: file offset to seek to * @whence: type of seek * @maxsize: max size of this file in file system * @eof: offset used for SEEK_END position * * This is a variant of generic_file_llseek that allows passing in a custom * maximum file size and a custom EOF position, for e.g. hashed directories * * Synchronization: * SEEK_SET and SEEK_END are unsynchronized (but atomic on 64bit platforms) * SEEK_CUR is synchronized against other SEEK_CURs, but not read/writes. * read/writes behave like SEEK_SET against seeks. */ loff_t generic_file_llseek_size(struct file *file, loff_t offset, int whence, loff_t maxsize, loff_t eof) { int ret; ret = must_set_pos(file, &offset, whence, eof); if (ret < 0) return ret; if (ret == 0) return offset; if (whence == SEEK_CUR) { /* * If the file requires locking via f_pos_lock we know * that mutual exclusion for SEEK_CUR on the same file * is guaranteed. If the file isn't locked, we take * f_lock to protect against f_pos races with other * SEEK_CURs. 
*/ if (file_seek_cur_needs_f_lock(file)) { guard(spinlock)(&file->f_lock); return vfs_setpos(file, file->f_pos + offset, maxsize); } return vfs_setpos(file, file->f_pos + offset, maxsize); } return vfs_setpos(file, offset, maxsize); } EXPORT_SYMBOL(generic_file_llseek_size); /** * generic_llseek_cookie - versioned llseek implementation * @file: file structure to seek on * @offset: file offset to seek to * @whence: type of seek * @cookie: cookie to update * * See generic_file_llseek for a general description and locking assumptions. * * In contrast to generic_file_llseek, this function also resets a * specified cookie to indicate a seek took place. */ loff_t generic_llseek_cookie(struct file *file, loff_t offset, int whence, u64 *cookie) { struct inode *inode = file->f_mapping->host; loff_t maxsize = inode->i_sb->s_maxbytes; loff_t eof = i_size_read(inode); int ret; if (WARN_ON_ONCE(!cookie)) return -EINVAL; /* * Require that this is only used for directories that guarantee * synchronization between readdir and seek so that an update to * @cookie is correctly synchronized with concurrent readdir. */ if (WARN_ON_ONCE(!(file->f_mode & FMODE_ATOMIC_POS))) return -EINVAL; ret = must_set_pos(file, &offset, whence, eof); if (ret < 0) return ret; if (ret == 0) return offset; /* No need to hold f_lock because we know that f_pos_lock is held. */ if (whence == SEEK_CUR) return vfs_setpos_cookie(file, file->f_pos + offset, maxsize, cookie); return vfs_setpos_cookie(file, offset, maxsize, cookie); } EXPORT_SYMBOL(generic_llseek_cookie); /** * generic_file_llseek - generic llseek implementation for regular files * @file: file structure to seek on * @offset: file offset to seek to * @whence: type of seek * * This is a generic implementation of ->llseek usable for all normal local * filesystems. It just updates the file offset to the value specified by * @offset and @whence. 
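 *
 * Illustrative sketch (not from the original file): a filesystem that needs
 * no special seek handling typically wires this helper straight into its
 * file_operations, as generic_ro_fops does near the top of this file:
 *
 *	const struct file_operations foo_fops = {
 *		.llseek		= generic_file_llseek,
 *		.read_iter	= generic_file_read_iter,
 *	};
 *
 * foo_fops is a hypothetical name; only the .llseek wiring matters here.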
*/ loff_t generic_file_llseek(struct file *file, loff_t offset, int whence) { struct inode *inode = file->f_mapping->host; return generic_file_llseek_size(file, offset, whence, inode->i_sb->s_maxbytes, i_size_read(inode)); } EXPORT_SYMBOL(generic_file_llseek); /** * fixed_size_llseek - llseek implementation for fixed-sized devices * @file: file structure to seek on * @offset: file offset to seek to * @whence: type of seek * @size: size of the file * */ loff_t fixed_size_llseek(struct file *file, loff_t offset, int whence, loff_t size) { switch (whence) { case SEEK_SET: case SEEK_CUR: case SEEK_END: return generic_file_llseek_size(file, offset, whence, size, size); default: return -EINVAL; } } EXPORT_SYMBOL(fixed_size_llseek); /** * no_seek_end_llseek - llseek implementation for fixed-sized devices * @file: file structure to seek on * @offset: file offset to seek to * @whence: type of seek * */ loff_t no_seek_end_llseek(struct file *file, loff_t offset, int whence) { switch (whence) { case SEEK_SET: case SEEK_CUR: return generic_file_llseek_size(file, offset, whence, OFFSET_MAX, 0); default: return -EINVAL; } } EXPORT_SYMBOL(no_seek_end_llseek); /** * no_seek_end_llseek_size - llseek implementation for fixed-sized devices * @file: file structure to seek on * @offset: file offset to seek to * @whence: type of seek * @size: maximal offset allowed * */ loff_t no_seek_end_llseek_size(struct file *file, loff_t offset, int whence, loff_t size) { switch (whence) { case SEEK_SET: case SEEK_CUR: return generic_file_llseek_size(file, offset, whence, size, 0); default: return -EINVAL; } } EXPORT_SYMBOL(no_seek_end_llseek_size); /** * noop_llseek - No Operation Performed llseek implementation * @file: file structure to seek on * @offset: file offset to seek to * @whence: type of seek * * This is an implementation of ->llseek usable for the rare special case when * userspace expects the seek to succeed but the (device) file is actually not * able to perform the seek. In this case you use noop_llseek() instead of * falling back to the default implementation of ->llseek. */ loff_t noop_llseek(struct file *file, loff_t offset, int whence) { return file->f_pos; } EXPORT_SYMBOL(noop_llseek); loff_t default_llseek(struct file *file, loff_t offset, int whence) { struct inode *inode = file_inode(file); loff_t retval; inode_lock(inode); switch (whence) { case SEEK_END: offset += i_size_read(inode); break; case SEEK_CUR: if (offset == 0) { retval = file->f_pos; goto out; } offset += file->f_pos; break; case SEEK_DATA: /* * In the generic case the entire file is data, so as * long as offset isn't at the end of the file then the * offset is data. */ if (offset >= inode->i_size) { retval = -ENXIO; goto out; } break; case SEEK_HOLE: /* * There is a virtual hole at the end of the file, so * as long as offset isn't i_size or larger, return * i_size. 
*/ if (offset >= inode->i_size) { retval = -ENXIO; goto out; } offset = inode->i_size; break; } retval = -EINVAL; if (offset >= 0 || unsigned_offsets(file)) { if (offset != file->f_pos) file->f_pos = offset; retval = offset; } out: inode_unlock(inode); return retval; } EXPORT_SYMBOL(default_llseek); loff_t vfs_llseek(struct file *file, loff_t offset, int whence) { if (!(file->f_mode & FMODE_LSEEK)) return -ESPIPE; return file->f_op->llseek(file, offset, whence); } EXPORT_SYMBOL(vfs_llseek); static off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence) { off_t retval; CLASS(fd_pos, f)(fd); if (fd_empty(f)) return -EBADF; retval = -EINVAL; if (whence <= SEEK_MAX) { loff_t res = vfs_llseek(fd_file(f), offset, whence); retval = res; if (res != (loff_t)retval) retval = -EOVERFLOW; /* LFS: should only happen on 32 bit platforms */ } return retval; } SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence) { return ksys_lseek(fd, offset, whence); } #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset, unsigned int, whence) { return ksys_lseek(fd, offset, whence); } #endif #if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT) || \ defined(__ARCH_WANT_SYS_LLSEEK) SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high, unsigned long, offset_low, loff_t __user *, result, unsigned int, whence) { int retval; CLASS(fd_pos, f)(fd); loff_t offset; if (fd_empty(f)) return -EBADF; if (whence > SEEK_MAX) return -EINVAL; offset = vfs_llseek(fd_file(f), ((loff_t) offset_high << 32) | offset_low, whence); retval = (int)offset; if (offset >= 0) { retval = -EFAULT; if (!copy_to_user(result, &offset, sizeof(offset))) retval = 0; } return retval; } #endif int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t count) { int mask = read_write == READ ? MAY_READ : MAY_WRITE; int ret; if (unlikely((ssize_t) count < 0)) return -EINVAL; if (ppos) { loff_t pos = *ppos; if (unlikely(pos < 0)) { if (!unsigned_offsets(file)) return -EINVAL; if (count >= -pos) /* both values are in 0..LLONG_MAX */ return -EOVERFLOW; } else if (unlikely((loff_t) (pos + count) < 0)) { if (!unsigned_offsets(file)) return -EINVAL; } } ret = security_file_permission(file, mask); if (ret) return ret; return fsnotify_file_area_perm(file, mask, ppos, count); } EXPORT_SYMBOL(rw_verify_area); static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos) { struct kiocb kiocb; struct iov_iter iter; ssize_t ret; init_sync_kiocb(&kiocb, filp); kiocb.ki_pos = (ppos ? *ppos : 0); iov_iter_ubuf(&iter, ITER_DEST, buf, len); ret = filp->f_op->read_iter(&kiocb, &iter); BUG_ON(ret == -EIOCBQUEUED); if (ppos) *ppos = kiocb.ki_pos; return ret; } static int warn_unsupported(struct file *file, const char *op) { pr_warn_ratelimited( "kernel %s not supported for file %pD4 (pid: %d comm: %.20s)\n", op, file, current->pid, current->comm); return -EINVAL; } ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos) { struct kvec iov = { .iov_base = buf, .iov_len = min_t(size_t, count, MAX_RW_COUNT), }; struct kiocb kiocb; struct iov_iter iter; ssize_t ret; if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ))) return -EINVAL; if (!(file->f_mode & FMODE_CAN_READ)) return -EINVAL; /* * Also fail if ->read_iter and ->read are both wired up as that * implies very convoluted semantics. 
*/ if (unlikely(!file->f_op->read_iter || file->f_op->read)) return warn_unsupported(file, "read"); init_sync_kiocb(&kiocb, file); kiocb.ki_pos = pos ? *pos : 0; iov_iter_kvec(&iter, ITER_DEST, &iov, 1, iov.iov_len); ret = file->f_op->read_iter(&kiocb, &iter); if (ret > 0) { if (pos) *pos = kiocb.ki_pos; fsnotify_access(file); add_rchar(current, ret); } inc_syscr(current); return ret; } ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos) { ssize_t ret; ret = rw_verify_area(READ, file, pos, count); if (ret) return ret; return __kernel_read(file, buf, count, pos); } EXPORT_SYMBOL(kernel_read); ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos) { ssize_t ret; if (!(file->f_mode & FMODE_READ)) return -EBADF; if (!(file->f_mode & FMODE_CAN_READ)) return -EINVAL; if (unlikely(!access_ok(buf, count))) return -EFAULT; ret = rw_verify_area(READ, file, pos, count); if (ret) return ret; if (count > MAX_RW_COUNT) count = MAX_RW_COUNT; if (file->f_op->read) ret = file->f_op->read(file, buf, count, pos); else if (file->f_op->read_iter) ret = new_sync_read(file, buf, count, pos); else ret = -EINVAL; if (ret > 0) { fsnotify_access(file); add_rchar(current, ret); } inc_syscr(current); return ret; } static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos) { struct kiocb kiocb; struct iov_iter iter; ssize_t ret; init_sync_kiocb(&kiocb, filp); kiocb.ki_pos = (ppos ? *ppos : 0); iov_iter_ubuf(&iter, ITER_SOURCE, (void __user *)buf, len); ret = filp->f_op->write_iter(&kiocb, &iter); BUG_ON(ret == -EIOCBQUEUED); if (ret > 0 && ppos) *ppos = kiocb.ki_pos; return ret; } /* caller is responsible for file_start_write/file_end_write */ ssize_t __kernel_write_iter(struct file *file, struct iov_iter *from, loff_t *pos) { struct kiocb kiocb; ssize_t ret; if (WARN_ON_ONCE(!(file->f_mode & FMODE_WRITE))) return -EBADF; if (!(file->f_mode & FMODE_CAN_WRITE)) return -EINVAL; /* * Also fail if ->write_iter and ->write are both wired up as that * implies very convoluted semantics. */ if (unlikely(!file->f_op->write_iter || file->f_op->write)) return warn_unsupported(file, "write"); init_sync_kiocb(&kiocb, file); kiocb.ki_pos = pos ? *pos : 0; ret = file->f_op->write_iter(&kiocb, from); if (ret > 0) { if (pos) *pos = kiocb.ki_pos; fsnotify_modify(file); add_wchar(current, ret); } inc_syscw(current); return ret; } /* caller is responsible for file_start_write/file_end_write */ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos) { struct kvec iov = { .iov_base = (void *)buf, .iov_len = min_t(size_t, count, MAX_RW_COUNT), }; struct iov_iter iter; iov_iter_kvec(&iter, ITER_SOURCE, &iov, 1, iov.iov_len); return __kernel_write_iter(file, &iter, pos); } /* * This "EXPORT_SYMBOL_GPL()" is more of a "EXPORT_SYMBOL_DONTUSE()", * but autofs is one of the few internal kernel users that actually * wants this _and_ can be built as a module. So we need to export * this symbol for autofs, even though it really isn't appropriate * for any other kernel modules. 
*/ EXPORT_SYMBOL_GPL(__kernel_write); ssize_t kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos) { ssize_t ret; ret = rw_verify_area(WRITE, file, pos, count); if (ret) return ret; file_start_write(file); ret = __kernel_write(file, buf, count, pos); file_end_write(file); return ret; } EXPORT_SYMBOL(kernel_write); ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos) { ssize_t ret; if (!(file->f_mode & FMODE_WRITE)) return -EBADF; if (!(file->f_mode & FMODE_CAN_WRITE)) return -EINVAL; if (unlikely(!access_ok(buf, count))) return -EFAULT; ret = rw_verify_area(WRITE, file, pos, count); if (ret) return ret; if (count > MAX_RW_COUNT) count = MAX_RW_COUNT; file_start_write(file); if (file->f_op->write) ret = file->f_op->write(file, buf, count, pos); else if (file->f_op->write_iter) ret = new_sync_write(file, buf, count, pos); else ret = -EINVAL; if (ret > 0) { fsnotify_modify(file); add_wchar(current, ret); } inc_syscw(current); file_end_write(file); return ret; } /* file_ppos returns &file->f_pos or NULL if file is stream */ static inline loff_t *file_ppos(struct file *file) { return file->f_mode & FMODE_STREAM ? NULL : &file->f_pos; } ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count) { CLASS(fd_pos, f)(fd); ssize_t ret = -EBADF; if (!fd_empty(f)) { loff_t pos, *ppos = file_ppos(fd_file(f)); if (ppos) { pos = *ppos; ppos = &pos; } ret = vfs_read(fd_file(f), buf, count, ppos); if (ret >= 0 && ppos) fd_file(f)->f_pos = pos; } return ret; } SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count) { return ksys_read(fd, buf, count); } ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count) { CLASS(fd_pos, f)(fd); ssize_t ret = -EBADF; if (!fd_empty(f)) { loff_t pos, *ppos = file_ppos(fd_file(f)); if (ppos) { pos = *ppos; ppos = &pos; } ret = vfs_write(fd_file(f), buf, count, ppos); if (ret >= 0 && ppos) fd_file(f)->f_pos = pos; } return ret; } SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf, size_t, count) { return ksys_write(fd, buf, count); } ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count, loff_t pos) { if (pos < 0) return -EINVAL; CLASS(fd, f)(fd); if (fd_empty(f)) return -EBADF; if (fd_file(f)->f_mode & FMODE_PREAD) return vfs_read(fd_file(f), buf, count, &pos); return -ESPIPE; } SYSCALL_DEFINE4(pread64, unsigned int, fd, char __user *, buf, size_t, count, loff_t, pos) { return ksys_pread64(fd, buf, count, pos); } #if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_PREAD64) COMPAT_SYSCALL_DEFINE5(pread64, unsigned int, fd, char __user *, buf, size_t, count, compat_arg_u64_dual(pos)) { return ksys_pread64(fd, buf, count, compat_arg_u64_glue(pos)); } #endif ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf, size_t count, loff_t pos) { if (pos < 0) return -EINVAL; CLASS(fd, f)(fd); if (fd_empty(f)) return -EBADF; if (fd_file(f)->f_mode & FMODE_PWRITE) return vfs_write(fd_file(f), buf, count, &pos); return -ESPIPE; } SYSCALL_DEFINE4(pwrite64, unsigned int, fd, const char __user *, buf, size_t, count, loff_t, pos) { return ksys_pwrite64(fd, buf, count, pos); } #if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_PWRITE64) COMPAT_SYSCALL_DEFINE5(pwrite64, unsigned int, fd, const char __user *, buf, size_t, count, compat_arg_u64_dual(pos)) { return ksys_pwrite64(fd, buf, count, compat_arg_u64_glue(pos)); } #endif static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter, loff_t *ppos, int type, rwf_t flags) { 
struct kiocb kiocb; ssize_t ret; init_sync_kiocb(&kiocb, filp); ret = kiocb_set_rw_flags(&kiocb, flags, type); if (ret) return ret; kiocb.ki_pos = (ppos ? *ppos : 0); if (type == READ) ret = filp->f_op->read_iter(&kiocb, iter); else ret = filp->f_op->write_iter(&kiocb, iter); BUG_ON(ret == -EIOCBQUEUED); if (ppos) *ppos = kiocb.ki_pos; return ret; } /* Do it by hand, with file-ops */ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter, loff_t *ppos, int type, rwf_t flags) { ssize_t ret = 0; if (flags & ~RWF_HIPRI) return -EOPNOTSUPP; while (iov_iter_count(iter)) { ssize_t nr; if (type == READ) { nr = filp->f_op->read(filp, iter_iov_addr(iter), iter_iov_len(iter), ppos); } else { nr = filp->f_op->write(filp, iter_iov_addr(iter), iter_iov_len(iter), ppos); } if (nr < 0) { if (!ret) ret = nr; break; } ret += nr; if (nr != iter_iov_len(iter)) break; iov_iter_advance(iter, nr); } return ret; } ssize_t vfs_iocb_iter_read(struct file *file, struct kiocb *iocb, struct iov_iter *iter) { size_t tot_len; ssize_t ret = 0; if (!file->f_op->read_iter) return -EINVAL; if (!(file->f_mode & FMODE_READ)) return -EBADF; if (!(file->f_mode & FMODE_CAN_READ)) return -EINVAL; tot_len = iov_iter_count(iter); if (!tot_len) goto out; ret = rw_verify_area(READ, file, &iocb->ki_pos, tot_len); if (ret < 0) return ret; ret = file->f_op->read_iter(iocb, iter); out: if (ret >= 0) fsnotify_access(file); return ret; } EXPORT_SYMBOL(vfs_iocb_iter_read); ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos, rwf_t flags) { size_t tot_len; ssize_t ret = 0; if (!file->f_op->read_iter) return -EINVAL; if (!(file->f_mode & FMODE_READ)) return -EBADF; if (!(file->f_mode & FMODE_CAN_READ)) return -EINVAL; tot_len = iov_iter_count(iter); if (!tot_len) goto out; ret = rw_verify_area(READ, file, ppos, tot_len); if (ret < 0) return ret; ret = do_iter_readv_writev(file, iter, ppos, READ, flags); out: if (ret >= 0) fsnotify_access(file); return ret; } EXPORT_SYMBOL(vfs_iter_read); /* * Caller is responsible for calling kiocb_end_write() on completion * if async iocb was queued. 
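 *
 * Illustrative sketch (not from the original file): when the iocb was
 * queued asynchronously, the kiocb's ->ki_complete callback is the place
 * to drop the write access, e.g.:
 *
 *	static void foo_write_done(struct kiocb *iocb, long ret)
 *	{
 *		kiocb_end_write(iocb);
 *		... complete the hypothetical foo request ...
 *	}
 *
 * foo_write_done() is a hypothetical handler name.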
*/ ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb, struct iov_iter *iter) { size_t tot_len; ssize_t ret = 0; if (!file->f_op->write_iter) return -EINVAL; if (!(file->f_mode & FMODE_WRITE)) return -EBADF; if (!(file->f_mode & FMODE_CAN_WRITE)) return -EINVAL; tot_len = iov_iter_count(iter); if (!tot_len) return 0; ret = rw_verify_area(WRITE, file, &iocb->ki_pos, tot_len); if (ret < 0) return ret; kiocb_start_write(iocb); ret = file->f_op->write_iter(iocb, iter); if (ret != -EIOCBQUEUED) kiocb_end_write(iocb); if (ret > 0) fsnotify_modify(file); return ret; } EXPORT_SYMBOL(vfs_iocb_iter_write); ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos, rwf_t flags) { size_t tot_len; ssize_t ret; if (!(file->f_mode & FMODE_WRITE)) return -EBADF; if (!(file->f_mode & FMODE_CAN_WRITE)) return -EINVAL; if (!file->f_op->write_iter) return -EINVAL; tot_len = iov_iter_count(iter); if (!tot_len) return 0; ret = rw_verify_area(WRITE, file, ppos, tot_len); if (ret < 0) return ret; file_start_write(file); ret = do_iter_readv_writev(file, iter, ppos, WRITE, flags); if (ret > 0) fsnotify_modify(file); file_end_write(file); return ret; } EXPORT_SYMBOL(vfs_iter_write); static ssize_t vfs_readv(struct file *file, const struct iovec __user *vec, unsigned long vlen, loff_t *pos, rwf_t flags) { struct iovec iovstack[UIO_FASTIOV]; struct iovec *iov = iovstack; struct iov_iter iter; size_t tot_len; ssize_t ret = 0; if (!(file->f_mode & FMODE_READ)) return -EBADF; if (!(file->f_mode & FMODE_CAN_READ)) return -EINVAL; ret = import_iovec(ITER_DEST, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter); if (ret < 0) return ret; tot_len = iov_iter_count(&iter); if (!tot_len) goto out; ret = rw_verify_area(READ, file, pos, tot_len); if (ret < 0) goto out; if (file->f_op->read_iter) ret = do_iter_readv_writev(file, &iter, pos, READ, flags); else ret = do_loop_readv_writev(file, &iter, pos, READ, flags); out: if (ret >= 0) fsnotify_access(file); kfree(iov); return ret; } static ssize_t vfs_writev(struct file *file, const struct iovec __user *vec, unsigned long vlen, loff_t *pos, rwf_t flags) { struct iovec iovstack[UIO_FASTIOV]; struct iovec *iov = iovstack; struct iov_iter iter; size_t tot_len; ssize_t ret = 0; if (!(file->f_mode & FMODE_WRITE)) return -EBADF; if (!(file->f_mode & FMODE_CAN_WRITE)) return -EINVAL; ret = import_iovec(ITER_SOURCE, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter); if (ret < 0) return ret; tot_len = iov_iter_count(&iter); if (!tot_len) goto out; ret = rw_verify_area(WRITE, file, pos, tot_len); if (ret < 0) goto out; file_start_write(file); if (file->f_op->write_iter) ret = do_iter_readv_writev(file, &iter, pos, WRITE, flags); else ret = do_loop_readv_writev(file, &iter, pos, WRITE, flags); if (ret > 0) fsnotify_modify(file); file_end_write(file); out: kfree(iov); return ret; } static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec, unsigned long vlen, rwf_t flags) { CLASS(fd_pos, f)(fd); ssize_t ret = -EBADF; if (!fd_empty(f)) { loff_t pos, *ppos = file_ppos(fd_file(f)); if (ppos) { pos = *ppos; ppos = &pos; } ret = vfs_readv(fd_file(f), vec, vlen, ppos, flags); if (ret >= 0 && ppos) fd_file(f)->f_pos = pos; } if (ret > 0) add_rchar(current, ret); inc_syscr(current); return ret; } static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec, unsigned long vlen, rwf_t flags) { CLASS(fd_pos, f)(fd); ssize_t ret = -EBADF; if (!fd_empty(f)) { loff_t pos, *ppos = file_ppos(fd_file(f)); if (ppos) { pos = *ppos; ppos = &pos; } ret = 
vfs_writev(fd_file(f), vec, vlen, ppos, flags); if (ret >= 0 && ppos) fd_file(f)->f_pos = pos; } if (ret > 0) add_wchar(current, ret); inc_syscw(current); return ret; } static inline loff_t pos_from_hilo(unsigned long high, unsigned long low) { #define HALF_LONG_BITS (BITS_PER_LONG / 2) return (((loff_t)high << HALF_LONG_BITS) << HALF_LONG_BITS) | low; } static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec, unsigned long vlen, loff_t pos, rwf_t flags) { ssize_t ret = -EBADF; if (pos < 0) return -EINVAL; CLASS(fd, f)(fd); if (!fd_empty(f)) { ret = -ESPIPE; if (fd_file(f)->f_mode & FMODE_PREAD) ret = vfs_readv(fd_file(f), vec, vlen, &pos, flags); } if (ret > 0) add_rchar(current, ret); inc_syscr(current); return ret; } static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec, unsigned long vlen, loff_t pos, rwf_t flags) { ssize_t ret = -EBADF; if (pos < 0) return -EINVAL; CLASS(fd, f)(fd); if (!fd_empty(f)) { ret = -ESPIPE; if (fd_file(f)->f_mode & FMODE_PWRITE) ret = vfs_writev(fd_file(f), vec, vlen, &pos, flags); } if (ret > 0) add_wchar(current, ret); inc_syscw(current); return ret; } SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen) { return do_readv(fd, vec, vlen, 0); } SYSCALL_DEFINE3(writev, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen) { return do_writev(fd, vec, vlen, 0); } SYSCALL_DEFINE5(preadv, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h) { loff_t pos = pos_from_hilo(pos_h, pos_l); return do_preadv(fd, vec, vlen, pos, 0); } SYSCALL_DEFINE6(preadv2, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h, rwf_t, flags) { loff_t pos = pos_from_hilo(pos_h, pos_l); if (pos == -1) return do_readv(fd, vec, vlen, flags); return do_preadv(fd, vec, vlen, pos, flags); } SYSCALL_DEFINE5(pwritev, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h) { loff_t pos = pos_from_hilo(pos_h, pos_l); return do_pwritev(fd, vec, vlen, pos, 0); } SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h, rwf_t, flags) { loff_t pos = pos_from_hilo(pos_h, pos_l); if (pos == -1) return do_writev(fd, vec, vlen, flags); return do_pwritev(fd, vec, vlen, pos, flags); } /* * Various compat syscalls. Note that they all pretend to take a native * iovec - import_iovec will properly treat those as compat_iovecs based on * in_compat_syscall(). 
*/ #ifdef CONFIG_COMPAT #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64 COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, loff_t, pos) { return do_preadv(fd, vec, vlen, pos, 0); } #endif COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd, const struct iovec __user *, vec, compat_ulong_t, vlen, u32, pos_low, u32, pos_high) { loff_t pos = ((loff_t)pos_high << 32) | pos_low; return do_preadv(fd, vec, vlen, pos, 0); } #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2 COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, loff_t, pos, rwf_t, flags) { if (pos == -1) return do_readv(fd, vec, vlen, flags); return do_preadv(fd, vec, vlen, pos, flags); } #endif COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd, const struct iovec __user *, vec, compat_ulong_t, vlen, u32, pos_low, u32, pos_high, rwf_t, flags) { loff_t pos = ((loff_t)pos_high << 32) | pos_low; if (pos == -1) return do_readv(fd, vec, vlen, flags); return do_preadv(fd, vec, vlen, pos, flags); } #ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64 COMPAT_SYSCALL_DEFINE4(pwritev64, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, loff_t, pos) { return do_pwritev(fd, vec, vlen, pos, 0); } #endif COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd, const struct iovec __user *,vec, compat_ulong_t, vlen, u32, pos_low, u32, pos_high) { loff_t pos = ((loff_t)pos_high << 32) | pos_low; return do_pwritev(fd, vec, vlen, pos, 0); } #ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2 COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, loff_t, pos, rwf_t, flags) { if (pos == -1) return do_writev(fd, vec, vlen, flags); return do_pwritev(fd, vec, vlen, pos, flags); } #endif COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd, const struct iovec __user *,vec, compat_ulong_t, vlen, u32, pos_low, u32, pos_high, rwf_t, flags) { loff_t pos = ((loff_t)pos_high << 32) | pos_low; if (pos == -1) return do_writev(fd, vec, vlen, flags); return do_pwritev(fd, vec, vlen, pos, flags); } #endif /* CONFIG_COMPAT */ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos, size_t count, loff_t max) { struct inode *in_inode, *out_inode; struct pipe_inode_info *opipe; loff_t pos; loff_t out_pos; ssize_t retval; int fl; /* * Get input file, and verify that it is ok.. */ CLASS(fd, in)(in_fd); if (fd_empty(in)) return -EBADF; if (!(fd_file(in)->f_mode & FMODE_READ)) return -EBADF; if (!ppos) { pos = fd_file(in)->f_pos; } else { pos = *ppos; if (!(fd_file(in)->f_mode & FMODE_PREAD)) return -ESPIPE; } retval = rw_verify_area(READ, fd_file(in), &pos, count); if (retval < 0) return retval; if (count > MAX_RW_COUNT) count = MAX_RW_COUNT; /* * Get output file, and verify that it is ok.. */ CLASS(fd, out)(out_fd); if (fd_empty(out)) return -EBADF; if (!(fd_file(out)->f_mode & FMODE_WRITE)) return -EBADF; in_inode = file_inode(fd_file(in)); out_inode = file_inode(fd_file(out)); out_pos = fd_file(out)->f_pos; if (!max) max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes); if (unlikely(pos + count > max)) { if (pos >= max) return -EOVERFLOW; count = max - pos; } fl = 0; #if 0 /* * We need to debate whether we can enable this or not. The * man page documents EAGAIN return for the output at least, * and the application is arguably buggy if it doesn't expect * EAGAIN on a non-blocking file descriptor. 
*/ if (fd_file(in)->f_flags & O_NONBLOCK) fl = SPLICE_F_NONBLOCK; #endif opipe = get_pipe_info(fd_file(out), true); if (!opipe) { retval = rw_verify_area(WRITE, fd_file(out), &out_pos, count); if (retval < 0) return retval; retval = do_splice_direct(fd_file(in), &pos, fd_file(out), &out_pos, count, fl); } else { if (fd_file(out)->f_flags & O_NONBLOCK) fl |= SPLICE_F_NONBLOCK; retval = splice_file_to_pipe(fd_file(in), opipe, &pos, count, fl); } if (retval > 0) { add_rchar(current, retval); add_wchar(current, retval); fsnotify_access(fd_file(in)); fsnotify_modify(fd_file(out)); fd_file(out)->f_pos = out_pos; if (ppos) *ppos = pos; else fd_file(in)->f_pos = pos; } inc_syscr(current); inc_syscw(current); if (pos > max) retval = -EOVERFLOW; return retval; } SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd, off_t __user *, offset, size_t, count) { loff_t pos; off_t off; ssize_t ret; if (offset) { if (unlikely(get_user(off, offset))) return -EFAULT; pos = off; ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS); if (unlikely(put_user(pos, offset))) return -EFAULT; return ret; } return do_sendfile(out_fd, in_fd, NULL, count, 0); } SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd, loff_t __user *, offset, size_t, count) { loff_t pos; ssize_t ret; if (offset) { if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t)))) return -EFAULT; ret = do_sendfile(out_fd, in_fd, &pos, count, 0); if (unlikely(put_user(pos, offset))) return -EFAULT; return ret; } return do_sendfile(out_fd, in_fd, NULL, count, 0); } #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd, compat_off_t __user *, offset, compat_size_t, count) { loff_t pos; off_t off; ssize_t ret; if (offset) { if (unlikely(get_user(off, offset))) return -EFAULT; pos = off; ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS); if (unlikely(put_user(pos, offset))) return -EFAULT; return ret; } return do_sendfile(out_fd, in_fd, NULL, count, 0); } COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd, compat_loff_t __user *, offset, compat_size_t, count) { loff_t pos; ssize_t ret; if (offset) { if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t)))) return -EFAULT; ret = do_sendfile(out_fd, in_fd, &pos, count, 0); if (unlikely(put_user(pos, offset))) return -EFAULT; return ret; } return do_sendfile(out_fd, in_fd, NULL, count, 0); } #endif /* * Performs necessary checks before doing a file copy * * Can adjust amount of bytes to copy via @req_count argument. * Returns appropriate error code that caller should return or * zero in case the copy should be allowed. */ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, size_t *req_count, unsigned int flags) { struct inode *inode_in = file_inode(file_in); struct inode *inode_out = file_inode(file_out); uint64_t count = *req_count; loff_t size_in; int ret; ret = generic_file_rw_checks(file_in, file_out); if (ret) return ret; /* * We allow some filesystems to handle cross sb copy, but passing * a file of the wrong filesystem type to filesystem driver can result * in an attempt to dereference the wrong type of ->private_data, so * avoid doing that until we really have a good reason. * * nfs and cifs define several different file_system_type structures * and several different sets of file_operations, but they all end up * using the same ->copy_file_range() function pointer. 
*/ if (flags & COPY_FILE_SPLICE) { /* cross sb splice is allowed */ } else if (file_out->f_op->copy_file_range) { if (file_in->f_op->copy_file_range != file_out->f_op->copy_file_range) return -EXDEV; } else if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb) { return -EXDEV; } /* Don't touch certain kinds of inodes */ if (IS_IMMUTABLE(inode_out)) return -EPERM; if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) return -ETXTBSY; /* Ensure offsets don't wrap. */ if (pos_in + count < pos_in || pos_out + count < pos_out) return -EOVERFLOW; /* Shorten the copy to EOF */ size_in = i_size_read(inode_in); if (pos_in >= size_in) count = 0; else count = min(count, size_in - (uint64_t)pos_in); ret = generic_write_check_limits(file_out, pos_out, &count); if (ret) return ret; /* Don't allow overlapped copying within the same file. */ if (inode_in == inode_out && pos_out + count > pos_in && pos_out < pos_in + count) return -EINVAL; *req_count = count; return 0; } /* * copy_file_range() differs from regular file read and write in that it * specifically allows returning partial success. When it does so is up to * the copy_file_range method. */ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len, unsigned int flags) { ssize_t ret; bool splice = flags & COPY_FILE_SPLICE; bool samesb = file_inode(file_in)->i_sb == file_inode(file_out)->i_sb; if (flags & ~COPY_FILE_SPLICE) return -EINVAL; ret = generic_copy_file_checks(file_in, pos_in, file_out, pos_out, &len, flags); if (unlikely(ret)) return ret; ret = rw_verify_area(READ, file_in, &pos_in, len); if (unlikely(ret)) return ret; ret = rw_verify_area(WRITE, file_out, &pos_out, len); if (unlikely(ret)) return ret; if (len == 0) return 0; file_start_write(file_out); /* * Cloning is supported by more file systems, so we implement copy on * same sb using clone, but for filesystems where both clone and copy * are supported (e.g. nfs, cifs), we only call the copy method. */ if (!splice && file_out->f_op->copy_file_range) { ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out, pos_out, len, flags); } else if (!splice && file_in->f_op->remap_file_range && samesb) { ret = file_in->f_op->remap_file_range(file_in, pos_in, file_out, pos_out, min_t(loff_t, MAX_RW_COUNT, len), REMAP_FILE_CAN_SHORTEN); /* fallback to splice */ if (ret <= 0) splice = true; } else if (samesb) { /* Fallback to splice for same sb copy for backward compat */ splice = true; } file_end_write(file_out); if (!splice) goto done; /* * We can get here for same sb copy of filesystems that do not implement * ->copy_file_range() in case the filesystem does not support clone, or in * case the filesystem supports clone but rejected the clone request (e.g. * because it was not block aligned). * * In both cases, fall back to kernel copy so we are able to maintain a * consistent story about which filesystems support copy_file_range() * and which filesystems do not, which will allow userspace tools to * make consistent decisions w.r.t. using copy_file_range(). * * We also get here if the caller (e.g. nfsd) requested COPY_FILE_SPLICE * for a server-side copy between any two sb. * * In any case, we call do_splice_direct() and not splice_file_range(), * without file_start_write() held, to avoid possible deadlocks related * to splicing from the input file, while file_start_write() is held on * the output file on a different sb. 
*/ ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out, min_t(size_t, len, MAX_RW_COUNT), 0); done: if (ret > 0) { fsnotify_access(file_in); add_rchar(current, ret); fsnotify_modify(file_out); add_wchar(current, ret); } inc_syscr(current); inc_syscw(current); return ret; } EXPORT_SYMBOL(vfs_copy_file_range); SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in, int, fd_out, loff_t __user *, off_out, size_t, len, unsigned int, flags) { loff_t pos_in; loff_t pos_out; ssize_t ret = -EBADF; CLASS(fd, f_in)(fd_in); if (fd_empty(f_in)) return -EBADF; CLASS(fd, f_out)(fd_out); if (fd_empty(f_out)) return -EBADF; if (off_in) { if (copy_from_user(&pos_in, off_in, sizeof(loff_t))) return -EFAULT; } else { pos_in = fd_file(f_in)->f_pos; } if (off_out) { if (copy_from_user(&pos_out, off_out, sizeof(loff_t))) return -EFAULT; } else { pos_out = fd_file(f_out)->f_pos; } if (flags != 0) return -EINVAL; ret = vfs_copy_file_range(fd_file(f_in), pos_in, fd_file(f_out), pos_out, len, flags); if (ret > 0) { pos_in += ret; pos_out += ret; if (off_in) { if (copy_to_user(off_in, &pos_in, sizeof(loff_t))) ret = -EFAULT; } else { fd_file(f_in)->f_pos = pos_in; } if (off_out) { if (copy_to_user(off_out, &pos_out, sizeof(loff_t))) ret = -EFAULT; } else { fd_file(f_out)->f_pos = pos_out; } } return ret; } /* * Don't operate on ranges the page cache doesn't support, and don't exceed the * LFS limits. If pos is under the limit it becomes a short access. If it * exceeds the limit we return -EFBIG. */ int generic_write_check_limits(struct file *file, loff_t pos, loff_t *count) { struct inode *inode = file->f_mapping->host; loff_t max_size = inode->i_sb->s_maxbytes; loff_t limit = rlimit(RLIMIT_FSIZE); if (limit != RLIM_INFINITY) { if (pos >= limit) { send_sig(SIGXFSZ, current, 0); return -EFBIG; } *count = min(*count, limit - pos); } if (!(file->f_flags & O_LARGEFILE)) max_size = MAX_NON_LFS; if (unlikely(pos >= max_size)) return -EFBIG; *count = min(*count, max_size - pos); return 0; } EXPORT_SYMBOL_GPL(generic_write_check_limits); /* Like generic_write_checks(), but takes size of write instead of iter. */ int generic_write_checks_count(struct kiocb *iocb, loff_t *count) { struct file *file = iocb->ki_filp; struct inode *inode = file->f_mapping->host; if (IS_SWAPFILE(inode)) return -ETXTBSY; if (!*count) return 0; if (iocb->ki_flags & IOCB_APPEND) iocb->ki_pos = i_size_read(inode); if ((iocb->ki_flags & IOCB_NOWAIT) && !((iocb->ki_flags & IOCB_DIRECT) || (file->f_op->fop_flags & FOP_BUFFER_WASYNC))) return -EINVAL; return generic_write_check_limits(iocb->ki_filp, iocb->ki_pos, count); } EXPORT_SYMBOL(generic_write_checks_count); /* * Performs necessary checks before doing a write * * Can adjust writing position or amount of bytes to write. * Returns appropriate error code that caller should return or * zero in case that write should be allowed. */ ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from) { loff_t count = iov_iter_count(from); int ret; ret = generic_write_checks_count(iocb, &count); if (ret) return ret; iov_iter_truncate(from, count); return iov_iter_count(from); } EXPORT_SYMBOL(generic_write_checks); /* * Performs common checks before doing a file copy/clone * from @file_in to @file_out. */ int generic_file_rw_checks(struct file *file_in, struct file *file_out) { struct inode *inode_in = file_inode(file_in); struct inode *inode_out = file_inode(file_out); /* Don't copy dirs, pipes, sockets... 
*/ if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) return -EISDIR; if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) return -EINVAL; if (!(file_in->f_mode & FMODE_READ) || !(file_out->f_mode & FMODE_WRITE) || (file_out->f_flags & O_APPEND)) return -EBADF; return 0; } int generic_atomic_write_valid(struct kiocb *iocb, struct iov_iter *iter) { size_t len = iov_iter_count(iter); if (!iter_is_ubuf(iter)) return -EINVAL; if (!is_power_of_2(len)) return -EINVAL; if (!IS_ALIGNED(iocb->ki_pos, len)) return -EINVAL; if (!(iocb->ki_flags & IOCB_DIRECT)) return -EOPNOTSUPP; return 0; } EXPORT_SYMBOL_GPL(generic_atomic_write_valid);
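/*
 * Illustrative sketch (not part of read_write.c): the common in-kernel
 * pattern for reading a file with kernel_read(). Passing an explicit
 * position leaves f_pos untouched. foo_load() is a hypothetical helper;
 * filp_open(), kernel_read() and filp_close() are the real APIs used
 * (all declared in <linux/fs.h>).
 */
static ssize_t foo_load(const char *path, void *buf, size_t len)
{
	struct file *filp;
	loff_t pos = 0;
	ssize_t ret;

	filp = filp_open(path, O_RDONLY, 0);
	if (IS_ERR(filp))
		return PTR_ERR(filp);

	ret = kernel_read(filp, buf, len, &pos); /* advances pos, not f_pos */
	filp_close(filp, NULL);
	return ret;
}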
// SPDX-License-Identifier: GPL-2.0 /* * security/tomoyo/network.c * * Copyright (C) 2005-2011 NTT DATA CORPORATION */ #include "common.h" #include <linux/slab.h> /* Structure for 
holding inet domain socket's address. */ struct tomoyo_inet_addr_info { __be16 port; /* In network byte order. */ const __be32 *address; /* In network byte order. */ bool is_ipv6; }; /* Structure for holding unix domain socket's address. */ struct tomoyo_unix_addr_info { u8 *addr; /* This may not be '\0' terminated string. */ unsigned int addr_len; }; /* Structure for holding socket address. */ struct tomoyo_addr_info { u8 protocol; u8 operation; struct tomoyo_inet_addr_info inet; struct tomoyo_unix_addr_info unix0; }; /* String table for socket's protocols. */ const char * const tomoyo_proto_keyword[TOMOYO_SOCK_MAX] = { [SOCK_STREAM] = "stream", [SOCK_DGRAM] = "dgram", [SOCK_RAW] = "raw", [SOCK_SEQPACKET] = "seqpacket", [0] = " ", /* Dummy for avoiding NULL pointer dereference. */ [4] = " ", /* Dummy for avoiding NULL pointer dereference. */ }; /** * tomoyo_parse_ipaddr_union - Parse an IP address. * * @param: Pointer to "struct tomoyo_acl_param". * @ptr: Pointer to "struct tomoyo_ipaddr_union". * * Returns true on success, false otherwise. */ bool tomoyo_parse_ipaddr_union(struct tomoyo_acl_param *param, struct tomoyo_ipaddr_union *ptr) { u8 * const min = ptr->ip[0].in6_u.u6_addr8; u8 * const max = ptr->ip[1].in6_u.u6_addr8; char *address = tomoyo_read_token(param); const char *end; if (!strchr(address, ':') && in4_pton(address, -1, min, '-', &end) > 0) { ptr->is_ipv6 = false; if (!*end) ptr->ip[1].s6_addr32[0] = ptr->ip[0].s6_addr32[0]; else if (*end++ != '-' || in4_pton(end, -1, max, '\0', &end) <= 0 || *end) return false; return true; } if (in6_pton(address, -1, min, '-', &end) > 0) { ptr->is_ipv6 = true; if (!*end) memmove(max, min, sizeof(u16) * 8); else if (*end++ != '-' || in6_pton(end, -1, max, '\0', &end) <= 0 || *end) return false; return true; } return false; } /** * tomoyo_print_ipv4 - Print an IPv4 address. * * @buffer: Buffer to write to. * @buffer_len: Size of @buffer. * @min_ip: Pointer to __be32. * @max_ip: Pointer to __be32. * * Returns nothing. */ static void tomoyo_print_ipv4(char *buffer, const unsigned int buffer_len, const __be32 *min_ip, const __be32 *max_ip) { snprintf(buffer, buffer_len, "%pI4%c%pI4", min_ip, *min_ip == *max_ip ? '\0' : '-', max_ip); } /** * tomoyo_print_ipv6 - Print an IPv6 address. * * @buffer: Buffer to write to. * @buffer_len: Size of @buffer. * @min_ip: Pointer to "struct in6_addr". * @max_ip: Pointer to "struct in6_addr". * * Returns nothing. */ static void tomoyo_print_ipv6(char *buffer, const unsigned int buffer_len, const struct in6_addr *min_ip, const struct in6_addr *max_ip) { snprintf(buffer, buffer_len, "%pI6c%c%pI6c", min_ip, !memcmp(min_ip, max_ip, 16) ? '\0' : '-', max_ip); } /** * tomoyo_print_ip - Print an IP address. * * @buf: Buffer to write to. * @size: Size of @buf. * @ptr: Pointer to "struct ipaddr_union". * * Returns nothing. */ void tomoyo_print_ip(char *buf, const unsigned int size, const struct tomoyo_ipaddr_union *ptr) { if (ptr->is_ipv6) tomoyo_print_ipv6(buf, size, &ptr->ip[0], &ptr->ip[1]); else tomoyo_print_ipv4(buf, size, &ptr->ip[0].s6_addr32[0], &ptr->ip[1].s6_addr32[0]); } /* * Mapping table from "enum tomoyo_network_acl_index" to * "enum tomoyo_mac_index" for inet domain socket. 
*/ static const u8 tomoyo_inet2mac [TOMOYO_SOCK_MAX][TOMOYO_MAX_NETWORK_OPERATION] = { [SOCK_STREAM] = { [TOMOYO_NETWORK_BIND] = TOMOYO_MAC_NETWORK_INET_STREAM_BIND, [TOMOYO_NETWORK_LISTEN] = TOMOYO_MAC_NETWORK_INET_STREAM_LISTEN, [TOMOYO_NETWORK_CONNECT] = TOMOYO_MAC_NETWORK_INET_STREAM_CONNECT, }, [SOCK_DGRAM] = { [TOMOYO_NETWORK_BIND] = TOMOYO_MAC_NETWORK_INET_DGRAM_BIND, [TOMOYO_NETWORK_SEND] = TOMOYO_MAC_NETWORK_INET_DGRAM_SEND, }, [SOCK_RAW] = { [TOMOYO_NETWORK_BIND] = TOMOYO_MAC_NETWORK_INET_RAW_BIND, [TOMOYO_NETWORK_SEND] = TOMOYO_MAC_NETWORK_INET_RAW_SEND, }, }; /* * Mapping table from "enum tomoyo_network_acl_index" to * "enum tomoyo_mac_index" for unix domain socket. */ static const u8 tomoyo_unix2mac [TOMOYO_SOCK_MAX][TOMOYO_MAX_NETWORK_OPERATION] = { [SOCK_STREAM] = { [TOMOYO_NETWORK_BIND] = TOMOYO_MAC_NETWORK_UNIX_STREAM_BIND, [TOMOYO_NETWORK_LISTEN] = TOMOYO_MAC_NETWORK_UNIX_STREAM_LISTEN, [TOMOYO_NETWORK_CONNECT] = TOMOYO_MAC_NETWORK_UNIX_STREAM_CONNECT, }, [SOCK_DGRAM] = { [TOMOYO_NETWORK_BIND] = TOMOYO_MAC_NETWORK_UNIX_DGRAM_BIND, [TOMOYO_NETWORK_SEND] = TOMOYO_MAC_NETWORK_UNIX_DGRAM_SEND, }, [SOCK_SEQPACKET] = { [TOMOYO_NETWORK_BIND] = TOMOYO_MAC_NETWORK_UNIX_SEQPACKET_BIND, [TOMOYO_NETWORK_LISTEN] = TOMOYO_MAC_NETWORK_UNIX_SEQPACKET_LISTEN, [TOMOYO_NETWORK_CONNECT] = TOMOYO_MAC_NETWORK_UNIX_SEQPACKET_CONNECT, }, }; /** * tomoyo_same_inet_acl - Check for duplicated "struct tomoyo_inet_acl" entry. * * @a: Pointer to "struct tomoyo_acl_info". * @b: Pointer to "struct tomoyo_acl_info". * * Returns true if @a == @b except permission bits, false otherwise. */ static bool tomoyo_same_inet_acl(const struct tomoyo_acl_info *a, const struct tomoyo_acl_info *b) { const struct tomoyo_inet_acl *p1 = container_of(a, typeof(*p1), head); const struct tomoyo_inet_acl *p2 = container_of(b, typeof(*p2), head); return p1->protocol == p2->protocol && tomoyo_same_ipaddr_union(&p1->address, &p2->address) && tomoyo_same_number_union(&p1->port, &p2->port); } /** * tomoyo_same_unix_acl - Check for duplicated "struct tomoyo_unix_acl" entry. * * @a: Pointer to "struct tomoyo_acl_info". * @b: Pointer to "struct tomoyo_acl_info". * * Returns true if @a == @b except permission bits, false otherwise. */ static bool tomoyo_same_unix_acl(const struct tomoyo_acl_info *a, const struct tomoyo_acl_info *b) { const struct tomoyo_unix_acl *p1 = container_of(a, typeof(*p1), head); const struct tomoyo_unix_acl *p2 = container_of(b, typeof(*p2), head); return p1->protocol == p2->protocol && tomoyo_same_name_union(&p1->name, &p2->name); } /** * tomoyo_merge_inet_acl - Merge duplicated "struct tomoyo_inet_acl" entry. * * @a: Pointer to "struct tomoyo_acl_info". * @b: Pointer to "struct tomoyo_acl_info". * @is_delete: True for @a &= ~@b, false for @a |= @b. * * Returns true if @a is empty, false otherwise. */ static bool tomoyo_merge_inet_acl(struct tomoyo_acl_info *a, struct tomoyo_acl_info *b, const bool is_delete) { u8 * const a_perm = &container_of(a, struct tomoyo_inet_acl, head)->perm; u8 perm = READ_ONCE(*a_perm); const u8 b_perm = container_of(b, struct tomoyo_inet_acl, head)->perm; if (is_delete) perm &= ~b_perm; else perm |= b_perm; WRITE_ONCE(*a_perm, perm); return !perm; } /** * tomoyo_merge_unix_acl - Merge duplicated "struct tomoyo_unix_acl" entry. * * @a: Pointer to "struct tomoyo_acl_info". * @b: Pointer to "struct tomoyo_acl_info". * @is_delete: True for @a &= ~@b, false for @a |= @b. * * Returns true if @a is empty, false otherwise. 
*/ static bool tomoyo_merge_unix_acl(struct tomoyo_acl_info *a, struct tomoyo_acl_info *b, const bool is_delete) { u8 * const a_perm = &container_of(a, struct tomoyo_unix_acl, head)->perm; u8 perm = READ_ONCE(*a_perm); const u8 b_perm = container_of(b, struct tomoyo_unix_acl, head)->perm; if (is_delete) perm &= ~b_perm; else perm |= b_perm; WRITE_ONCE(*a_perm, perm); return !perm; } /** * tomoyo_write_inet_network - Write "struct tomoyo_inet_acl" list. * * @param: Pointer to "struct tomoyo_acl_param". * * Returns 0 on success, negative value otherwise. * * Caller holds tomoyo_read_lock(). */ int tomoyo_write_inet_network(struct tomoyo_acl_param *param) { struct tomoyo_inet_acl e = { .head.type = TOMOYO_TYPE_INET_ACL }; int error = -EINVAL; u8 type; const char *protocol = tomoyo_read_token(param); const char *operation = tomoyo_read_token(param); for (e.protocol = 0; e.protocol < TOMOYO_SOCK_MAX; e.protocol++) if (!strcmp(protocol, tomoyo_proto_keyword[e.protocol])) break; for (type = 0; type < TOMOYO_MAX_NETWORK_OPERATION; type++) if (tomoyo_permstr(operation, tomoyo_socket_keyword[type])) e.perm |= 1 << type; if (e.protocol == TOMOYO_SOCK_MAX || !e.perm) return -EINVAL; if (param->data[0] == '@') { param->data++; e.address.group = tomoyo_get_group(param, TOMOYO_ADDRESS_GROUP); if (!e.address.group) return -ENOMEM; } else { if (!tomoyo_parse_ipaddr_union(param, &e.address)) goto out; } if (!tomoyo_parse_number_union(param, &e.port) || e.port.values[1] > 65535) goto out; error = tomoyo_update_domain(&e.head, sizeof(e), param, tomoyo_same_inet_acl, tomoyo_merge_inet_acl); out: tomoyo_put_group(e.address.group); tomoyo_put_number_union(&e.port); return error; } /** * tomoyo_write_unix_network - Write "struct tomoyo_unix_acl" list. * * @param: Pointer to "struct tomoyo_acl_param". * * Returns 0 on success, negative value otherwise. */ int tomoyo_write_unix_network(struct tomoyo_acl_param *param) { struct tomoyo_unix_acl e = { .head.type = TOMOYO_TYPE_UNIX_ACL }; int error; u8 type; const char *protocol = tomoyo_read_token(param); const char *operation = tomoyo_read_token(param); for (e.protocol = 0; e.protocol < TOMOYO_SOCK_MAX; e.protocol++) if (!strcmp(protocol, tomoyo_proto_keyword[e.protocol])) break; for (type = 0; type < TOMOYO_MAX_NETWORK_OPERATION; type++) if (tomoyo_permstr(operation, tomoyo_socket_keyword[type])) e.perm |= 1 << type; if (e.protocol == TOMOYO_SOCK_MAX || !e.perm) return -EINVAL; if (!tomoyo_parse_name_union(param, &e.name)) return -EINVAL; error = tomoyo_update_domain(&e.head, sizeof(e), param, tomoyo_same_unix_acl, tomoyo_merge_unix_acl); tomoyo_put_name_union(&e.name); return error; } /** * tomoyo_audit_net_log - Audit network log. * * @r: Pointer to "struct tomoyo_request_info". * @family: Name of socket family ("inet" or "unix"). * @protocol: Name of protocol in @family. * @operation: Name of socket operation. * @address: Name of address. * * Returns 0 on success, negative value otherwise. */ static int tomoyo_audit_net_log(struct tomoyo_request_info *r, const char *family, const u8 protocol, const u8 operation, const char *address) { return tomoyo_supervisor(r, "network %s %s %s %s\n", family, tomoyo_proto_keyword[protocol], tomoyo_socket_keyword[operation], address); } /** * tomoyo_audit_inet_log - Audit INET network log. * * @r: Pointer to "struct tomoyo_request_info". * * Returns 0 on success, negative value otherwise. 
*/ static int tomoyo_audit_inet_log(struct tomoyo_request_info *r) { char buf[128]; int len; const __be32 *address = r->param.inet_network.address; if (r->param.inet_network.is_ipv6) tomoyo_print_ipv6(buf, sizeof(buf), (const struct in6_addr *) address, (const struct in6_addr *) address); else tomoyo_print_ipv4(buf, sizeof(buf), address, address); len = strlen(buf); snprintf(buf + len, sizeof(buf) - len, " %u", r->param.inet_network.port); return tomoyo_audit_net_log(r, "inet", r->param.inet_network.protocol, r->param.inet_network.operation, buf); } /** * tomoyo_audit_unix_log - Audit UNIX network log. * * @r: Pointer to "struct tomoyo_request_info". * * Returns 0 on success, negative value otherwise. */ static int tomoyo_audit_unix_log(struct tomoyo_request_info *r) { return tomoyo_audit_net_log(r, "unix", r->param.unix_network.protocol, r->param.unix_network.operation, r->param.unix_network.address->name); } /** * tomoyo_check_inet_acl - Check permission for inet domain socket operation. * * @r: Pointer to "struct tomoyo_request_info". * @ptr: Pointer to "struct tomoyo_acl_info". * * Returns true if granted, false otherwise. */ static bool tomoyo_check_inet_acl(struct tomoyo_request_info *r, const struct tomoyo_acl_info *ptr) { const struct tomoyo_inet_acl *acl = container_of(ptr, typeof(*acl), head); const u8 size = r->param.inet_network.is_ipv6 ? 16 : 4; if (!(acl->perm & (1 << r->param.inet_network.operation)) || !tomoyo_compare_number_union(r->param.inet_network.port, &acl->port)) return false; if (acl->address.group) return tomoyo_address_matches_group (r->param.inet_network.is_ipv6, r->param.inet_network.address, acl->address.group); return acl->address.is_ipv6 == r->param.inet_network.is_ipv6 && memcmp(&acl->address.ip[0], r->param.inet_network.address, size) <= 0 && memcmp(r->param.inet_network.address, &acl->address.ip[1], size) <= 0; } /** * tomoyo_check_unix_acl - Check permission for unix domain socket operation. * * @r: Pointer to "struct tomoyo_request_info". * @ptr: Pointer to "struct tomoyo_acl_info". * * Returns true if granted, false otherwise. */ static bool tomoyo_check_unix_acl(struct tomoyo_request_info *r, const struct tomoyo_acl_info *ptr) { const struct tomoyo_unix_acl *acl = container_of(ptr, typeof(*acl), head); return (acl->perm & (1 << r->param.unix_network.operation)) && tomoyo_compare_name_union(r->param.unix_network.address, &acl->name); } /** * tomoyo_inet_entry - Check permission for INET network operation. * * @address: Pointer to "struct tomoyo_addr_info". * * Returns 0 on success, negative value otherwise. */ static int tomoyo_inet_entry(const struct tomoyo_addr_info *address) { const int idx = tomoyo_read_lock(); struct tomoyo_request_info r; int error = 0; const u8 type = tomoyo_inet2mac[address->protocol][address->operation]; if (type && tomoyo_init_request_info(&r, NULL, type) != TOMOYO_CONFIG_DISABLED) { r.param_type = TOMOYO_TYPE_INET_ACL; r.param.inet_network.protocol = address->protocol; r.param.inet_network.operation = address->operation; r.param.inet_network.is_ipv6 = address->inet.is_ipv6; r.param.inet_network.address = address->inet.address; r.param.inet_network.port = ntohs(address->inet.port); do { tomoyo_check_acl(&r, tomoyo_check_inet_acl); error = tomoyo_audit_inet_log(&r); } while (error == TOMOYO_RETRY_REQUEST); } tomoyo_read_unlock(idx); return error; } /** * tomoyo_check_inet_address - Check permission for inet domain socket's operation. * * @addr: Pointer to "struct sockaddr". * @addr_len: Size of @addr. * @port: Port number. 
* @address: Pointer to "struct tomoyo_addr_info". * * Returns 0 on success, negative value otherwise. */ static int tomoyo_check_inet_address(const struct sockaddr *addr, const unsigned int addr_len, const u16 port, struct tomoyo_addr_info *address) { struct tomoyo_inet_addr_info *i = &address->inet; if (addr_len < offsetofend(struct sockaddr, sa_family)) return 0; switch (addr->sa_family) { case AF_INET6: if (addr_len < SIN6_LEN_RFC2133) goto skip; i->is_ipv6 = true; i->address = (__be32 *) ((struct sockaddr_in6 *) addr)->sin6_addr.s6_addr; i->port = ((struct sockaddr_in6 *) addr)->sin6_port; break; case AF_INET: if (addr_len < sizeof(struct sockaddr_in)) goto skip; i->is_ipv6 = false; i->address = (__be32 *) &((struct sockaddr_in *) addr)->sin_addr; i->port = ((struct sockaddr_in *) addr)->sin_port; break; default: goto skip; } if (address->protocol == SOCK_RAW) i->port = htons(port); return tomoyo_inet_entry(address); skip: return 0; } /** * tomoyo_unix_entry - Check permission for UNIX network operation. * * @address: Pointer to "struct tomoyo_addr_info". * * Returns 0 on success, negative value otherwise. */ static int tomoyo_unix_entry(const struct tomoyo_addr_info *address) { const int idx = tomoyo_read_lock(); struct tomoyo_request_info r; int error = 0; const u8 type = tomoyo_unix2mac[address->protocol][address->operation]; if (type && tomoyo_init_request_info(&r, NULL, type) != TOMOYO_CONFIG_DISABLED) { char *buf = address->unix0.addr; int len = address->unix0.addr_len - sizeof(sa_family_t); if (len <= 0) { buf = "anonymous"; len = 9; } else if (buf[0]) { len = strnlen(buf, len); } buf = tomoyo_encode2(buf, len); if (buf) { struct tomoyo_path_info addr; addr.name = buf; tomoyo_fill_path_info(&addr); r.param_type = TOMOYO_TYPE_UNIX_ACL; r.param.unix_network.protocol = address->protocol; r.param.unix_network.operation = address->operation; r.param.unix_network.address = &addr; do { tomoyo_check_acl(&r, tomoyo_check_unix_acl); error = tomoyo_audit_unix_log(&r); } while (error == TOMOYO_RETRY_REQUEST); kfree(buf); } else error = -ENOMEM; } tomoyo_read_unlock(idx); return error; } /** * tomoyo_check_unix_address - Check permission for unix domain socket's operation. * * @addr: Pointer to "struct sockaddr". * @addr_len: Size of @addr. * @address: Pointer to "struct tomoyo_addr_info". * * Returns 0 on success, negative value otherwise. */ static int tomoyo_check_unix_address(struct sockaddr *addr, const unsigned int addr_len, struct tomoyo_addr_info *address) { struct tomoyo_unix_addr_info *u = &address->unix0; if (addr_len < offsetofend(struct sockaddr, sa_family)) return 0; if (addr->sa_family != AF_UNIX) return 0; u->addr = ((struct sockaddr_un *) addr)->sun_path; u->addr_len = addr_len; return tomoyo_unix_entry(address); } /** * tomoyo_kernel_service - Check whether I'm kernel service or not. * * Returns true if I'm kernel service, false otherwise. */ static bool tomoyo_kernel_service(void) { /* Nothing to do if I am a kernel service. */ return current->flags & PF_KTHREAD; } /** * tomoyo_sock_family - Get socket's family. * * @sk: Pointer to "struct sock". * * Returns one of PF_INET, PF_INET6, PF_UNIX or 0. */ static u8 tomoyo_sock_family(struct sock *sk) { u8 family; if (tomoyo_kernel_service()) return 0; family = sk->sk_family; switch (family) { case PF_INET: case PF_INET6: case PF_UNIX: return family; default: return 0; } } /** * tomoyo_socket_listen_permission - Check permission for listening a socket. * * @sock: Pointer to "struct socket". 
* * Returns 0 on success, negative value otherwise. */ int tomoyo_socket_listen_permission(struct socket *sock) { struct tomoyo_addr_info address; const u8 family = tomoyo_sock_family(sock->sk); const unsigned int type = sock->type; struct sockaddr_storage addr; int addr_len; if (!family || (type != SOCK_STREAM && type != SOCK_SEQPACKET)) return 0; { const int error = sock->ops->getname(sock, (struct sockaddr *) &addr, 0); if (error < 0) return error; addr_len = error; } address.protocol = type; address.operation = TOMOYO_NETWORK_LISTEN; if (family == PF_UNIX) return tomoyo_check_unix_address((struct sockaddr *) &addr, addr_len, &address); return tomoyo_check_inet_address((struct sockaddr *) &addr, addr_len, 0, &address); } /** * tomoyo_socket_connect_permission - Check permission for setting the remote address of a socket. * * @sock: Pointer to "struct socket". * @addr: Pointer to "struct sockaddr". * @addr_len: Size of @addr. * * Returns 0 on success, negative value otherwise. */ int tomoyo_socket_connect_permission(struct socket *sock, struct sockaddr *addr, int addr_len) { struct tomoyo_addr_info address; const u8 family = tomoyo_sock_family(sock->sk); const unsigned int type = sock->type; if (!family) return 0; address.protocol = type; switch (type) { case SOCK_DGRAM: case SOCK_RAW: address.operation = TOMOYO_NETWORK_SEND; break; case SOCK_STREAM: case SOCK_SEQPACKET: address.operation = TOMOYO_NETWORK_CONNECT; break; default: return 0; } if (family == PF_UNIX) return tomoyo_check_unix_address(addr, addr_len, &address); return tomoyo_check_inet_address(addr, addr_len, sock->sk->sk_protocol, &address); } /** * tomoyo_socket_bind_permission - Check permission for setting the local address of a socket. * * @sock: Pointer to "struct socket". * @addr: Pointer to "struct sockaddr". * @addr_len: Size of @addr. * * Returns 0 on success, negative value otherwise. */ int tomoyo_socket_bind_permission(struct socket *sock, struct sockaddr *addr, int addr_len) { struct tomoyo_addr_info address; const u8 family = tomoyo_sock_family(sock->sk); const unsigned int type = sock->type; if (!family) return 0; switch (type) { case SOCK_STREAM: case SOCK_DGRAM: case SOCK_RAW: case SOCK_SEQPACKET: address.protocol = type; address.operation = TOMOYO_NETWORK_BIND; break; default: return 0; } if (family == PF_UNIX) return tomoyo_check_unix_address(addr, addr_len, &address); return tomoyo_check_inet_address(addr, addr_len, sock->sk->sk_protocol, &address); } /** * tomoyo_socket_sendmsg_permission - Check permission for sending a datagram. * * @sock: Pointer to "struct socket". * @msg: Pointer to "struct msghdr". * @size: Unused. * * Returns 0 on success, negative value otherwise. */ int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg, int size) { struct tomoyo_addr_info address; const u8 family = tomoyo_sock_family(sock->sk); const unsigned int type = sock->type; if (!msg->msg_name || !family || (type != SOCK_DGRAM && type != SOCK_RAW)) return 0; address.protocol = type; address.operation = TOMOYO_NETWORK_SEND; if (family == PF_UNIX) return tomoyo_check_unix_address((struct sockaddr *) msg->msg_name, msg->msg_namelen, &address); return tomoyo_check_inet_address((struct sockaddr *) msg->msg_name, msg->msg_namelen, sock->sk->sk_protocol, &address); }
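/*
 * Illustrative policy lines (a sketch, not taken from this file) showing
 * the grammar the parsers above accept. tomoyo_write_inet_network()
 * consumes "<protocol> <operation> <address[-address]|@group> <port[-port]>"
 * and tomoyo_write_unix_network() consumes "<protocol> <operation> <path>".
 * The leading "network" keyword is inferred from the audit format
 * "network %s %s %s %s" used above, and the sample addresses, ports and
 * the @group name are made up:
 *
 *   network inet stream connect 192.168.1.1 443
 *   network inet dgram bind 10.0.0.1-10.0.0.255 1024-65535
 *   network inet raw send @trusted_hosts 0
 *   network unix stream connect /run/example.sock
 */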
// SPDX-License-Identifier: GPL-2.0 /* * linux/drivers/scsi/scsi_proc.c * * The functions in this file provide an interface between * the PROC file system and the SCSI device drivers * It is mainly used for debugging, statistics and to pass * information directly to the lowlevel driver. * * (c) 1995 Michael Neuffer neuffer@goofy.zdv.uni-mainz.de * Version: 0.99.8 last change: 95/09/13 * * generic command parser provided by: * Andreas Heilwagen <crashcar@informatik.uni-koblenz.de> * * generic_proc_info() support of xxxx_info() by: * Michael A. 
Griffith <grif@acm.org> */ #include <linux/module.h> #include <linux/init.h> #include <linux/string.h> #include <linux/mm.h> #include <linux/proc_fs.h> #include <linux/errno.h> #include <linux/blkdev.h> #include <linux/seq_file.h> #include <linux/mutex.h> #include <linux/gfp.h> #include <linux/uaccess.h> #include <scsi/scsi.h> #include <scsi/scsi_device.h> #include <scsi/scsi_host.h> #include <scsi/scsi_transport.h> #include "scsi_priv.h" #include "scsi_logging.h" /* 4K page size, but our output routines use some slack for overruns */ #define PROC_BLOCK_SIZE (3*1024) static struct proc_dir_entry *proc_scsi; /* Protects scsi_proc_list */ static DEFINE_MUTEX(global_host_template_mutex); static LIST_HEAD(scsi_proc_list); /** * struct scsi_proc_entry - (host template, SCSI proc dir) association * @entry: entry in scsi_proc_list. * @sht: SCSI host template associated with the procfs directory. * @proc_dir: procfs directory associated with the SCSI host template. * @present: Number of SCSI hosts instantiated for @sht. */ struct scsi_proc_entry { struct list_head entry; const struct scsi_host_template *sht; struct proc_dir_entry *proc_dir; unsigned int present; }; static ssize_t proc_scsi_host_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { struct Scsi_Host *shost = pde_data(file_inode(file)); ssize_t ret = -ENOMEM; char *page; if (count > PROC_BLOCK_SIZE) return -EOVERFLOW; if (!shost->hostt->write_info) return -EINVAL; page = (char *)__get_free_page(GFP_KERNEL); if (page) { ret = -EFAULT; if (copy_from_user(page, buf, count)) goto out; ret = shost->hostt->write_info(shost, page, count); } out: free_page((unsigned long)page); return ret; } static int proc_scsi_show(struct seq_file *m, void *v) { struct Scsi_Host *shost = m->private; return shost->hostt->show_info(m, shost); } static int proc_scsi_host_open(struct inode *inode, struct file *file) { return single_open_size(file, proc_scsi_show, pde_data(inode), 4 * PAGE_SIZE); } static struct scsi_proc_entry * __scsi_lookup_proc_entry(const struct scsi_host_template *sht) { struct scsi_proc_entry *e; lockdep_assert_held(&global_host_template_mutex); list_for_each_entry(e, &scsi_proc_list, entry) if (e->sht == sht) return e; return NULL; } static struct scsi_proc_entry * scsi_lookup_proc_entry(const struct scsi_host_template *sht) { struct scsi_proc_entry *e; mutex_lock(&global_host_template_mutex); e = __scsi_lookup_proc_entry(sht); mutex_unlock(&global_host_template_mutex); return e; } /** * scsi_template_proc_dir() - returns the procfs dir for a SCSI host template * @sht: SCSI host template pointer. */ struct proc_dir_entry * scsi_template_proc_dir(const struct scsi_host_template *sht) { struct scsi_proc_entry *e = scsi_lookup_proc_entry(sht); return e ? e->proc_dir : NULL; } EXPORT_SYMBOL_GPL(scsi_template_proc_dir); static const struct proc_ops proc_scsi_ops = { .proc_open = proc_scsi_host_open, .proc_release = single_release, .proc_read = seq_read, .proc_lseek = seq_lseek, .proc_write = proc_scsi_host_write }; /** * scsi_proc_hostdir_add - Create directory in /proc for a scsi host * @sht: owner of this directory * * Sets sht->proc_dir to the new directory. 
*/ int scsi_proc_hostdir_add(const struct scsi_host_template *sht) { struct scsi_proc_entry *e; int ret; if (!sht->show_info) return 0; mutex_lock(&global_host_template_mutex); e = __scsi_lookup_proc_entry(sht); if (!e) { e = kzalloc(sizeof(*e), GFP_KERNEL); if (!e) { ret = -ENOMEM; goto unlock; } } if (e->present++) goto success; e->proc_dir = proc_mkdir(sht->proc_name, proc_scsi); if (!e->proc_dir) { printk(KERN_ERR "%s: proc_mkdir failed for %s\n", __func__, sht->proc_name); ret = -ENOMEM; goto unlock; } e->sht = sht; list_add_tail(&e->entry, &scsi_proc_list); success: e = NULL; ret = 0; unlock: mutex_unlock(&global_host_template_mutex); kfree(e); return ret; } /** * scsi_proc_hostdir_rm - remove directory in /proc for a scsi host * @sht: owner of directory */ void scsi_proc_hostdir_rm(const struct scsi_host_template *sht) { struct scsi_proc_entry *e; if (!sht->show_info) return; mutex_lock(&global_host_template_mutex); e = __scsi_lookup_proc_entry(sht); if (e && !--e->present) { remove_proc_entry(sht->proc_name, proc_scsi); list_del(&e->entry); kfree(e); } mutex_unlock(&global_host_template_mutex); } /** * scsi_proc_host_add - Add entry for this host to appropriate /proc dir * @shost: host to add */ void scsi_proc_host_add(struct Scsi_Host *shost) { const struct scsi_host_template *sht = shost->hostt; struct scsi_proc_entry *e; struct proc_dir_entry *p; char name[10]; if (!sht->show_info) return; e = scsi_lookup_proc_entry(sht); if (!e) goto err; sprintf(name,"%d", shost->host_no); p = proc_create_data(name, S_IRUGO | S_IWUSR, e->proc_dir, &proc_scsi_ops, shost); if (!p) goto err; return; err: shost_printk(KERN_ERR, shost, "%s: Failed to register host (%s failed)\n", __func__, e ? "proc_create_data()" : "scsi_proc_hostdir_add()"); } /** * scsi_proc_host_rm - remove this host's entry from /proc * @shost: which host */ void scsi_proc_host_rm(struct Scsi_Host *shost) { const struct scsi_host_template *sht = shost->hostt; struct scsi_proc_entry *e; char name[10]; if (!sht->show_info) return; e = scsi_lookup_proc_entry(sht); if (!e) return; sprintf(name,"%d", shost->host_no); remove_proc_entry(name, e->proc_dir); } /** * proc_print_scsidevice - return data about this host * @dev: A scsi device * @data: &struct seq_file to output to. * * Description: prints Host, Channel, Id, Lun, Vendor, Model, Rev, Type, * and revision. 
*/ static int proc_print_scsidevice(struct device *dev, void *data) { struct scsi_device *sdev; struct seq_file *s = data; int i; if (!scsi_is_sdev_device(dev)) goto out; sdev = to_scsi_device(dev); seq_printf(s, "Host: scsi%d Channel: %02d Id: %02d Lun: %02llu\n Vendor: ", sdev->host->host_no, sdev->channel, sdev->id, sdev->lun); for (i = 0; i < 8; i++) { if (sdev->vendor[i] >= 0x20) seq_putc(s, sdev->vendor[i]); else seq_putc(s, ' '); } seq_puts(s, " Model: "); for (i = 0; i < 16; i++) { if (sdev->model[i] >= 0x20) seq_putc(s, sdev->model[i]); else seq_putc(s, ' '); } seq_puts(s, " Rev: "); for (i = 0; i < 4; i++) { if (sdev->rev[i] >= 0x20) seq_putc(s, sdev->rev[i]); else seq_putc(s, ' '); } seq_putc(s, '\n'); seq_printf(s, " Type: %s ", scsi_device_type(sdev->type)); seq_printf(s, " ANSI SCSI revision: %02x", sdev->scsi_level - (sdev->scsi_level > 1)); if (sdev->scsi_level == 2) seq_puts(s, " CCS\n"); else seq_putc(s, '\n'); out: return 0; } /** * scsi_add_single_device - Respond to user request to probe for/add device * @host: user-supplied decimal integer * @channel: user-supplied decimal integer * @id: user-supplied decimal integer * @lun: user-supplied decimal integer * * Description: called by writing "scsi add-single-device" to /proc/scsi/scsi. * * does scsi_host_lookup() and either user_scan() if that transport * type supports it, or else scsi_scan_host_selected() * * Note: this seems to be aimed exclusively at SCSI parallel busses. */ static int scsi_add_single_device(uint host, uint channel, uint id, uint lun) { struct Scsi_Host *shost; int error = -ENXIO; shost = scsi_host_lookup(host); if (!shost) return error; if (shost->transportt->user_scan) error = shost->transportt->user_scan(shost, channel, id, lun); else error = scsi_scan_host_selected(shost, channel, id, lun, SCSI_SCAN_MANUAL); scsi_host_put(shost); return error; } /** * scsi_remove_single_device - Respond to user request to remove a device * @host: user-supplied decimal integer * @channel: user-supplied decimal integer * @id: user-supplied decimal integer * @lun: user-supplied decimal integer * * Description: called by writing "scsi remove-single-device" to * /proc/scsi/scsi. Does a scsi_device_lookup() and scsi_remove_device() */ static int scsi_remove_single_device(uint host, uint channel, uint id, uint lun) { struct scsi_device *sdev; struct Scsi_Host *shost; int error = -ENXIO; shost = scsi_host_lookup(host); if (!shost) return error; sdev = scsi_device_lookup(shost, channel, id, lun); if (sdev) { scsi_remove_device(sdev); scsi_device_put(sdev); error = 0; } scsi_host_put(shost); return error; } /** * proc_scsi_write - handle writes to /proc/scsi/scsi * @file: not used * @buf: buffer to write * @length: length of buf, at most PAGE_SIZE * @ppos: not used * * Description: this provides a legacy mechanism to add or remove devices by * Host, Channel, ID, and Lun. To use, * "echo 'scsi add-single-device 0 1 2 3' > /proc/scsi/scsi" or * "echo 'scsi remove-single-device 0 1 2 3' > /proc/scsi/scsi" with * "0 1 2 3" replaced by the Host, Channel, Id, and Lun. * * Note: this seems to be aimed at parallel SCSI. Most modern busses (USB, * SATA, Firewire, Fibre Channel, etc) dynamically assign these values to * provide a unique identifier and nothing more. 
*/ static ssize_t proc_scsi_write(struct file *file, const char __user *buf, size_t length, loff_t *ppos) { int host, channel, id, lun; char *buffer, *end, *p; int err; if (!buf || length > PAGE_SIZE) return -EINVAL; buffer = (char *)__get_free_page(GFP_KERNEL); if (!buffer) return -ENOMEM; err = -EFAULT; if (copy_from_user(buffer, buf, length)) goto out; err = -EINVAL; if (length < PAGE_SIZE) { end = buffer + length; *end = '\0'; } else { end = buffer + PAGE_SIZE - 1; if (*end) goto out; } /* * Usage: echo "scsi add-single-device 0 1 2 3" >/proc/scsi/scsi * with "0 1 2 3" replaced by your "Host Channel Id Lun". */ if (!strncmp("scsi add-single-device", buffer, 22)) { p = buffer + 23; host = (p < end) ? simple_strtoul(p, &p, 0) : 0; channel = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0; id = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0; lun = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0; err = scsi_add_single_device(host, channel, id, lun); /* * Usage: echo "scsi remove-single-device 0 1 2 3" >/proc/scsi/scsi * with "0 1 2 3" replaced by your "Host Channel Id Lun". */ } else if (!strncmp("scsi remove-single-device", buffer, 25)) { p = buffer + 26; host = (p < end) ? simple_strtoul(p, &p, 0) : 0; channel = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0; id = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0; lun = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0; err = scsi_remove_single_device(host, channel, id, lun); } /* * convert success returns so that we return the * number of bytes consumed. */ if (!err) err = length; out: free_page((unsigned long)buffer); return err; } static inline struct device *next_scsi_device(struct device *start) { struct device *next = bus_find_next_device(&scsi_bus_type, start); put_device(start); return next; } static void *scsi_seq_start(struct seq_file *sfile, loff_t *pos) { struct device *dev = NULL; loff_t n = *pos; while ((dev = next_scsi_device(dev))) { if (!n--) break; sfile->private++; } return dev; } static void *scsi_seq_next(struct seq_file *sfile, void *v, loff_t *pos) { (*pos)++; sfile->private++; return next_scsi_device(v); } static void scsi_seq_stop(struct seq_file *sfile, void *v) { put_device(v); } static int scsi_seq_show(struct seq_file *sfile, void *dev) { if (!sfile->private) seq_puts(sfile, "Attached devices:\n"); return proc_print_scsidevice(dev, sfile); } static const struct seq_operations scsi_seq_ops = { .start = scsi_seq_start, .next = scsi_seq_next, .stop = scsi_seq_stop, .show = scsi_seq_show }; /** * proc_scsi_open - glue function * @inode: not used * @file: passed to single_open() * * Associates proc_scsi_show with this file */ static int proc_scsi_open(struct inode *inode, struct file *file) { /* * We don't really need this for the write case but it doesn't * harm either. */ return seq_open(file, &scsi_seq_ops); } static const struct proc_ops scsi_scsi_proc_ops = { .proc_open = proc_scsi_open, .proc_read = seq_read, .proc_write = proc_scsi_write, .proc_lseek = seq_lseek, .proc_release = seq_release, }; /** * scsi_init_procfs - create scsi and scsi/scsi in procfs */ int __init scsi_init_procfs(void) { struct proc_dir_entry *pde; proc_scsi = proc_mkdir("scsi", NULL); if (!proc_scsi) goto err1; pde = proc_create("scsi/scsi", 0, NULL, &scsi_scsi_proc_ops); if (!pde) goto err2; return 0; err2: remove_proc_entry("scsi", NULL); err1: return -ENOMEM; } /** * scsi_exit_procfs - Remove scsi/scsi and scsi from procfs */ void scsi_exit_procfs(void) { remove_proc_entry("scsi/scsi", NULL); remove_proc_entry("scsi", NULL); }
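/*
 * Illustrative sketch (not part of this file): the per-host
 * /proc/scsi/<proc_name>/<host_no> file created by scsi_proc_host_add()
 * dispatches reads and writes into the driver's host template via
 * proc_scsi_show() and proc_scsi_host_write() above. A hypothetical
 * driver only needs to supply show_info()/write_info(); the names and
 * output below are made up for illustration.
 */
static int example_show_info(struct seq_file *m, struct Scsi_Host *shost)
{
	/* Invoked via proc_scsi_show() when the proc file is read. */
	seq_printf(m, "example driver, host %u\n", shost->host_no);
	return 0;
}

static int example_write_info(struct Scsi_Host *shost, char *buffer, int length)
{
	/* Invoked via proc_scsi_host_write(); report all bytes consumed. */
	return length;
}

static const struct scsi_host_template example_sht = {
	.name		= "example",
	.proc_name	= "example",	/* directory name under /proc/scsi */
	.show_info	= example_show_info,
	.write_info	= example_write_info,
};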
// SPDX-License-Identifier: GPL-2.0 /* * linux/fs/proc/base.c * * Copyright (C) 1991, 1992 Linus Torvalds * * proc base directory handling functions * * 1999, Al Viro. Rewritten. Now it covers the whole per-process part. * Instead of using magical inumbers to determine the kind of object * we allocate and fill in-core inodes upon lookup. They don't even * go into icache. We cache the reference to task_struct upon lookup too. * Eventually it should become a filesystem in its own. We don't use the * rest of procfs anymore. * * * Changelog: * 17-Jan-2005 * Allan Bezerra * Bruna Moreira <bruna.moreira@indt.org.br> * Edjard Mota <edjard.mota@indt.org.br> * Ilias Biris <ilias.biris@indt.org.br> * Mauricio Lin <mauricio.lin@indt.org.br> * * Embedded Linux Lab - 10LE Instituto Nokia de Tecnologia - INdT * * A new process specific entry (smaps) included in /proc. It shows the * size of rss for each memory area. The maps entry lacks information * about physical memory size (rss) for each mapped file, i.e., * rss information for executables and library files. 
#include <linux/uaccess.h>

#include <linux/errno.h>
#include <linux/time.h>
#include <linux/proc_fs.h>
#include <linux/stat.h>
#include <linux/task_io_accounting_ops.h>
#include <linux/init.h>
#include <linux/capability.h>
#include <linux/file.h>
#include <linux/generic-radix-tree.h>
#include <linux/string.h>
#include <linux/seq_file.h>
#include <linux/namei.h>
#include <linux/mnt_namespace.h>
#include <linux/mm.h>
#include <linux/swap.h>
#include <linux/rcupdate.h>
#include <linux/kallsyms.h>
#include <linux/stacktrace.h>
#include <linux/resource.h>
#include <linux/module.h>
#include <linux/mount.h>
#include <linux/security.h>
#include <linux/ptrace.h>
#include <linux/printk.h>
#include <linux/cache.h>
#include <linux/cgroup.h>
#include <linux/cpuset.h>
#include <linux/audit.h>
#include <linux/poll.h>
#include <linux/nsproxy.h>
#include <linux/oom.h>
#include <linux/elf.h>
#include <linux/pid_namespace.h>
#include <linux/user_namespace.h>
#include <linux/fs_parser.h>
#include <linux/fs_struct.h>
#include <linux/slab.h>
#include <linux/sched/autogroup.h>
#include <linux/sched/mm.h>
#include <linux/sched/coredump.h>
#include <linux/sched/debug.h>
#include <linux/sched/stat.h>
#include <linux/posix-timers.h>
#include <linux/time_namespace.h>
#include <linux/resctrl.h>
#include <linux/cn_proc.h>
#include <linux/ksm.h>
#include <uapi/linux/lsm.h>
#include <trace/events/oom.h>
#include "internal.h"
#include "fd.h"

#include "../../lib/kstrtox.h"

/* NOTE:
 *	Implementing inode permission operations in /proc is almost
 *	certainly an error.  Permission checks need to happen during
 *	each system call not at open time.  The reason is that most of
 *	what we wish to check for permissions in /proc varies at runtime.
 *
 *	The classic example of a problem is opening file descriptors
 *	in /proc for a task before it execs a suid executable.
 */

static u8 nlink_tid __ro_after_init;
static u8 nlink_tgid __ro_after_init;

enum proc_mem_force {
	PROC_MEM_FORCE_ALWAYS,
	PROC_MEM_FORCE_PTRACE,
	PROC_MEM_FORCE_NEVER
};

static enum proc_mem_force proc_mem_force_override __ro_after_init =
	IS_ENABLED(CONFIG_PROC_MEM_NO_FORCE) ? PROC_MEM_FORCE_NEVER :
	IS_ENABLED(CONFIG_PROC_MEM_FORCE_PTRACE) ? PROC_MEM_FORCE_PTRACE :
	PROC_MEM_FORCE_ALWAYS;

static const struct constant_table proc_mem_force_table[] __initconst = {
	{ "always", PROC_MEM_FORCE_ALWAYS },
	{ "ptrace", PROC_MEM_FORCE_PTRACE },
	{ "never", PROC_MEM_FORCE_NEVER },
	{ }
};

static int __init early_proc_mem_force_override(char *buf)
{
	if (!buf)
		return -EINVAL;

	/*
	 * lookup_constant() defaults to proc_mem_force_override to preserve
	 * the initial Kconfig choice in case an invalid param gets passed.
	 */
	proc_mem_force_override = lookup_constant(proc_mem_force_table,
						  buf, proc_mem_force_override);

	return 0;
}
early_param("proc_mem.force_override", early_proc_mem_force_override);
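/*
 * Example (editorial illustration): the override above is selected on the
 * kernel command line at boot, e.g.
 *
 *	proc_mem.force_override=ptrace
 *
 * which restricts FOLL_FORCE on /proc/<pid>/mem to readers that are
 * actively ptrace-attached; "always" and "never" are the other accepted
 * values, and an unrecognized value leaves the Kconfig default in place.
 */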
struct pid_entry {
	const char *name;
	unsigned int len;
	umode_t mode;
	const struct inode_operations *iop;
	const struct file_operations *fop;
	union proc_op op;
};

#define NOD(NAME, MODE, IOP, FOP, OP) {			\
	.name = (NAME),					\
	.len  = sizeof(NAME) - 1,			\
	.mode = MODE,					\
	.iop  = IOP,					\
	.fop  = FOP,					\
	.op   = OP,					\
}

#define DIR(NAME, MODE, iops, fops)	\
	NOD(NAME, (S_IFDIR|(MODE)), &iops, &fops, {} )
#define LNK(NAME, get_link)					\
	NOD(NAME, (S_IFLNK|S_IRWXUGO),				\
		&proc_pid_link_inode_operations, NULL,		\
		{ .proc_get_link = get_link } )
#define REG(NAME, MODE, fops)				\
	NOD(NAME, (S_IFREG|(MODE)), NULL, &fops, {})
#define ONE(NAME, MODE, show)				\
	NOD(NAME, (S_IFREG|(MODE)),			\
		NULL, &proc_single_file_operations,	\
		{ .proc_show = show } )
#define ATTR(LSMID, NAME, MODE)				\
	NOD(NAME, (S_IFREG|(MODE)),			\
		NULL, &proc_pid_attr_operations,	\
		{ .lsmid = LSMID })

/*
 * Count the number of hardlinks for the pid_entry table, excluding the .
 * and .. links.
 */
static unsigned int __init pid_entry_nlink(const struct pid_entry *entries,
	unsigned int n)
{
	unsigned int i;
	unsigned int count;

	count = 2;
	for (i = 0; i < n; ++i) {
		if (S_ISDIR(entries[i].mode))
			++count;
	}

	return count;
}

static int get_task_root(struct task_struct *task, struct path *root)
{
	int result = -ENOENT;

	task_lock(task);
	if (task->fs) {
		get_fs_root(task->fs, root);
		result = 0;
	}
	task_unlock(task);
	return result;
}

static int proc_cwd_link(struct dentry *dentry, struct path *path)
{
	struct task_struct *task = get_proc_task(d_inode(dentry));
	int result = -ENOENT;

	if (task) {
		task_lock(task);
		if (task->fs) {
			get_fs_pwd(task->fs, path);
			result = 0;
		}
		task_unlock(task);
		put_task_struct(task);
	}
	return result;
}

static int proc_root_link(struct dentry *dentry, struct path *path)
{
	struct task_struct *task = get_proc_task(d_inode(dentry));
	int result = -ENOENT;

	if (task) {
		result = get_task_root(task, path);
		put_task_struct(task);
	}
	return result;
}

/*
 * If the user used setproctitle(), we just get the string from
 * user space at arg_start, and limit it to a maximum of one page.
 */
static ssize_t get_mm_proctitle(struct mm_struct *mm, char __user *buf,
				size_t count, unsigned long pos,
				unsigned long arg_start)
{
	char *page;
	int ret, got;

	if (pos >= PAGE_SIZE)
		return 0;

	page = (char *)__get_free_page(GFP_KERNEL);
	if (!page)
		return -ENOMEM;

	ret = 0;
	got = access_remote_vm(mm, arg_start, page, PAGE_SIZE, FOLL_ANON);
	if (got > 0) {
		int len = strnlen(page, got);

		/* Include the NUL character if it was found */
		if (len < got)
			len++;

		if (len > pos) {
			len -= pos;
			if (len > count)
				len = count;
			len -= copy_to_user(buf, page+pos, len);
			if (!len)
				len = -EFAULT;
			ret = len;
		}
	}
	free_page((unsigned long)page);
	return ret;
}
static ssize_t get_mm_cmdline(struct mm_struct *mm, char __user *buf,
			      size_t count, loff_t *ppos)
{
	unsigned long arg_start, arg_end, env_start, env_end;
	unsigned long pos, len;
	char *page, c;

	/* Check if process spawned far enough to have cmdline. */
	if (!mm->env_end)
		return 0;

	spin_lock(&mm->arg_lock);
	arg_start = mm->arg_start;
	arg_end = mm->arg_end;
	env_start = mm->env_start;
	env_end = mm->env_end;
	spin_unlock(&mm->arg_lock);

	if (arg_start >= arg_end)
		return 0;

	/*
	 * We allow setproctitle() to overwrite the argument
	 * strings, and overflow past the original end. But
	 * only when it overflows into the environment area.
	 */
	if (env_start != arg_end || env_end < env_start)
		env_start = env_end = arg_end;
	len = env_end - arg_start;

	/* We're not going to care if "*ppos" has high bits set */
	pos = *ppos;
	if (pos >= len)
		return 0;
	if (count > len - pos)
		count = len - pos;
	if (!count)
		return 0;

	/*
	 * Magical special case: if the argv[] end byte is not
	 * zero, the user has overwritten it with setproctitle(3).
	 *
	 * Possible future enhancement: do this only once when
	 * pos is 0, and set a flag in the 'struct file'.
	 */
	if (access_remote_vm(mm, arg_end-1, &c, 1, FOLL_ANON) == 1 && c)
		return get_mm_proctitle(mm, buf, count, pos, arg_start);

	/*
	 * For the non-setproctitle() case we limit things strictly
	 * to the [arg_start, arg_end[ range.
	 */
	pos += arg_start;
	if (pos < arg_start || pos >= arg_end)
		return 0;
	if (count > arg_end - pos)
		count = arg_end - pos;

	page = (char *)__get_free_page(GFP_KERNEL);
	if (!page)
		return -ENOMEM;

	len = 0;
	while (count) {
		int got;
		size_t size = min_t(size_t, PAGE_SIZE, count);

		got = access_remote_vm(mm, pos, page, size, FOLL_ANON);
		if (got <= 0)
			break;
		got -= copy_to_user(buf, page, got);
		if (unlikely(!got)) {
			if (!len)
				len = -EFAULT;
			break;
		}
		pos += got;
		buf += got;
		len += got;
		count -= got;
	}
	free_page((unsigned long)page);
	return len;
}

static ssize_t get_task_cmdline(struct task_struct *tsk, char __user *buf,
				size_t count, loff_t *pos)
{
	struct mm_struct *mm;
	ssize_t ret;

	mm = get_task_mm(tsk);
	if (!mm)
		return 0;

	ret = get_mm_cmdline(mm, buf, count, pos);
	mmput(mm);
	return ret;
}

static ssize_t proc_pid_cmdline_read(struct file *file, char __user *buf,
				     size_t count, loff_t *pos)
{
	struct task_struct *tsk;
	ssize_t ret;

	BUG_ON(*pos < 0);

	tsk = get_proc_task(file_inode(file));
	if (!tsk)
		return -ESRCH;
	ret = get_task_cmdline(tsk, buf, count, pos);
	put_task_struct(tsk);
	if (ret > 0)
		*pos += ret;
	return ret;
}

static const struct file_operations proc_pid_cmdline_ops = {
	.read	= proc_pid_cmdline_read,
	.llseek	= generic_file_llseek,
};
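/*
 * Example (illustrative userspace sketch): arguments read from
 * /proc/<pid>/cmdline are NUL-separated, so a reader usually converts the
 * separators before printing. Error handling is elided.
 *
 *	int fd = open("/proc/self/cmdline", O_RDONLY);
 *	char cbuf[4096];
 *	ssize_t n = read(fd, cbuf, sizeof(cbuf));
 *
 *	for (ssize_t i = 0; i + 1 < n; i++)
 *		if (cbuf[i] == '\0')
 *			cbuf[i] = ' ';	// "arg0 arg1 arg2 ..."
 */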
#ifdef CONFIG_KALLSYMS
/*
 * Provides a wchan file via kallsyms in a proper one-value-per-file format.
 * Returns the resolved symbol to user space.
 */
static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
			  struct pid *pid, struct task_struct *task)
{
	unsigned long wchan;
	char symname[KSYM_NAME_LEN];

	if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
		goto print0;

	wchan = get_wchan(task);
	if (wchan && !lookup_symbol_name(wchan, symname)) {
		seq_puts(m, symname);
		return 0;
	}

print0:
	seq_putc(m, '0');
	return 0;
}
#endif /* CONFIG_KALLSYMS */

static int lock_trace(struct task_struct *task)
{
	int err = down_read_killable(&task->signal->exec_update_lock);
	if (err)
		return err;
	if (!ptrace_may_access(task, PTRACE_MODE_ATTACH_FSCREDS)) {
		up_read(&task->signal->exec_update_lock);
		return -EPERM;
	}
	return 0;
}

static void unlock_trace(struct task_struct *task)
{
	up_read(&task->signal->exec_update_lock);
}

#ifdef CONFIG_STACKTRACE

#define MAX_STACK_TRACE_DEPTH	64

static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
			  struct pid *pid, struct task_struct *task)
{
	unsigned long *entries;
	int err;

	/*
	 * The ability to racily run the kernel stack unwinder on a running
	 * task and then observe the unwinder output is scary; while it is
	 * useful for debugging kernel issues, it can also allow an attacker
	 * to leak kernel stack contents.
	 * Doing this in a manner that is at least safe from races would
	 * require some work to ensure that the remote task can not be
	 * scheduled; and even then, this would still expose the unwinder as
	 * local attack surface.
	 * Therefore, this interface is restricted to root.
	 */
	if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
		return -EACCES;

	entries = kmalloc_array(MAX_STACK_TRACE_DEPTH, sizeof(*entries),
				GFP_KERNEL);
	if (!entries)
		return -ENOMEM;

	err = lock_trace(task);
	if (!err) {
		unsigned int i, nr_entries;

		nr_entries = stack_trace_save_tsk(task, entries,
						  MAX_STACK_TRACE_DEPTH, 0);

		for (i = 0; i < nr_entries; i++) {
			seq_printf(m, "[<0>] %pB\n", (void *)entries[i]);
		}

		unlock_trace(task);
	}
	kfree(entries);

	return err;
}
#endif

#ifdef CONFIG_SCHED_INFO
/*
 * Provides /proc/PID/schedstat
 */
static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
			      struct pid *pid, struct task_struct *task)
{
	if (unlikely(!sched_info_on()))
		seq_puts(m, "0 0 0\n");
	else
		seq_printf(m, "%llu %llu %lu\n",
			   (unsigned long long)task->se.sum_exec_runtime,
			   (unsigned long long)task->sched_info.run_delay,
			   task->sched_info.pcount);

	return 0;
}
#endif

#ifdef CONFIG_LATENCYTOP
static int lstats_show_proc(struct seq_file *m, void *v)
{
	int i;
	struct inode *inode = m->private;
	struct task_struct *task = get_proc_task(inode);

	if (!task)
		return -ESRCH;
	seq_puts(m, "Latency Top version : v0.1\n");
	for (i = 0; i < LT_SAVECOUNT; i++) {
		struct latency_record *lr = &task->latency_record[i];
		if (lr->backtrace[0]) {
			int q;
			seq_printf(m, "%i %li %li",
				   lr->count, lr->time, lr->max);
			for (q = 0; q < LT_BACKTRACEDEPTH; q++) {
				unsigned long bt = lr->backtrace[q];

				if (!bt)
					break;
				seq_printf(m, " %ps", (void *)bt);
			}
			seq_putc(m, '\n');
		}
	}
	put_task_struct(task);
	return 0;
}

static int lstats_open(struct inode *inode, struct file *file)
{
	return single_open(file, lstats_show_proc, inode);
}

static ssize_t lstats_write(struct file *file, const char __user *buf,
			    size_t count, loff_t *offs)
{
	struct task_struct *task = get_proc_task(file_inode(file));

	if (!task)
		return -ESRCH;
	clear_tsk_latency_tracing(task);
	put_task_struct(task);

	return count;
}

static const struct file_operations proc_lstats_operations = {
	.open		= lstats_open,
	.read		= seq_read,
	.write		= lstats_write,
	.llseek		= seq_lseek,
	.release	= single_release,
};
#endif

static int proc_oom_score(struct seq_file *m, struct pid_namespace *ns,
			  struct pid *pid, struct task_struct *task)
{
	unsigned long totalpages = totalram_pages() + total_swap_pages;
	unsigned long points = 0;
	long badness;

	badness = oom_badness(task, totalpages);
	/*
	 * Special case OOM_SCORE_ADJ_MIN; for all others, scale the
	 * badness value into the [0, 2000] range which we have been
	 * exporting for a long time so userspace might depend on it.
	 */
	if (badness != LONG_MIN)
		points = (1000 + badness * 1000 / (long)totalpages) * 2 / 3;

	seq_printf(m, "%lu\n", points);

	return 0;
}
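/*
 * Worked example (editorial, numbers invented): with totalpages =
 * 1,000,000 and oom_badness() returning 250,000 (the task accounts for
 * about a quarter of memory), the exported score is
 *
 *	points = (1000 + 250000 * 1000 / 1000000) * 2 / 3
 *	       = (1000 + 250) * 2 / 3
 *	       = 833
 *
 * oom_badness() ranges over roughly [-totalpages, 2 * totalpages], so the
 * inner term spans [0, 3000] and the 2/3 factor compresses it into the
 * historical [0, 2000] interface range.
 */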
struct limit_names {
	const char *name;
	const char *unit;
};

static const struct limit_names lnames[RLIM_NLIMITS] = {
	[RLIMIT_CPU] = {"Max cpu time", "seconds"},
	[RLIMIT_FSIZE] = {"Max file size", "bytes"},
	[RLIMIT_DATA] = {"Max data size", "bytes"},
	[RLIMIT_STACK] = {"Max stack size", "bytes"},
	[RLIMIT_CORE] = {"Max core file size", "bytes"},
	[RLIMIT_RSS] = {"Max resident set", "bytes"},
	[RLIMIT_NPROC] = {"Max processes", "processes"},
	[RLIMIT_NOFILE] = {"Max open files", "files"},
	[RLIMIT_MEMLOCK] = {"Max locked memory", "bytes"},
	[RLIMIT_AS] = {"Max address space", "bytes"},
	[RLIMIT_LOCKS] = {"Max file locks", "locks"},
	[RLIMIT_SIGPENDING] = {"Max pending signals", "signals"},
	[RLIMIT_MSGQUEUE] = {"Max msgqueue size", "bytes"},
	[RLIMIT_NICE] = {"Max nice priority", NULL},
	[RLIMIT_RTPRIO] = {"Max realtime priority", NULL},
	[RLIMIT_RTTIME] = {"Max realtime timeout", "us"},
};

/* Display limits for a process */
static int proc_pid_limits(struct seq_file *m, struct pid_namespace *ns,
			   struct pid *pid, struct task_struct *task)
{
	unsigned int i;
	unsigned long flags;

	struct rlimit rlim[RLIM_NLIMITS];

	if (!lock_task_sighand(task, &flags))
		return 0;
	memcpy(rlim, task->signal->rlim, sizeof(struct rlimit) * RLIM_NLIMITS);
	unlock_task_sighand(task, &flags);

	/*
	 * print the file header
	 */
	seq_puts(m, "Limit                     "
		"Soft Limit           "
		"Hard Limit           "
		"Units     \n");

	for (i = 0; i < RLIM_NLIMITS; i++) {
		if (rlim[i].rlim_cur == RLIM_INFINITY)
			seq_printf(m, "%-25s %-20s ",
				   lnames[i].name, "unlimited");
		else
			seq_printf(m, "%-25s %-20lu ",
				   lnames[i].name, rlim[i].rlim_cur);

		if (rlim[i].rlim_max == RLIM_INFINITY)
			seq_printf(m, "%-20s ", "unlimited");
		else
			seq_printf(m, "%-20lu ", rlim[i].rlim_max);

		if (lnames[i].unit)
			seq_printf(m, "%-10s\n", lnames[i].unit);
		else
			seq_putc(m, '\n');
	}

	return 0;
}

#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
static int proc_pid_syscall(struct seq_file *m, struct pid_namespace *ns,
			    struct pid *pid, struct task_struct *task)
{
	struct syscall_info info;
	u64 *args = &info.data.args[0];
	int res;

	res = lock_trace(task);
	if (res)
		return res;

	if (task_current_syscall(task, &info))
		seq_puts(m, "running\n");
	else if (info.data.nr < 0)
		seq_printf(m, "%d 0x%llx 0x%llx\n",
			   info.data.nr, info.sp, info.data.instruction_pointer);
	else
		seq_printf(m,
		       "%d 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx\n",
		       info.data.nr,
		       args[0], args[1], args[2], args[3], args[4], args[5],
		       info.sp, info.data.instruction_pointer);
	unlock_trace(task);

	return 0;
}
#endif /* CONFIG_HAVE_ARCH_TRACEHOOK */
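/*
 * Example (illustrative, values invented): for a task blocked in read(2),
 * /proc/<pid>/syscall shows the syscall number, the six argument
 * registers, then the stack and instruction pointers, e.g.
 *
 *	0 0x3 0x7ffc8a40 0x2000 0x0 0x0 0x0 0x7ffc8a38 0x7f33d1e2
 *
 * A task not currently in a system call reports "running", and a task in
 * the kernel but not at a syscall boundary reports "-1" plus sp and ip.
 */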
/************************************************************************/
/*                       Here the fs part begins                        */
/************************************************************************/

/* permission checks */
static bool proc_fd_access_allowed(struct inode *inode)
{
	struct task_struct *task;
	bool allowed = false;
	/* Allow access to a task's file descriptors if it is us or we
	 * may use ptrace attach to the process and find out that
	 * information.
	 */
	task = get_proc_task(inode);
	if (task) {
		allowed = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
		put_task_struct(task);
	}
	return allowed;
}

int proc_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
		 struct iattr *attr)
{
	int error;
	struct inode *inode = d_inode(dentry);

	if (attr->ia_valid & ATTR_MODE)
		return -EPERM;

	error = setattr_prepare(&nop_mnt_idmap, dentry, attr);
	if (error)
		return error;

	setattr_copy(&nop_mnt_idmap, inode, attr);
	return 0;
}

/*
 * May current process learn task's sched/cmdline info (for hide_pid_min=1)
 * or euid/egid (for hide_pid_min=2)?
 */
static bool has_pid_permissions(struct proc_fs_info *fs_info,
				struct task_struct *task,
				enum proc_hidepid hide_pid_min)
{
	/*
	 * If the 'hidepid' mount option is set, force a ptrace check;
	 * we indicate that we are using a filesystem syscall
	 * by passing PTRACE_MODE_READ_FSCREDS.
	 */
	if (fs_info->hide_pid == HIDEPID_NOT_PTRACEABLE)
		return ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);

	if (fs_info->hide_pid < hide_pid_min)
		return true;
	if (in_group_p(fs_info->pid_gid))
		return true;
	return ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
}

static int proc_pid_permission(struct mnt_idmap *idmap,
			       struct inode *inode, int mask)
{
	struct proc_fs_info *fs_info = proc_sb_info(inode->i_sb);
	struct task_struct *task;
	bool has_perms;

	task = get_proc_task(inode);
	if (!task)
		return -ESRCH;
	has_perms = has_pid_permissions(fs_info, task, HIDEPID_NO_ACCESS);
	put_task_struct(task);

	if (!has_perms) {
		if (fs_info->hide_pid == HIDEPID_INVISIBLE) {
			/*
			 * Let's make getdents(), stat(), and open()
			 * consistent with each other. If a process
			 * may not stat() a file, it shouldn't be seen
			 * in procfs at all.
			 */
			return -ENOENT;
		}

		return -EPERM;
	}
	return generic_permission(&nop_mnt_idmap, inode, mask);
}

static const struct inode_operations proc_def_inode_operations = {
	.setattr	= proc_setattr,
};

static int proc_single_show(struct seq_file *m, void *v)
{
	struct inode *inode = m->private;
	struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
	struct pid *pid = proc_pid(inode);
	struct task_struct *task;
	int ret;

	task = get_pid_task(pid, PIDTYPE_PID);
	if (!task)
		return -ESRCH;

	ret = PROC_I(inode)->op.proc_show(m, ns, pid, task);

	put_task_struct(task);
	return ret;
}

static int proc_single_open(struct inode *inode, struct file *filp)
{
	return single_open(filp, proc_single_show, inode);
}

static const struct file_operations proc_single_file_operations = {
	.open		= proc_single_open,
	.read		= seq_read,
	.llseek		= seq_lseek,
	.release	= single_release,
};
struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode)
{
	struct task_struct *task = get_proc_task(inode);
	struct mm_struct *mm;

	if (!task)
		return ERR_PTR(-ESRCH);

	mm = mm_access(task, mode | PTRACE_MODE_FSCREDS);
	put_task_struct(task);

	if (IS_ERR(mm))
		return mm == ERR_PTR(-ESRCH) ? NULL : mm;

	/* ensure this mm_struct can't be freed */
	mmgrab(mm);
	/* but do not pin its memory */
	mmput(mm);

	return mm;
}

static int __mem_open(struct inode *inode, struct file *file, unsigned int mode)
{
	struct mm_struct *mm = proc_mem_open(inode, mode);

	if (IS_ERR(mm))
		return PTR_ERR(mm);

	file->private_data = mm;
	return 0;
}

static int mem_open(struct inode *inode, struct file *file)
{
	if (WARN_ON_ONCE(!(file->f_op->fop_flags & FOP_UNSIGNED_OFFSET)))
		return -EINVAL;
	return __mem_open(inode, file, PTRACE_MODE_ATTACH);
}

static bool proc_mem_foll_force(struct file *file, struct mm_struct *mm)
{
	struct task_struct *task;
	bool ptrace_active = false;

	switch (proc_mem_force_override) {
	case PROC_MEM_FORCE_NEVER:
		return false;
	case PROC_MEM_FORCE_PTRACE:
		task = get_proc_task(file_inode(file));
		if (task) {
			ptrace_active =	READ_ONCE(task->ptrace) &&
					READ_ONCE(task->mm) == mm &&
					READ_ONCE(task->parent) == current;
			put_task_struct(task);
		}
		return ptrace_active;
	default:
		return true;
	}
}

static ssize_t mem_rw(struct file *file, char __user *buf,
			size_t count, loff_t *ppos, int write)
{
	struct mm_struct *mm = file->private_data;
	unsigned long addr = *ppos;
	ssize_t copied;
	char *page;
	unsigned int flags;

	if (!mm)
		return 0;

	page = (char *)__get_free_page(GFP_KERNEL);
	if (!page)
		return -ENOMEM;

	copied = 0;
	if (!mmget_not_zero(mm))
		goto free;

	flags = write ? FOLL_WRITE : 0;
	if (proc_mem_foll_force(file, mm))
		flags |= FOLL_FORCE;

	while (count > 0) {
		size_t this_len = min_t(size_t, count, PAGE_SIZE);

		if (write && copy_from_user(page, buf, this_len)) {
			copied = -EFAULT;
			break;
		}

		this_len = access_remote_vm(mm, addr, page, this_len, flags);
		if (!this_len) {
			if (!copied)
				copied = -EIO;
			break;
		}

		if (!write && copy_to_user(buf, page, this_len)) {
			copied = -EFAULT;
			break;
		}

		buf += this_len;
		addr += this_len;
		copied += this_len;
		count -= this_len;
	}
	*ppos = addr;

	mmput(mm);
free:
	free_page((unsigned long) page);
	return copied;
}

static ssize_t mem_read(struct file *file, char __user *buf,
			size_t count, loff_t *ppos)
{
	return mem_rw(file, buf, count, ppos, 0);
}

static ssize_t mem_write(struct file *file, const char __user *buf,
			 size_t count, loff_t *ppos)
{
	return mem_rw(file, (char __user*)buf, count, ppos, 1);
}

loff_t mem_lseek(struct file *file, loff_t offset, int orig)
{
	switch (orig) {
	case 0:
		file->f_pos = offset;
		break;
	case 1:
		file->f_pos += offset;
		break;
	default:
		return -EINVAL;
	}
	force_successful_syscall_return();
	return file->f_pos;
}

static int mem_release(struct inode *inode, struct file *file)
{
	struct mm_struct *mm = file->private_data;
	if (mm)
		mmdrop(mm);
	return 0;
}

static const struct file_operations proc_mem_operations = {
	.llseek		= mem_lseek,
	.read		= mem_read,
	.write		= mem_write,
	.open		= mem_open,
	.release	= mem_release,
	.fop_flags	= FOP_UNSIGNED_OFFSET,
};
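/*
 * Example (illustrative userspace sketch): reading another task's memory
 * through /proc/<pid>/mem. The file offset is the virtual address; access
 * is gated by the PTRACE_MODE_ATTACH check in mem_open(), so the reader
 * typically ptrace-attaches first. "pid" and "addr" are placeholders and
 * error handling is elided.
 *
 *	char path[64], data[256];
 *	snprintf(path, sizeof(path), "/proc/%d/mem", pid);
 *	int fd = open(path, O_RDONLY);
 *	ssize_t n = pread(fd, data, sizeof(data), (off_t)addr);
 */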
static int environ_open(struct inode *inode, struct file *file)
{
	return __mem_open(inode, file, PTRACE_MODE_READ);
}

static ssize_t environ_read(struct file *file, char __user *buf,
			size_t count, loff_t *ppos)
{
	char *page;
	unsigned long src = *ppos;
	int ret = 0;
	struct mm_struct *mm = file->private_data;
	unsigned long env_start, env_end;

	/* Ensure the process spawned far enough to have an environment. */
	if (!mm || !mm->env_end)
		return 0;

	page = (char *)__get_free_page(GFP_KERNEL);
	if (!page)
		return -ENOMEM;

	ret = 0;
	if (!mmget_not_zero(mm))
		goto free;

	spin_lock(&mm->arg_lock);
	env_start = mm->env_start;
	env_end = mm->env_end;
	spin_unlock(&mm->arg_lock);

	while (count > 0) {
		size_t this_len, max_len;
		int retval;

		if (src >= (env_end - env_start))
			break;

		this_len = env_end - (env_start + src);

		max_len = min_t(size_t, PAGE_SIZE, count);
		this_len = min(max_len, this_len);

		retval = access_remote_vm(mm, (env_start + src), page,
					  this_len, FOLL_ANON);

		if (retval <= 0) {
			ret = retval;
			break;
		}

		if (copy_to_user(buf, page, retval)) {
			ret = -EFAULT;
			break;
		}

		ret += retval;
		src += retval;
		buf += retval;
		count -= retval;
	}
	*ppos = src;

	mmput(mm);

free:
	free_page((unsigned long) page);
	return ret;
}

static const struct file_operations proc_environ_operations = {
	.open		= environ_open,
	.read		= environ_read,
	.llseek		= generic_file_llseek,
	.release	= mem_release,
};

static int auxv_open(struct inode *inode, struct file *file)
{
	return __mem_open(inode, file, PTRACE_MODE_READ_FSCREDS);
}

static ssize_t auxv_read(struct file *file, char __user *buf,
			size_t count, loff_t *ppos)
{
	struct mm_struct *mm = file->private_data;
	unsigned int nwords = 0;

	if (!mm)
		return 0;
	do {
		nwords += 2;
	} while (mm->saved_auxv[nwords - 2] != 0); /* AT_NULL */
	return simple_read_from_buffer(buf, count, ppos, mm->saved_auxv,
				       nwords * sizeof(mm->saved_auxv[0]));
}

static const struct file_operations proc_auxv_operations = {
	.open		= auxv_open,
	.read		= auxv_read,
	.llseek		= generic_file_llseek,
	.release	= mem_release,
};
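/*
 * Example (illustrative): like cmdline, /proc/<pid>/environ is a stream of
 * NUL-terminated "KEY=value" strings, so a scanner walks it entry by
 * entry. Error handling elided.
 *
 *	char ebuf[8192];
 *	int fd = open("/proc/self/environ", O_RDONLY);
 *	ssize_t n = read(fd, ebuf, sizeof(ebuf));
 *
 *	for (char *p = ebuf; p < ebuf + n; p += strlen(p) + 1)
 *		puts(p);	// one KEY=value per line
 */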
static ssize_t oom_adj_read(struct file *file, char __user *buf, size_t count,
			    loff_t *ppos)
{
	struct task_struct *task = get_proc_task(file_inode(file));
	char buffer[PROC_NUMBUF];
	int oom_adj = OOM_ADJUST_MIN;
	size_t len;

	if (!task)
		return -ESRCH;
	if (task->signal->oom_score_adj == OOM_SCORE_ADJ_MAX)
		oom_adj = OOM_ADJUST_MAX;
	else
		oom_adj = (task->signal->oom_score_adj * -OOM_DISABLE) /
			  OOM_SCORE_ADJ_MAX;
	put_task_struct(task);
	if (oom_adj > OOM_ADJUST_MAX)
		oom_adj = OOM_ADJUST_MAX;
	len = snprintf(buffer, sizeof(buffer), "%d\n", oom_adj);
	return simple_read_from_buffer(buf, count, ppos, buffer, len);
}

static int __set_oom_adj(struct file *file, int oom_adj, bool legacy)
{
	struct mm_struct *mm = NULL;
	struct task_struct *task;
	int err = 0;

	task = get_proc_task(file_inode(file));
	if (!task)
		return -ESRCH;

	mutex_lock(&oom_adj_mutex);
	if (legacy) {
		if (oom_adj < task->signal->oom_score_adj &&
				!capable(CAP_SYS_RESOURCE)) {
			err = -EACCES;
			goto err_unlock;
		}
		/*
		 * /proc/pid/oom_adj is provided for legacy purposes, ask users to use
		 * /proc/pid/oom_score_adj instead.
		 */
		pr_warn_once("%s (%d): /proc/%d/oom_adj is deprecated, please use /proc/%d/oom_score_adj instead.\n",
			  current->comm, task_pid_nr(current), task_pid_nr(task),
			  task_pid_nr(task));
	} else {
		if ((short)oom_adj < task->signal->oom_score_adj_min &&
				!capable(CAP_SYS_RESOURCE)) {
			err = -EACCES;
			goto err_unlock;
		}
	}

	/*
	 * Make sure we will check other processes sharing the mm if this is
	 * not vfork, which wants its own oom_score_adj.
	 * Pin the mm so it doesn't go away and get reused after task_unlock.
	 */
	if (!task->vfork_done) {
		struct task_struct *p = find_lock_task_mm(task);

		if (p) {
			if (test_bit(MMF_MULTIPROCESS, &p->mm->flags)) {
				mm = p->mm;
				mmgrab(mm);
			}
			task_unlock(p);
		}
	}

	task->signal->oom_score_adj = oom_adj;
	if (!legacy && has_capability_noaudit(current, CAP_SYS_RESOURCE))
		task->signal->oom_score_adj_min = (short)oom_adj;
	trace_oom_score_adj_update(task);

	if (mm) {
		struct task_struct *p;

		rcu_read_lock();
		for_each_process(p) {
			if (same_thread_group(task, p))
				continue;

			/* do not touch kernel threads or the global init */
			if (p->flags & PF_KTHREAD || is_global_init(p))
				continue;

			task_lock(p);
			if (!p->vfork_done && process_shares_mm(p, mm)) {
				p->signal->oom_score_adj = oom_adj;
				if (!legacy && has_capability_noaudit(current, CAP_SYS_RESOURCE))
					p->signal->oom_score_adj_min = (short)oom_adj;
			}
			task_unlock(p);
		}
		rcu_read_unlock();
		mmdrop(mm);
	}
err_unlock:
	mutex_unlock(&oom_adj_mutex);
	put_task_struct(task);
	return err;
}

/*
 * /proc/pid/oom_adj exists solely for backwards compatibility with previous
 * kernels.  The effective policy is defined by oom_score_adj, which has a
 * different scale: oom_adj grew exponentially and oom_score_adj grows
 * linearly.  Values written to oom_adj are simply mapped linearly to
 * oom_score_adj.  Processes that become oom disabled via oom_adj will still
 * be oom disabled with this implementation.
 *
 * oom_adj cannot be removed since existing userspace binaries use it.
 */
static ssize_t oom_adj_write(struct file *file, const char __user *buf,
			     size_t count, loff_t *ppos)
{
	char buffer[PROC_NUMBUF] = {};
	int oom_adj;
	int err;

	if (count > sizeof(buffer) - 1)
		count = sizeof(buffer) - 1;
	if (copy_from_user(buffer, buf, count)) {
		err = -EFAULT;
		goto out;
	}

	err = kstrtoint(strstrip(buffer), 0, &oom_adj);
	if (err)
		goto out;
	if ((oom_adj < OOM_ADJUST_MIN || oom_adj > OOM_ADJUST_MAX) &&
	     oom_adj != OOM_DISABLE) {
		err = -EINVAL;
		goto out;
	}

	/*
	 * Scale /proc/pid/oom_score_adj appropriately ensuring that a maximum
	 * value is always attainable.
	 */
	if (oom_adj == OOM_ADJUST_MAX)
		oom_adj = OOM_SCORE_ADJ_MAX;
	else
		oom_adj = (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;

	err = __set_oom_adj(file, oom_adj, true);
out:
	return err < 0 ? err : count;
}

static const struct file_operations proc_oom_adj_operations = {
	.read		= oom_adj_read,
	.write		= oom_adj_write,
	.llseek		= generic_file_llseek,
};
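/*
 * Worked example (editorial): OOM_DISABLE is -17 and OOM_SCORE_ADJ_MAX is
 * 1000, so a legacy write of oom_adj = 8 becomes
 *
 *	oom_score_adj = (8 * 1000) / 17 = 470
 *
 * while oom_adj = 15 (OOM_ADJUST_MAX) maps to oom_score_adj = 1000 via the
 * special case above, and oom_adj = -17 maps to -1000 (OOM disabled).
 */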
static ssize_t oom_score_adj_read(struct file *file, char __user *buf,
					size_t count, loff_t *ppos)
{
	struct task_struct *task = get_proc_task(file_inode(file));
	char buffer[PROC_NUMBUF];
	short oom_score_adj = OOM_SCORE_ADJ_MIN;
	size_t len;

	if (!task)
		return -ESRCH;
	oom_score_adj = task->signal->oom_score_adj;
	put_task_struct(task);
	len = snprintf(buffer, sizeof(buffer), "%hd\n", oom_score_adj);
	return simple_read_from_buffer(buf, count, ppos, buffer, len);
}

static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
					size_t count, loff_t *ppos)
{
	char buffer[PROC_NUMBUF] = {};
	int oom_score_adj;
	int err;

	if (count > sizeof(buffer) - 1)
		count = sizeof(buffer) - 1;
	if (copy_from_user(buffer, buf, count)) {
		err = -EFAULT;
		goto out;
	}

	err = kstrtoint(strstrip(buffer), 0, &oom_score_adj);
	if (err)
		goto out;
	if (oom_score_adj < OOM_SCORE_ADJ_MIN ||
			oom_score_adj > OOM_SCORE_ADJ_MAX) {
		err = -EINVAL;
		goto out;
	}

	err = __set_oom_adj(file, oom_score_adj, false);
out:
	return err < 0 ? err : count;
}

static const struct file_operations proc_oom_score_adj_operations = {
	.read		= oom_score_adj_read,
	.write		= oom_score_adj_write,
	.llseek		= default_llseek,
};

#ifdef CONFIG_AUDIT
#define TMPBUFLEN 11
static ssize_t proc_loginuid_read(struct file *file, char __user *buf,
				  size_t count, loff_t *ppos)
{
	struct inode *inode = file_inode(file);
	struct task_struct *task = get_proc_task(inode);
	ssize_t length;
	char tmpbuf[TMPBUFLEN];

	if (!task)
		return -ESRCH;
	length = scnprintf(tmpbuf, TMPBUFLEN, "%u",
			   from_kuid(file->f_cred->user_ns,
				     audit_get_loginuid(task)));
	put_task_struct(task);
	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
}

static ssize_t proc_loginuid_write(struct file *file, const char __user *buf,
				   size_t count, loff_t *ppos)
{
	struct inode *inode = file_inode(file);
	uid_t loginuid;
	kuid_t kloginuid;
	int rv;

	/* Don't let kthreads write their own loginuid */
	if (current->flags & PF_KTHREAD)
		return -EPERM;

	rcu_read_lock();
	if (current != pid_task(proc_pid(inode), PIDTYPE_PID)) {
		rcu_read_unlock();
		return -EPERM;
	}
	rcu_read_unlock();

	if (*ppos != 0) {
		/* No partial writes. */
		return -EINVAL;
	}

	rv = kstrtou32_from_user(buf, count, 10, &loginuid);
	if (rv < 0)
		return rv;

	/* is userspace trying to explicitly UNSET the loginuid? */
	if (loginuid == AUDIT_UID_UNSET) {
		kloginuid = INVALID_UID;
	} else {
		kloginuid = make_kuid(file->f_cred->user_ns, loginuid);
		if (!uid_valid(kloginuid))
			return -EINVAL;
	}

	rv = audit_set_loginuid(kloginuid);
	if (rv < 0)
		return rv;
	return count;
}

static const struct file_operations proc_loginuid_operations = {
	.read		= proc_loginuid_read,
	.write		= proc_loginuid_write,
	.llseek		= generic_file_llseek,
};

static ssize_t proc_sessionid_read(struct file *file, char __user *buf,
				   size_t count, loff_t *ppos)
{
	struct inode *inode = file_inode(file);
	struct task_struct *task = get_proc_task(inode);
	ssize_t length;
	char tmpbuf[TMPBUFLEN];

	if (!task)
		return -ESRCH;
	length = scnprintf(tmpbuf, TMPBUFLEN, "%u",
			   audit_get_sessionid(task));
	put_task_struct(task);
	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
}

static const struct file_operations proc_sessionid_operations = {
	.read		= proc_sessionid_read,
	.llseek		= generic_file_llseek,
};
#endif

#ifdef CONFIG_FAULT_INJECTION
static ssize_t proc_fault_inject_read(struct file *file, char __user *buf,
				      size_t count, loff_t *ppos)
{
	struct task_struct *task = get_proc_task(file_inode(file));
	char buffer[PROC_NUMBUF];
	size_t len;
	int make_it_fail;

	if (!task)
		return -ESRCH;
	make_it_fail = task->make_it_fail;
	put_task_struct(task);

	len = snprintf(buffer, sizeof(buffer), "%i\n", make_it_fail);

	return simple_read_from_buffer(buf, count, ppos, buffer, len);
}

static ssize_t proc_fault_inject_write(struct file *file,
			const char __user *buf, size_t count, loff_t *ppos)
{
	struct task_struct *task;
	char buffer[PROC_NUMBUF] = {};
	int make_it_fail;
	int rv;

	if (!capable(CAP_SYS_RESOURCE))
		return -EPERM;

	if (count > sizeof(buffer) - 1)
		count = sizeof(buffer) - 1;
	if (copy_from_user(buffer, buf, count))
		return -EFAULT;
	rv = kstrtoint(strstrip(buffer), 0, &make_it_fail);
	if (rv < 0)
		return rv;
	if (make_it_fail < 0 || make_it_fail > 1)
		return -EINVAL;

	task = get_proc_task(file_inode(file));
	if (!task)
		return -ESRCH;
	task->make_it_fail = make_it_fail;
	put_task_struct(task);

	return count;
}

static const struct file_operations proc_fault_inject_operations = {
	.read		= proc_fault_inject_read,
	.write		= proc_fault_inject_write,
	.llseek		= generic_file_llseek,
};
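/*
 * Example (illustrative): with CONFIG_FAULT_INJECTION, a privileged test
 * harness opts a victim task into fault injection through this file
 * (conventionally exposed as "make-it-fail"; "pid" is a placeholder and
 * error handling is elided):
 *
 *	char path[64];
 *	snprintf(path, sizeof(path), "/proc/%d/make-it-fail", pid);
 *	int fd = open(path, O_WRONLY);
 *	write(fd, "1", 1);	// opt this task into fault injection
 */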
static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
				   size_t count, loff_t *ppos)
{
	struct task_struct *task;
	int err;
	unsigned int n;

	err = kstrtouint_from_user(buf, count, 0, &n);
	if (err)
		return err;

	task = get_proc_task(file_inode(file));
	if (!task)
		return -ESRCH;
	task->fail_nth = n;
	put_task_struct(task);

	return count;
}

static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
				  size_t count, loff_t *ppos)
{
	struct task_struct *task;
	char numbuf[PROC_NUMBUF];
	ssize_t len;

	task = get_proc_task(file_inode(file));
	if (!task)
		return -ESRCH;
	len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
	put_task_struct(task);
	return simple_read_from_buffer(buf, count, ppos, numbuf, len);
}

static const struct file_operations proc_fail_nth_operations = {
	.read		= proc_fail_nth_read,
	.write		= proc_fail_nth_write,
};
#endif

/*
 * Print out various scheduling related per-task fields:
 */
static int sched_show(struct seq_file *m, void *v)
{
	struct inode *inode = m->private;
	struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
	struct task_struct *p;

	p = get_proc_task(inode);
	if (!p)
		return -ESRCH;
	proc_sched_show_task(p, ns, m);

	put_task_struct(p);

	return 0;
}

static ssize_t
sched_write(struct file *file, const char __user *buf,
	    size_t count, loff_t *offset)
{
	struct inode *inode = file_inode(file);
	struct task_struct *p;

	p = get_proc_task(inode);
	if (!p)
		return -ESRCH;
	proc_sched_set_task(p);

	put_task_struct(p);

	return count;
}

static int sched_open(struct inode *inode, struct file *filp)
{
	return single_open(filp, sched_show, inode);
}

static const struct file_operations proc_pid_sched_operations = {
	.open		= sched_open,
	.read		= seq_read,
	.write		= sched_write,
	.llseek		= seq_lseek,
	.release	= single_release,
};

#ifdef CONFIG_SCHED_AUTOGROUP
/*
 * Print out autogroup related information:
 */
static int sched_autogroup_show(struct seq_file *m, void *v)
{
	struct inode *inode = m->private;
	struct task_struct *p;

	p = get_proc_task(inode);
	if (!p)
		return -ESRCH;
	proc_sched_autogroup_show_task(p, m);

	put_task_struct(p);

	return 0;
}

static ssize_t
sched_autogroup_write(struct file *file, const char __user *buf,
		      size_t count, loff_t *offset)
{
	struct inode *inode = file_inode(file);
	struct task_struct *p;
	char buffer[PROC_NUMBUF] = {};
	int nice;
	int err;

	if (count > sizeof(buffer) - 1)
		count = sizeof(buffer) - 1;
	if (copy_from_user(buffer, buf, count))
		return -EFAULT;

	err = kstrtoint(strstrip(buffer), 0, &nice);
	if (err < 0)
		return err;

	p = get_proc_task(inode);
	if (!p)
		return -ESRCH;

	err = proc_sched_autogroup_set_nice(p, nice);
	if (err)
		count = err;

	put_task_struct(p);

	return count;
}

static int sched_autogroup_open(struct inode *inode, struct file *filp)
{
	int ret;

	ret = single_open(filp, sched_autogroup_show, NULL);
	if (!ret) {
		struct seq_file *m = filp->private_data;

		m->private = inode;
	}
	return ret;
}

static const struct file_operations proc_pid_sched_autogroup_operations = {
	.open		= sched_autogroup_open,
	.read		= seq_read,
	.write		= sched_autogroup_write,
	.llseek		= seq_lseek,
	.release	= single_release,
};

#endif /* CONFIG_SCHED_AUTOGROUP */
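/*
 * Example (illustrative): the autogroup nice value parsed by
 * sched_autogroup_write() is set by writing a plain integer to the task's
 * autogroup file ("pid" is a placeholder; error handling elided):
 *
 *	char path[64];
 *	snprintf(path, sizeof(path), "/proc/%d/autogroup", pid);
 *	int fd = open(path, O_RDWR);
 *	write(fd, "10", 2);	// renice the task's autogroup to +10
 */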
#ifdef CONFIG_TIME_NS
static int timens_offsets_show(struct seq_file *m, void *v)
{
	struct task_struct *p;

	p = get_proc_task(file_inode(m->file));
	if (!p)
		return -ESRCH;
	proc_timens_show_offsets(p, m);

	put_task_struct(p);

	return 0;
}

static ssize_t timens_offsets_write(struct file *file, const char __user *buf,
				    size_t count, loff_t *ppos)
{
	struct inode *inode = file_inode(file);
	struct proc_timens_offset offsets[2];
	char *kbuf = NULL, *pos, *next_line;
	struct task_struct *p;
	int ret, noffsets;

	/* Only allow < page size writes at the beginning of the file */
	if ((*ppos != 0) || (count >= PAGE_SIZE))
		return -EINVAL;

	/* Slurp in the user data */
	kbuf = memdup_user_nul(buf, count);
	if (IS_ERR(kbuf))
		return PTR_ERR(kbuf);

	/* Parse the user data */
	ret = -EINVAL;
	noffsets = 0;
	for (pos = kbuf; pos; pos = next_line) {
		struct proc_timens_offset *off = &offsets[noffsets];
		char clock[10];
		int err;

		/* Find the end of line and ensure we don't look past it */
		next_line = strchr(pos, '\n');
		if (next_line) {
			*next_line = '\0';
			next_line++;
			if (*next_line == '\0')
				next_line = NULL;
		}

		err = sscanf(pos, "%9s %lld %lu", clock,
				&off->val.tv_sec, &off->val.tv_nsec);
		if (err != 3 || off->val.tv_nsec >= NSEC_PER_SEC)
			goto out;

		clock[sizeof(clock) - 1] = 0;
		if (strcmp(clock, "monotonic") == 0 ||
		    strcmp(clock, __stringify(CLOCK_MONOTONIC)) == 0)
			off->clockid = CLOCK_MONOTONIC;
		else if (strcmp(clock, "boottime") == 0 ||
			 strcmp(clock, __stringify(CLOCK_BOOTTIME)) == 0)
			off->clockid = CLOCK_BOOTTIME;
		else
			goto out;

		noffsets++;
		if (noffsets == ARRAY_SIZE(offsets)) {
			if (next_line)
				count = next_line - kbuf;
			break;
		}
	}

	ret = -ESRCH;
	p = get_proc_task(inode);
	if (!p)
		goto out;
	ret = proc_timens_set_offset(file, p, offsets, noffsets);
	put_task_struct(p);
	if (ret)
		goto out;

	ret = count;
out:
	kfree(kbuf);
	return ret;
}

static int timens_offsets_open(struct inode *inode, struct file *filp)
{
	return single_open(filp, timens_offsets_show, inode);
}

static const struct file_operations proc_timens_offsets_operations = {
	.open		= timens_offsets_open,
	.read		= seq_read,
	.write		= timens_offsets_write,
	.llseek		= seq_lseek,
	.release	= single_release,
};
#endif /* CONFIG_TIME_NS */

static ssize_t comm_write(struct file *file, const char __user *buf,
				size_t count, loff_t *offset)
{
	struct inode *inode = file_inode(file);
	struct task_struct *p;
	char buffer[TASK_COMM_LEN] = {};
	const size_t maxlen = sizeof(buffer) - 1;

	if (copy_from_user(buffer, buf, count > maxlen ? maxlen : count))
		return -EFAULT;

	p = get_proc_task(inode);
	if (!p)
		return -ESRCH;

	if (same_thread_group(current, p)) {
		set_task_comm(p, buffer);
		proc_comm_connector(p);
	} else
		count = -EINVAL;

	put_task_struct(p);

	return count;
}

static int comm_show(struct seq_file *m, void *v)
{
	struct inode *inode = m->private;
	struct task_struct *p;

	p = get_proc_task(inode);
	if (!p)
		return -ESRCH;

	proc_task_name(m, p, false);
	seq_putc(m, '\n');

	put_task_struct(p);

	return 0;
}

static int comm_open(struct inode *inode, struct file *filp)
{
	return single_open(filp, comm_show, inode);
}

static const struct file_operations proc_pid_set_comm_operations = {
	.open		= comm_open,
	.read		= seq_read,
	.write		= comm_write,
	.llseek		= seq_lseek,
	.release	= single_release,
};
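/*
 * Example (illustrative): names written to /proc/<pid>/comm are silently
 * truncated to TASK_COMM_LEN - 1 bytes (15 on mainline kernels), and only
 * threads in the writer's own thread group may be renamed:
 *
 *	int fd = open("/proc/self/comm", O_WRONLY);
 *	write(fd, "averyverylongname", 17);	// task becomes "averyverylongna"
 */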
static int proc_exe_link(struct dentry *dentry, struct path *exe_path)
{
	struct task_struct *task;
	struct file *exe_file;

	task = get_proc_task(d_inode(dentry));
	if (!task)
		return -ENOENT;
	exe_file = get_task_exe_file(task);
	put_task_struct(task);
	if (exe_file) {
		*exe_path = exe_file->f_path;
		path_get(&exe_file->f_path);
		fput(exe_file);
		return 0;
	} else
		return -ENOENT;
}

static const char *proc_pid_get_link(struct dentry *dentry,
				     struct inode *inode,
				     struct delayed_call *done)
{
	struct path path;
	int error = -EACCES;

	if (!dentry)
		return ERR_PTR(-ECHILD);

	/* Are we allowed to snoop on the tasks file descriptors? */
	if (!proc_fd_access_allowed(inode))
		goto out;

	error = PROC_I(inode)->op.proc_get_link(dentry, &path);
	if (error)
		goto out;

	error = nd_jump_link(&path);
out:
	return ERR_PTR(error);
}

static int do_proc_readlink(const struct path *path, char __user *buffer,
			    int buflen)
{
	char *tmp = kmalloc(PATH_MAX, GFP_KERNEL);
	char *pathname;
	int len;

	if (!tmp)
		return -ENOMEM;

	pathname = d_path(path, tmp, PATH_MAX);
	len = PTR_ERR(pathname);
	if (IS_ERR(pathname))
		goto out;
	len = tmp + PATH_MAX - 1 - pathname;

	if (len > buflen)
		len = buflen;
	if (copy_to_user(buffer, pathname, len))
		len = -EFAULT;
out:
	kfree(tmp);
	return len;
}

static int proc_pid_readlink(struct dentry *dentry, char __user *buffer,
			     int buflen)
{
	int error = -EACCES;
	struct inode *inode = d_inode(dentry);
	struct path path;

	/* Are we allowed to snoop on the tasks file descriptors? */
	if (!proc_fd_access_allowed(inode))
		goto out;

	error = PROC_I(inode)->op.proc_get_link(dentry, &path);
	if (error)
		goto out;

	error = do_proc_readlink(&path, buffer, buflen);
	path_put(&path);
out:
	return error;
}

const struct inode_operations proc_pid_link_inode_operations = {
	.readlink	= proc_pid_readlink,
	.get_link	= proc_pid_get_link,
	.setattr	= proc_setattr,
};

/* building an inode */

void task_dump_owner(struct task_struct *task, umode_t mode,
		     kuid_t *ruid, kgid_t *rgid)
{
	/* Depending on the state of dumpable compute who should own a
	 * proc file for a task.
	 */
	const struct cred *cred;
	kuid_t uid;
	kgid_t gid;

	if (unlikely(task->flags & PF_KTHREAD)) {
		*ruid = GLOBAL_ROOT_UID;
		*rgid = GLOBAL_ROOT_GID;
		return;
	}

	/* Default to the tasks effective ownership */
	rcu_read_lock();
	cred = __task_cred(task);
	uid = cred->euid;
	gid = cred->egid;
	rcu_read_unlock();

	/*
	 * Before the /proc/pid/status file was created the only way to read
	 * the effective uid of a process was to stat /proc/pid.  Reading
	 * /proc/pid/status is slow enough that procps and other packages
	 * kept stating /proc/pid.  To keep the rules in /proc simple I have
	 * made this apply to all per process world readable and executable
	 * directories.
	 */
	if (mode != (S_IFDIR|S_IRUGO|S_IXUGO)) {
		struct mm_struct *mm;
		task_lock(task);
		mm = task->mm;
		/* Make non-dumpable tasks owned by some root */
		if (mm) {
			if (get_dumpable(mm) != SUID_DUMP_USER) {
				struct user_namespace *user_ns = mm->user_ns;

				uid = make_kuid(user_ns, 0);
				if (!uid_valid(uid))
					uid = GLOBAL_ROOT_UID;

				gid = make_kgid(user_ns, 0);
				if (!gid_valid(gid))
					gid = GLOBAL_ROOT_GID;
			}
		} else {
			uid = GLOBAL_ROOT_UID;
			gid = GLOBAL_ROOT_GID;
		}
		task_unlock(task);
	}
	*ruid = uid;
	*rgid = gid;
}

void proc_pid_evict_inode(struct proc_inode *ei)
{
	struct pid *pid = ei->pid;

	if (S_ISDIR(ei->vfs_inode.i_mode)) {
		spin_lock(&pid->lock);
		hlist_del_init_rcu(&ei->sibling_inodes);
		spin_unlock(&pid->lock);
	}
}
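/*
 * Example (illustrative): the links instantiated through
 * proc_pid_link_inode_operations (exe, cwd, root) are read like any other
 * symlink; note that readlink(2) does not NUL-terminate. Error handling
 * elided.
 *
 *	char target[PATH_MAX];
 *	ssize_t n = readlink("/proc/self/exe", target, sizeof(target) - 1);
 *	target[n] = '\0';
 */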
struct inode *proc_pid_make_inode(struct super_block *sb,
				  struct task_struct *task, umode_t mode)
{
	struct inode *inode;
	struct proc_inode *ei;
	struct pid *pid;

	/* We need a new inode */

	inode = new_inode(sb);
	if (!inode)
		goto out;

	/* Common stuff */
	ei = PROC_I(inode);
	inode->i_mode = mode;
	inode->i_ino = get_next_ino();
	simple_inode_init_ts(inode);
	inode->i_op = &proc_def_inode_operations;

	/*
	 * grab the reference to task.
	 */
	pid = get_task_pid(task, PIDTYPE_PID);
	if (!pid)
		goto out_unlock;

	/* Let the pid remember us for quick removal */
	ei->pid = pid;

	task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);
	security_task_to_inode(task, inode);

out:
	return inode;

out_unlock:
	iput(inode);
	return NULL;
}

/*
 * Generating an inode and adding it into @pid->inodes, so that the task will
 * invalidate the inode's dentry before being released.
 *
 * This helper is used for creating dir-type entries under '/proc' and
 * '/proc/<tgid>/task'. Other entries (eg. fd, stat) under '/proc/<tgid>'
 * can be released by invalidating the '/proc/<tgid>' dentry.
 * In theory, dentries under '/proc/<tgid>/task' can also be released by
 * invalidating the '/proc/<tgid>' dentry; we reserve this to handle the
 * single-thread-exiting situation: any one of the threads should invalidate
 * its '/proc/<tgid>/task/<pid>' dentry before being released.
 */
static struct inode *proc_pid_make_base_inode(struct super_block *sb,
				struct task_struct *task, umode_t mode)
{
	struct inode *inode;
	struct proc_inode *ei;
	struct pid *pid;

	inode = proc_pid_make_inode(sb, task, mode);
	if (!inode)
		return NULL;

	/* Let proc_flush_pid find this directory inode */
	ei = PROC_I(inode);
	pid = ei->pid;
	spin_lock(&pid->lock);
	hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
	spin_unlock(&pid->lock);

	return inode;
}

int pid_getattr(struct mnt_idmap *idmap, const struct path *path,
		struct kstat *stat, u32 request_mask, unsigned int query_flags)
{
	struct inode *inode = d_inode(path->dentry);
	struct proc_fs_info *fs_info = proc_sb_info(inode->i_sb);
	struct task_struct *task;

	generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);

	stat->uid = GLOBAL_ROOT_UID;
	stat->gid = GLOBAL_ROOT_GID;
	rcu_read_lock();
	task = pid_task(proc_pid(inode), PIDTYPE_PID);
	if (task) {
		if (!has_pid_permissions(fs_info, task, HIDEPID_INVISIBLE)) {
			rcu_read_unlock();
			/*
			 * This doesn't prevent learning whether PID exists,
			 * it only makes getattr() consistent with readdir().
			 */
			return -ENOENT;
		}
		task_dump_owner(task, inode->i_mode, &stat->uid, &stat->gid);
	}
	rcu_read_unlock();
	return 0;
}

/* dentry stuff */

/*
 * Set <pid>/... inode ownership (can change due to setuid(), etc.)
 */
void pid_update_inode(struct task_struct *task, struct inode *inode)
{
	task_dump_owner(task, inode->i_mode, &inode->i_uid, &inode->i_gid);

	inode->i_mode &= ~(S_ISUID | S_ISGID);
	security_task_to_inode(task, inode);
}

/*
 * Rewrite the inode's ownerships here because the owning task may have
 * performed a setuid(), etc.
 */
static int pid_revalidate(struct inode *dir, const struct qstr *name,
			  struct dentry *dentry, unsigned int flags)
{
	struct inode *inode;
	struct task_struct *task;
	int ret = 0;

	rcu_read_lock();
	inode = d_inode_rcu(dentry);
	if (!inode)
		goto out;
	task = pid_task(proc_pid(inode), PIDTYPE_PID);

	if (task) {
		pid_update_inode(task, inode);
		ret = 1;
	}
out:
	rcu_read_unlock();
	return ret;
}

static inline bool proc_inode_is_dead(struct inode *inode)
{
	return !proc_pid(inode)->tasks[PIDTYPE_PID].first;
}

int pid_delete_dentry(const struct dentry *dentry)
{
	/* Is the task we represent dead?
	 * If so, then don't put the dentry on the lru list,
	 * kill it immediately.
	 */
	return proc_inode_is_dead(d_inode(dentry));
}

const struct dentry_operations pid_dentry_operations =
{
	.d_revalidate	= pid_revalidate,
	.d_delete	= pid_delete_dentry,
};

/* Lookups */
/*
 * Fill a directory entry.
 *
 * If possible create the dcache entry and derive our inode number and
 * file type from dcache entry.
 *
 * Since all of the proc inode numbers are dynamically generated, the inode
 * numbers do not exist until the inode is cached.  This means creating
 * the dcache entry in readdir is necessary to keep the inode numbers
 * reported by readdir in sync with the inode numbers reported
 * by stat.
 */
bool proc_fill_cache(struct file *file, struct dir_context *ctx,
	const char *name, unsigned int len,
	instantiate_t instantiate, struct task_struct *task, const void *ptr)
{
	struct dentry *child, *dir = file->f_path.dentry;
	struct qstr qname = QSTR_INIT(name, len);
	struct inode *inode;
	unsigned type = DT_UNKNOWN;
	ino_t ino = 1;

	child = d_hash_and_lookup(dir, &qname);
	if (!child) {
		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
		child = d_alloc_parallel(dir, &qname, &wq);
		if (IS_ERR(child))
			goto end_instantiate;
		if (d_in_lookup(child)) {
			struct dentry *res;
			res = instantiate(child, task, ptr);
			d_lookup_done(child);
			if (unlikely(res)) {
				dput(child);
				child = res;
				if (IS_ERR(child))
					goto end_instantiate;
			}
		}
	}
	inode = d_inode(child);
	ino = inode->i_ino;
	type = inode->i_mode >> 12;
	dput(child);
end_instantiate:
	return dir_emit(ctx, name, len, ino, type);
}

/*
 * dname_to_vma_addr - maps a dentry name into two unsigned longs
 * which represent vma start and end addresses.
 */
static int dname_to_vma_addr(struct dentry *dentry,
			     unsigned long *start, unsigned long *end)
{
	const char *str = dentry->d_name.name;
	unsigned long long sval, eval;
	unsigned int len;

	if (str[0] == '0' && str[1] != '-')
		return -EINVAL;
	len = _parse_integer(str, 16, &sval);
	if (len & KSTRTOX_OVERFLOW)
		return -EINVAL;
	if (sval != (unsigned long)sval)
		return -EINVAL;
	str += len;

	if (*str != '-')
		return -EINVAL;
	str++;

	if (str[0] == '0' && str[1])
		return -EINVAL;
	len = _parse_integer(str, 16, &eval);
	if (len & KSTRTOX_OVERFLOW)
		return -EINVAL;
	if (eval != (unsigned long)eval)
		return -EINVAL;
	str += len;

	if (*str != '\0')
		return -EINVAL;

	*start = sval;
	*end = eval;

	return 0;
}

static int map_files_d_revalidate(struct inode *dir, const struct qstr *name,
				  struct dentry *dentry, unsigned int flags)
{
	unsigned long vm_start, vm_end;
	bool exact_vma_exists = false;
	struct mm_struct *mm = NULL;
	struct task_struct *task;
	struct inode *inode;
	int status = 0;

	if (flags & LOOKUP_RCU)
		return -ECHILD;

	inode = d_inode(dentry);
	task = get_proc_task(inode);
	if (!task)
		goto out_notask;

	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
	if (IS_ERR(mm))
		goto out;

	if (!dname_to_vma_addr(dentry, &vm_start, &vm_end)) {
		status = mmap_read_lock_killable(mm);
		if (!status) {
			exact_vma_exists = !!find_exact_vma(mm, vm_start,
							    vm_end);
			mmap_read_unlock(mm);
		}
	}

	mmput(mm);

	if (exact_vma_exists) {
		task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);

		security_task_to_inode(task, inode);
		status = 1;
	}

out:
	put_task_struct(task);

out_notask:
	return status;
}

static const struct dentry_operations tid_map_files_dentry_operations = {
	.d_revalidate	= map_files_d_revalidate,
	.d_delete	= pid_delete_dentry,
};
static int map_files_get_link(struct dentry *dentry, struct path *path)
{
	unsigned long vm_start, vm_end;
	struct vm_area_struct *vma;
	struct task_struct *task;
	struct mm_struct *mm;
	int rc;

	rc = -ENOENT;
	task = get_proc_task(d_inode(dentry));
	if (!task)
		goto out;

	mm = get_task_mm(task);
	put_task_struct(task);
	if (!mm)
		goto out;

	rc = dname_to_vma_addr(dentry, &vm_start, &vm_end);
	if (rc)
		goto out_mmput;

	rc = mmap_read_lock_killable(mm);
	if (rc)
		goto out_mmput;

	rc = -ENOENT;
	vma = find_exact_vma(mm, vm_start, vm_end);
	if (vma && vma->vm_file) {
		*path = *file_user_path(vma->vm_file);
		path_get(path);
		rc = 0;
	}
	mmap_read_unlock(mm);

out_mmput:
	mmput(mm);
out:
	return rc;
}

struct map_files_info {
	unsigned long	start;
	unsigned long	end;
	fmode_t		mode;
};

/*
 * Only allow CAP_SYS_ADMIN and CAP_CHECKPOINT_RESTORE to follow the links, due
 * to concerns about how the symlinks may be used to bypass permissions on
 * ancestor directories in the path to the file in question.
 */
static const char *
proc_map_files_get_link(struct dentry *dentry,
			struct inode *inode,
			struct delayed_call *done)
{
	if (!checkpoint_restore_ns_capable(&init_user_ns))
		return ERR_PTR(-EPERM);

	return proc_pid_get_link(dentry, inode, done);
}

/*
 * Identical to proc_pid_link_inode_operations except for get_link()
 */
static const struct inode_operations proc_map_files_link_inode_operations = {
	.readlink	= proc_pid_readlink,
	.get_link	= proc_map_files_get_link,
	.setattr	= proc_setattr,
};

static struct dentry *
proc_map_files_instantiate(struct dentry *dentry,
			   struct task_struct *task, const void *ptr)
{
	fmode_t mode = (fmode_t)(unsigned long)ptr;
	struct proc_inode *ei;
	struct inode *inode;

	inode = proc_pid_make_inode(dentry->d_sb, task, S_IFLNK |
				    ((mode & FMODE_READ ) ? S_IRUSR : 0) |
				    ((mode & FMODE_WRITE) ? S_IWUSR : 0));
	if (!inode)
		return ERR_PTR(-ENOENT);

	ei = PROC_I(inode);
	ei->op.proc_get_link = map_files_get_link;

	inode->i_op = &proc_map_files_link_inode_operations;
	inode->i_size = 64;

	return proc_splice_unmountable(inode, dentry,
				       &tid_map_files_dentry_operations);
}

static struct dentry *proc_map_files_lookup(struct inode *dir,
		struct dentry *dentry, unsigned int flags)
{
	unsigned long vm_start, vm_end;
	struct vm_area_struct *vma;
	struct task_struct *task;
	struct dentry *result;
	struct mm_struct *mm;

	result = ERR_PTR(-ENOENT);
	task = get_proc_task(dir);
	if (!task)
		goto out;

	result = ERR_PTR(-EACCES);
	if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
		goto out_put_task;

	result = ERR_PTR(-ENOENT);
	if (dname_to_vma_addr(dentry, &vm_start, &vm_end))
		goto out_put_task;

	mm = get_task_mm(task);
	if (!mm)
		goto out_put_task;

	result = ERR_PTR(-EINTR);
	if (mmap_read_lock_killable(mm))
		goto out_put_mm;

	result = ERR_PTR(-ENOENT);
	vma = find_exact_vma(mm, vm_start, vm_end);
	if (!vma)
		goto out_no_vma;

	if (vma->vm_file)
		result = proc_map_files_instantiate(dentry, task,
				(void *)(unsigned long)vma->vm_file->f_mode);

out_no_vma:
	mmap_read_unlock(mm);
out_put_mm:
	mmput(mm);
out_put_task:
	put_task_struct(task);
out:
	return result;
}

static const struct inode_operations proc_map_files_inode_operations = {
	.lookup		= proc_map_files_lookup,
	.permission	= proc_fd_permission,
	.setattr	= proc_setattr,
};
static int
proc_map_files_readdir(struct file *file, struct dir_context *ctx)
{
	struct vm_area_struct *vma;
	struct task_struct *task;
	struct mm_struct *mm;
	unsigned long nr_files, pos, i;
	GENRADIX(struct map_files_info) fa;
	struct map_files_info *p;
	int ret;
	struct vma_iterator vmi;

	genradix_init(&fa);

	ret = -ENOENT;
	task = get_proc_task(file_inode(file));
	if (!task)
		goto out;

	ret = -EACCES;
	if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS))
		goto out_put_task;

	ret = 0;
	if (!dir_emit_dots(file, ctx))
		goto out_put_task;

	mm = get_task_mm(task);
	if (!mm)
		goto out_put_task;

	ret = mmap_read_lock_killable(mm);
	if (ret) {
		mmput(mm);
		goto out_put_task;
	}

	nr_files = 0;

	/*
	 * We need two passes here:
	 *
	 *  1) Collect vmas of mapped files with mmap_lock taken
	 *  2) Release mmap_lock and instantiate entries
	 *
	 * otherwise lockdep complains, since the filldir()
	 * routine might require mmap_lock taken in might_fault().
	 */

	pos = 2;
	vma_iter_init(&vmi, mm, 0);
	for_each_vma(vmi, vma) {
		if (!vma->vm_file)
			continue;
		if (++pos <= ctx->pos)
			continue;

		p = genradix_ptr_alloc(&fa, nr_files++, GFP_KERNEL);
		if (!p) {
			ret = -ENOMEM;
			mmap_read_unlock(mm);
			mmput(mm);
			goto out_put_task;
		}

		p->start = vma->vm_start;
		p->end = vma->vm_end;
		p->mode = vma->vm_file->f_mode;
	}
	mmap_read_unlock(mm);
	mmput(mm);

	for (i = 0; i < nr_files; i++) {
		char buf[4 * sizeof(long) + 2];	/* max: %lx-%lx\0 */
		unsigned int len;

		p = genradix_ptr(&fa, i);
		len = snprintf(buf, sizeof(buf), "%lx-%lx", p->start, p->end);
		if (!proc_fill_cache(file, ctx,
				      buf, len,
				      proc_map_files_instantiate,
				      task,
				      (void *)(unsigned long)p->mode))
			break;
		ctx->pos++;
	}

out_put_task:
	put_task_struct(task);
out:
	genradix_free(&fa);
	return ret;
}

static const struct file_operations proc_map_files_operations = {
	.read		= generic_read_dir,
	.iterate_shared	= proc_map_files_readdir,
	.llseek		= generic_file_llseek,
};
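/*
 * Example (illustrative, addresses invented): each entry under
 * /proc/<pid>/map_files is named after the VMA range it maps, in the same
 * hex format that dname_to_vma_addr() parses back, e.g. a lookup of
 *
 *	/proc/1/map_files/7f1234560000-7f1234580000
 *
 * resolves only if a file-backed VMA with exactly those start/end
 * addresses still exists.
 */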
"tid" : "pid", pid_nr_ns(timer->it_pid, tp->ns)); seq_printf(m, "ClockID: %d\n", timer->it_clock); return 0; } static const struct seq_operations proc_timers_seq_ops = { .start = timers_start, .next = timers_next, .stop = timers_stop, .show = show_timer, }; static int proc_timers_open(struct inode *inode, struct file *file) { struct timers_private *tp; tp = __seq_open_private(file, &proc_timers_seq_ops, sizeof(struct timers_private)); if (!tp) return -ENOMEM; tp->pid = proc_pid(inode); tp->ns = proc_pid_ns(inode->i_sb); return 0; } static const struct file_operations proc_timers_operations = { .open = proc_timers_open, .read = seq_read, .llseek = seq_lseek, .release = seq_release_private, }; #endif static ssize_t timerslack_ns_write(struct file *file, const char __user *buf, size_t count, loff_t *offset) { struct inode *inode = file_inode(file); struct task_struct *p; u64 slack_ns; int err; err = kstrtoull_from_user(buf, count, 10, &slack_ns); if (err < 0) return err; p = get_proc_task(inode); if (!p) return -ESRCH; if (p != current) { rcu_read_lock(); if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) { rcu_read_unlock(); count = -EPERM; goto out; } rcu_read_unlock(); err = security_task_setscheduler(p); if (err) { count = err; goto out; } } task_lock(p); if (rt_or_dl_task_policy(p)) slack_ns = 0; else if (slack_ns == 0) slack_ns = p->default_timer_slack_ns; p->timer_slack_ns = slack_ns; task_unlock(p); out: put_task_struct(p); return count; } static int timerslack_ns_show(struct seq_file *m, void *v) { struct inode *inode = m->private; struct task_struct *p; int err = 0; p = get_proc_task(inode); if (!p) return -ESRCH; if (p != current) { rcu_read_lock(); if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) { rcu_read_unlock(); err = -EPERM; goto out; } rcu_read_unlock(); err = security_task_getscheduler(p); if (err) goto out; } task_lock(p); seq_printf(m, "%llu\n", p->timer_slack_ns); task_unlock(p); out: put_task_struct(p); return err; } static int timerslack_ns_open(struct inode *inode, struct file *filp) { return single_open(filp, timerslack_ns_show, inode); } static const struct file_operations proc_pid_set_timerslack_ns_operations = { .open = timerslack_ns_open, .read = seq_read, .write = timerslack_ns_write, .llseek = seq_lseek, .release = single_release, }; static struct dentry *proc_pident_instantiate(struct dentry *dentry, struct task_struct *task, const void *ptr) { const struct pid_entry *p = ptr; struct inode *inode; struct proc_inode *ei; inode = proc_pid_make_inode(dentry->d_sb, task, p->mode); if (!inode) return ERR_PTR(-ENOENT); ei = PROC_I(inode); if (S_ISDIR(inode->i_mode)) set_nlink(inode, 2); /* Use getattr to fix if necessary */ if (p->iop) inode->i_op = p->iop; if (p->fop) inode->i_fop = p->fop; ei->op = p->op; pid_update_inode(task, inode); d_set_d_op(dentry, &pid_dentry_operations); return d_splice_alias(inode, dentry); } static struct dentry *proc_pident_lookup(struct inode *dir, struct dentry *dentry, const struct pid_entry *p, const struct pid_entry *end) { struct task_struct *task = get_proc_task(dir); struct dentry *res = ERR_PTR(-ENOENT); if (!task) goto out_no_task; /* * Yes, it does not scale. And it should not. Don't add * new entries into /proc/<tgid>/ without very good reasons. 
*/ for (; p < end; p++) { if (p->len != dentry->d_name.len) continue; if (!memcmp(dentry->d_name.name, p->name, p->len)) { res = proc_pident_instantiate(dentry, task, p); break; } } put_task_struct(task); out_no_task: return res; } static int proc_pident_readdir(struct file *file, struct dir_context *ctx, const struct pid_entry *ents, unsigned int nents) { struct task_struct *task = get_proc_task(file_inode(file)); const struct pid_entry *p; if (!task) return -ENOENT; if (!dir_emit_dots(file, ctx)) goto out; if (ctx->pos >= nents + 2) goto out; for (p = ents + (ctx->pos - 2); p < ents + nents; p++) { if (!proc_fill_cache(file, ctx, p->name, p->len, proc_pident_instantiate, task, p)) break; ctx->pos++; } out: put_task_struct(task); return 0; } #ifdef CONFIG_SECURITY static int proc_pid_attr_open(struct inode *inode, struct file *file) { file->private_data = NULL; __mem_open(inode, file, PTRACE_MODE_READ_FSCREDS); return 0; } static ssize_t proc_pid_attr_read(struct file * file, char __user * buf, size_t count, loff_t *ppos) { struct inode * inode = file_inode(file); char *p = NULL; ssize_t length; struct task_struct *task = get_proc_task(inode); if (!task) return -ESRCH; length = security_getprocattr(task, PROC_I(inode)->op.lsmid, file->f_path.dentry->d_name.name, &p); put_task_struct(task); if (length > 0) length = simple_read_from_buffer(buf, count, ppos, p, length); kfree(p); return length; } static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf, size_t count, loff_t *ppos) { struct inode * inode = file_inode(file); struct task_struct *task; void *page; int rv; /* A task may only write when it was the opener. */ if (file->private_data != current->mm) return -EPERM; rcu_read_lock(); task = pid_task(proc_pid(inode), PIDTYPE_PID); if (!task) { rcu_read_unlock(); return -ESRCH; } /* A task may only write its own attributes. */ if (current != task) { rcu_read_unlock(); return -EACCES; } /* Prevent changes to overridden credentials. */ if (current_cred() != current_real_cred()) { rcu_read_unlock(); return -EBUSY; } rcu_read_unlock(); if (count > PAGE_SIZE) count = PAGE_SIZE; /* No partial writes. 
*/ if (*ppos != 0) return -EINVAL; page = memdup_user(buf, count); if (IS_ERR(page)) { rv = PTR_ERR(page); goto out; } /* Guard against adverse ptrace interaction */ rv = mutex_lock_interruptible(&current->signal->cred_guard_mutex); if (rv < 0) goto out_free; rv = security_setprocattr(PROC_I(inode)->op.lsmid, file->f_path.dentry->d_name.name, page, count); mutex_unlock(&current->signal->cred_guard_mutex); out_free: kfree(page); out: return rv; } static const struct file_operations proc_pid_attr_operations = { .open = proc_pid_attr_open, .read = proc_pid_attr_read, .write = proc_pid_attr_write, .llseek = generic_file_llseek, .release = mem_release, }; #define LSM_DIR_OPS(LSM) \ static int proc_##LSM##_attr_dir_iterate(struct file *filp, \ struct dir_context *ctx) \ { \ return proc_pident_readdir(filp, ctx, \ LSM##_attr_dir_stuff, \ ARRAY_SIZE(LSM##_attr_dir_stuff)); \ } \ \ static const struct file_operations proc_##LSM##_attr_dir_ops = { \ .read = generic_read_dir, \ .iterate_shared = proc_##LSM##_attr_dir_iterate, \ .llseek = default_llseek, \ }; \ \ static struct dentry *proc_##LSM##_attr_dir_lookup(struct inode *dir, \ struct dentry *dentry, unsigned int flags) \ { \ return proc_pident_lookup(dir, dentry, \ LSM##_attr_dir_stuff, \ LSM##_attr_dir_stuff + ARRAY_SIZE(LSM##_attr_dir_stuff)); \ } \ \ static const struct inode_operations proc_##LSM##_attr_dir_inode_ops = { \ .lookup = proc_##LSM##_attr_dir_lookup, \ .getattr = pid_getattr, \ .setattr = proc_setattr, \ } #ifdef CONFIG_SECURITY_SMACK static const struct pid_entry smack_attr_dir_stuff[] = { ATTR(LSM_ID_SMACK, "current", 0666), }; LSM_DIR_OPS(smack); #endif #ifdef CONFIG_SECURITY_APPARMOR static const struct pid_entry apparmor_attr_dir_stuff[] = { ATTR(LSM_ID_APPARMOR, "current", 0666), ATTR(LSM_ID_APPARMOR, "prev", 0444), ATTR(LSM_ID_APPARMOR, "exec", 0666), }; LSM_DIR_OPS(apparmor); #endif static const struct pid_entry attr_dir_stuff[] = { ATTR(LSM_ID_UNDEF, "current", 0666), ATTR(LSM_ID_UNDEF, "prev", 0444), ATTR(LSM_ID_UNDEF, "exec", 0666), ATTR(LSM_ID_UNDEF, "fscreate", 0666), ATTR(LSM_ID_UNDEF, "keycreate", 0666), ATTR(LSM_ID_UNDEF, "sockcreate", 0666), #ifdef CONFIG_SECURITY_SMACK DIR("smack", 0555, proc_smack_attr_dir_inode_ops, proc_smack_attr_dir_ops), #endif #ifdef CONFIG_SECURITY_APPARMOR DIR("apparmor", 0555, proc_apparmor_attr_dir_inode_ops, proc_apparmor_attr_dir_ops), #endif }; static int proc_attr_dir_readdir(struct file *file, struct dir_context *ctx) { return proc_pident_readdir(file, ctx, attr_dir_stuff, ARRAY_SIZE(attr_dir_stuff)); } static const struct file_operations proc_attr_dir_operations = { .read = generic_read_dir, .iterate_shared = proc_attr_dir_readdir, .llseek = generic_file_llseek, }; static struct dentry *proc_attr_dir_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { return proc_pident_lookup(dir, dentry, attr_dir_stuff, attr_dir_stuff + ARRAY_SIZE(attr_dir_stuff)); } static const struct inode_operations proc_attr_dir_inode_operations = { .lookup = proc_attr_dir_lookup, .getattr = pid_getattr, .setattr = proc_setattr, }; #endif #ifdef CONFIG_ELF_CORE static ssize_t proc_coredump_filter_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { struct task_struct *task = get_proc_task(file_inode(file)); struct mm_struct *mm; char buffer[PROC_NUMBUF]; size_t len; int ret; if (!task) return -ESRCH; ret = 0; mm = get_task_mm(task); if (mm) { len = snprintf(buffer, sizeof(buffer), "%08lx\n", ((mm->flags & MMF_DUMP_FILTER_MASK) >> MMF_DUMP_FILTER_SHIFT)); mmput(mm); ret =
simple_read_from_buffer(buf, count, ppos, buffer, len); } put_task_struct(task); return ret; } static ssize_t proc_coredump_filter_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { struct task_struct *task; struct mm_struct *mm; unsigned int val; int ret; int i; unsigned long mask; ret = kstrtouint_from_user(buf, count, 0, &val); if (ret < 0) return ret; ret = -ESRCH; task = get_proc_task(file_inode(file)); if (!task) goto out_no_task; mm = get_task_mm(task); if (!mm) goto out_no_mm; ret = 0; for (i = 0, mask = 1; i < MMF_DUMP_FILTER_BITS; i++, mask <<= 1) { if (val & mask) set_bit(i + MMF_DUMP_FILTER_SHIFT, &mm->flags); else clear_bit(i + MMF_DUMP_FILTER_SHIFT, &mm->flags); } mmput(mm); out_no_mm: put_task_struct(task); out_no_task: if (ret < 0) return ret; return count; } static const struct file_operations proc_coredump_filter_operations = { .read = proc_coredump_filter_read, .write = proc_coredump_filter_write, .llseek = generic_file_llseek, }; #endif #ifdef CONFIG_TASK_IO_ACCOUNTING static int do_io_accounting(struct task_struct *task, struct seq_file *m, int whole) { struct task_io_accounting acct; int result; result = down_read_killable(&task->signal->exec_update_lock); if (result) return result; if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)) { result = -EACCES; goto out_unlock; } if (whole) { struct signal_struct *sig = task->signal; struct task_struct *t; unsigned int seq = 1; unsigned long flags; rcu_read_lock(); do { seq++; /* 2 on the 1st/lockless path, otherwise odd */ flags = read_seqbegin_or_lock_irqsave(&sig->stats_lock, &seq); acct = sig->ioac; __for_each_thread(sig, t) task_io_accounting_add(&acct, &t->ioac); } while (need_seqretry(&sig->stats_lock, seq)); done_seqretry_irqrestore(&sig->stats_lock, seq, flags); rcu_read_unlock(); } else { acct = task->ioac; } seq_printf(m, "rchar: %llu\n" "wchar: %llu\n" "syscr: %llu\n" "syscw: %llu\n" "read_bytes: %llu\n" "write_bytes: %llu\n" "cancelled_write_bytes: %llu\n", (unsigned long long)acct.rchar, (unsigned long long)acct.wchar, (unsigned long long)acct.syscr, (unsigned long long)acct.syscw, (unsigned long long)acct.read_bytes, (unsigned long long)acct.write_bytes, (unsigned long long)acct.cancelled_write_bytes); result = 0; out_unlock: up_read(&task->signal->exec_update_lock); return result; } static int proc_tid_io_accounting(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { return do_io_accounting(task, m, 0); } static int proc_tgid_io_accounting(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { return do_io_accounting(task, m, 1); } #endif /* CONFIG_TASK_IO_ACCOUNTING */ #ifdef CONFIG_USER_NS static int proc_id_map_open(struct inode *inode, struct file *file, const struct seq_operations *seq_ops) { struct user_namespace *ns = NULL; struct task_struct *task; struct seq_file *seq; int ret = -EINVAL; task = get_proc_task(inode); if (task) { rcu_read_lock(); ns = get_user_ns(task_cred_xxx(task, user_ns)); rcu_read_unlock(); put_task_struct(task); } if (!ns) goto err; ret = seq_open(file, seq_ops); if (ret) goto err_put_ns; seq = file->private_data; seq->private = ns; return 0; err_put_ns: put_user_ns(ns); err: return ret; } static int proc_id_map_release(struct inode *inode, struct file *file) { struct seq_file *seq = file->private_data; struct user_namespace *ns = seq->private; put_user_ns(ns); return seq_release(inode, file); } static int proc_uid_map_open(struct inode *inode, struct file *file) { return 
proc_id_map_open(inode, file, &proc_uid_seq_operations); } static int proc_gid_map_open(struct inode *inode, struct file *file) { return proc_id_map_open(inode, file, &proc_gid_seq_operations); } static int proc_projid_map_open(struct inode *inode, struct file *file) { return proc_id_map_open(inode, file, &proc_projid_seq_operations); } static const struct file_operations proc_uid_map_operations = { .open = proc_uid_map_open, .write = proc_uid_map_write, .read = seq_read, .llseek = seq_lseek, .release = proc_id_map_release, }; static const struct file_operations proc_gid_map_operations = { .open = proc_gid_map_open, .write = proc_gid_map_write, .read = seq_read, .llseek = seq_lseek, .release = proc_id_map_release, }; static const struct file_operations proc_projid_map_operations = { .open = proc_projid_map_open, .write = proc_projid_map_write, .read = seq_read, .llseek = seq_lseek, .release = proc_id_map_release, }; static int proc_setgroups_open(struct inode *inode, struct file *file) { struct user_namespace *ns = NULL; struct task_struct *task; int ret; ret = -ESRCH; task = get_proc_task(inode); if (task) { rcu_read_lock(); ns = get_user_ns(task_cred_xxx(task, user_ns)); rcu_read_unlock(); put_task_struct(task); } if (!ns) goto err; if (file->f_mode & FMODE_WRITE) { ret = -EACCES; if (!ns_capable(ns, CAP_SYS_ADMIN)) goto err_put_ns; } ret = single_open(file, &proc_setgroups_show, ns); if (ret) goto err_put_ns; return 0; err_put_ns: put_user_ns(ns); err: return ret; } static int proc_setgroups_release(struct inode *inode, struct file *file) { struct seq_file *seq = file->private_data; struct user_namespace *ns = seq->private; int ret = single_release(inode, file); put_user_ns(ns); return ret; } static const struct file_operations proc_setgroups_operations = { .open = proc_setgroups_open, .write = proc_setgroups_write, .read = seq_read, .llseek = seq_lseek, .release = proc_setgroups_release, }; #endif /* CONFIG_USER_NS */ static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { int err = lock_trace(task); if (!err) { seq_printf(m, "%08x\n", task->personality); unlock_trace(task); } return err; } #ifdef CONFIG_LIVEPATCH static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { seq_printf(m, "%d\n", task->patch_state); return 0; } #endif /* CONFIG_LIVEPATCH */ #ifdef CONFIG_KSM static int proc_pid_ksm_merging_pages(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { struct mm_struct *mm; mm = get_task_mm(task); if (mm) { seq_printf(m, "%lu\n", mm->ksm_merging_pages); mmput(mm); } return 0; } static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { struct mm_struct *mm; int ret = 0; mm = get_task_mm(task); if (mm) { seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items); seq_printf(m, "ksm_zero_pages %ld\n", mm_ksm_zero_pages(mm)); seq_printf(m, "ksm_merging_pages %lu\n", mm->ksm_merging_pages); seq_printf(m, "ksm_process_profit %ld\n", ksm_process_profit(mm)); seq_printf(m, "ksm_merge_any: %s\n", test_bit(MMF_VM_MERGE_ANY, &mm->flags) ? "yes" : "no"); ret = mmap_read_lock_killable(mm); if (ret) { mmput(mm); return ret; } seq_printf(m, "ksm_mergeable: %s\n", ksm_process_mergeable(mm) ? 
"yes" : "no"); mmap_read_unlock(mm); mmput(mm); } return 0; } #endif /* CONFIG_KSM */ #ifdef CONFIG_STACKLEAK_METRICS static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns, struct pid *pid, struct task_struct *task) { unsigned long prev_depth = THREAD_SIZE - (task->prev_lowest_stack & (THREAD_SIZE - 1)); unsigned long depth = THREAD_SIZE - (task->lowest_stack & (THREAD_SIZE - 1)); seq_printf(m, "previous stack depth: %lu\nstack depth: %lu\n", prev_depth, depth); return 0; } #endif /* CONFIG_STACKLEAK_METRICS */ /* * Thread groups */ static const struct file_operations proc_task_operations; static const struct inode_operations proc_task_inode_operations; static const struct pid_entry tgid_base_stuff[] = { DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), DIR("map_files", S_IRUSR|S_IXUSR, proc_map_files_inode_operations, proc_map_files_operations), DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations), #ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), #endif REG("environ", S_IRUSR, proc_environ_operations), REG("auxv", S_IRUSR, proc_auxv_operations), ONE("status", S_IRUGO, proc_pid_status), ONE("personality", S_IRUSR, proc_pid_personality), ONE("limits", S_IRUGO, proc_pid_limits), REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations), #ifdef CONFIG_SCHED_AUTOGROUP REG("autogroup", S_IRUGO|S_IWUSR, proc_pid_sched_autogroup_operations), #endif #ifdef CONFIG_TIME_NS REG("timens_offsets", S_IRUGO|S_IWUSR, proc_timens_offsets_operations), #endif REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations), #ifdef CONFIG_HAVE_ARCH_TRACEHOOK ONE("syscall", S_IRUSR, proc_pid_syscall), #endif REG("cmdline", S_IRUGO, proc_pid_cmdline_ops), ONE("stat", S_IRUGO, proc_tgid_stat), ONE("statm", S_IRUGO, proc_pid_statm), REG("maps", S_IRUGO, proc_pid_maps_operations), #ifdef CONFIG_NUMA REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations), #endif REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations), LNK("cwd", proc_cwd_link), LNK("root", proc_root_link), LNK("exe", proc_exe_link), REG("mounts", S_IRUGO, proc_mounts_operations), REG("mountinfo", S_IRUGO, proc_mountinfo_operations), REG("mountstats", S_IRUSR, proc_mountstats_operations), #ifdef CONFIG_PROC_PAGE_MONITOR REG("clear_refs", S_IWUSR, proc_clear_refs_operations), REG("smaps", S_IRUGO, proc_pid_smaps_operations), REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations), REG("pagemap", S_IRUSR, proc_pagemap_operations), #endif #ifdef CONFIG_SECURITY DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations), #endif #ifdef CONFIG_KALLSYMS ONE("wchan", S_IRUGO, proc_pid_wchan), #endif #ifdef CONFIG_STACKTRACE ONE("stack", S_IRUSR, proc_pid_stack), #endif #ifdef CONFIG_SCHED_INFO ONE("schedstat", S_IRUGO, proc_pid_schedstat), #endif #ifdef CONFIG_LATENCYTOP REG("latency", S_IRUGO, proc_lstats_operations), #endif #ifdef CONFIG_PROC_PID_CPUSET ONE("cpuset", S_IRUGO, proc_cpuset_show), #endif #ifdef CONFIG_CGROUPS ONE("cgroup", S_IRUGO, proc_cgroup_show), #endif #ifdef CONFIG_PROC_CPU_RESCTRL ONE("cpu_resctrl_groups", S_IRUGO, proc_resctrl_show), #endif ONE("oom_score", S_IRUGO, proc_oom_score), REG("oom_adj", S_IRUGO|S_IWUSR, proc_oom_adj_operations), REG("oom_score_adj", S_IRUGO|S_IWUSR, proc_oom_score_adj_operations), #ifdef CONFIG_AUDIT 
REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations), REG("sessionid", S_IRUGO, proc_sessionid_operations), #endif #ifdef CONFIG_FAULT_INJECTION REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations), REG("fail-nth", 0644, proc_fail_nth_operations), #endif #ifdef CONFIG_ELF_CORE REG("coredump_filter", S_IRUGO|S_IWUSR, proc_coredump_filter_operations), #endif #ifdef CONFIG_TASK_IO_ACCOUNTING ONE("io", S_IRUSR, proc_tgid_io_accounting), #endif #ifdef CONFIG_USER_NS REG("uid_map", S_IRUGO|S_IWUSR, proc_uid_map_operations), REG("gid_map", S_IRUGO|S_IWUSR, proc_gid_map_operations), REG("projid_map", S_IRUGO|S_IWUSR, proc_projid_map_operations), REG("setgroups", S_IRUGO|S_IWUSR, proc_setgroups_operations), #endif #if defined(CONFIG_CHECKPOINT_RESTORE) && defined(CONFIG_POSIX_TIMERS) REG("timers", S_IRUGO, proc_timers_operations), #endif REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations), #ifdef CONFIG_LIVEPATCH ONE("patch_state", S_IRUSR, proc_pid_patch_state), #endif #ifdef CONFIG_STACKLEAK_METRICS ONE("stack_depth", S_IRUGO, proc_stack_depth), #endif #ifdef CONFIG_PROC_PID_ARCH_STATUS ONE("arch_status", S_IRUGO, proc_pid_arch_status), #endif #ifdef CONFIG_SECCOMP_CACHE_DEBUG ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache), #endif #ifdef CONFIG_KSM ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages), ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat), #endif }; static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx) { return proc_pident_readdir(file, ctx, tgid_base_stuff, ARRAY_SIZE(tgid_base_stuff)); } static const struct file_operations proc_tgid_base_operations = { .read = generic_read_dir, .iterate_shared = proc_tgid_base_readdir, .llseek = generic_file_llseek, }; struct pid *tgid_pidfd_to_pid(const struct file *file) { if (file->f_op != &proc_tgid_base_operations) return ERR_PTR(-EBADF); return proc_pid(file_inode(file)); } static struct dentry *proc_tgid_base_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { return proc_pident_lookup(dir, dentry, tgid_base_stuff, tgid_base_stuff + ARRAY_SIZE(tgid_base_stuff)); } static const struct inode_operations proc_tgid_base_inode_operations = { .lookup = proc_tgid_base_lookup, .getattr = pid_getattr, .setattr = proc_setattr, .permission = proc_pid_permission, }; /** * proc_flush_pid - Remove dcache entries for @pid from the /proc dcache. * @pid: pid that should be flushed. * * This function walks a list of inodes (that belong to any proc * filesystem) that are attached to the pid and flushes them from * the dentry cache. * * It is safe and reasonable to cache /proc entries for a task until * that task exits. After that they just clog up the dcache with * useless entries, possibly causing useful dcache entries to be * flushed instead. This routine is provided to flush those useless * dcache entries when a process is reaped. * * NOTE: This routine is just an optimization so it does not guarantee * that no dcache entries will exist after a process is reaped * it just makes it very unlikely that any will persist. 
*/ void proc_flush_pid(struct pid *pid) { proc_invalidate_siblings_dcache(&pid->inodes, &pid->lock); } static struct dentry *proc_pid_instantiate(struct dentry * dentry, struct task_struct *task, const void *ptr) { struct inode *inode; inode = proc_pid_make_base_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO); if (!inode) return ERR_PTR(-ENOENT); inode->i_op = &proc_tgid_base_inode_operations; inode->i_fop = &proc_tgid_base_operations; inode->i_flags|=S_IMMUTABLE; set_nlink(inode, nlink_tgid); pid_update_inode(task, inode); d_set_d_op(dentry, &pid_dentry_operations); return d_splice_alias(inode, dentry); } struct dentry *proc_pid_lookup(struct dentry *dentry, unsigned int flags) { struct task_struct *task; unsigned tgid; struct proc_fs_info *fs_info; struct pid_namespace *ns; struct dentry *result = ERR_PTR(-ENOENT); tgid = name_to_int(&dentry->d_name); if (tgid == ~0U) goto out; fs_info = proc_sb_info(dentry->d_sb); ns = fs_info->pid_ns; rcu_read_lock(); task = find_task_by_pid_ns(tgid, ns); if (task) get_task_struct(task); rcu_read_unlock(); if (!task) goto out; /* Limit procfs to only ptraceable tasks */ if (fs_info->hide_pid == HIDEPID_NOT_PTRACEABLE) { if (!has_pid_permissions(fs_info, task, HIDEPID_NO_ACCESS)) goto out_put_task; } result = proc_pid_instantiate(dentry, task, NULL); out_put_task: put_task_struct(task); out: return result; } /* * Find the first task with tgid >= tgid * */ struct tgid_iter { unsigned int tgid; struct task_struct *task; }; static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter iter) { struct pid *pid; if (iter.task) put_task_struct(iter.task); rcu_read_lock(); retry: iter.task = NULL; pid = find_ge_pid(iter.tgid, ns); if (pid) { iter.tgid = pid_nr_ns(pid, ns); iter.task = pid_task(pid, PIDTYPE_TGID); if (!iter.task) { iter.tgid += 1; goto retry; } get_task_struct(iter.task); } rcu_read_unlock(); return iter; } #define TGID_OFFSET (FIRST_PROCESS_ENTRY + 2) /* for the /proc/ directory itself, after non-process stuff has been done */ int proc_pid_readdir(struct file *file, struct dir_context *ctx) { struct tgid_iter iter; struct proc_fs_info *fs_info = proc_sb_info(file_inode(file)->i_sb); struct pid_namespace *ns = proc_pid_ns(file_inode(file)->i_sb); loff_t pos = ctx->pos; if (pos >= PID_MAX_LIMIT + TGID_OFFSET) return 0; if (pos == TGID_OFFSET - 2) { struct inode *inode = d_inode(fs_info->proc_self); if (!dir_emit(ctx, "self", 4, inode->i_ino, DT_LNK)) return 0; ctx->pos = pos = pos + 1; } if (pos == TGID_OFFSET - 1) { struct inode *inode = d_inode(fs_info->proc_thread_self); if (!dir_emit(ctx, "thread-self", 11, inode->i_ino, DT_LNK)) return 0; ctx->pos = pos = pos + 1; } iter.tgid = pos - TGID_OFFSET; iter.task = NULL; for (iter = next_tgid(ns, iter); iter.task; iter.tgid += 1, iter = next_tgid(ns, iter)) { char name[10 + 1]; unsigned int len; cond_resched(); if (!has_pid_permissions(fs_info, iter.task, HIDEPID_INVISIBLE)) continue; len = snprintf(name, sizeof(name), "%u", iter.tgid); ctx->pos = iter.tgid + TGID_OFFSET; if (!proc_fill_cache(file, ctx, name, len, proc_pid_instantiate, iter.task, NULL)) { put_task_struct(iter.task); return 0; } } ctx->pos = PID_MAX_LIMIT + TGID_OFFSET; return 0; } /* * proc_tid_comm_permission is a special permission function exclusively * used for the node /proc/<pid>/task/<tid>/comm. * It bypasses generic permission checks in the case where a task of the same * task group attempts to access the node. 
* The rationale behind this is that glibc and bionic access this node for * cross thread naming (pthread_set/getname_np(!self)). However, if * PR_SET_DUMPABLE gets set to 0 this node among others becomes uid=0 gid=0, * which locks out the cross thread naming implementation. * This function makes sure that the node is always accessible for members of * same thread group. */ static int proc_tid_comm_permission(struct mnt_idmap *idmap, struct inode *inode, int mask) { bool is_same_tgroup; struct task_struct *task; task = get_proc_task(inode); if (!task) return -ESRCH; is_same_tgroup = same_thread_group(current, task); put_task_struct(task); if (likely(is_same_tgroup && !(mask & MAY_EXEC))) { /* This file (/proc/<pid>/task/<tid>/comm) can always be * read or written by the members of the corresponding * thread group. */ return 0; } return generic_permission(&nop_mnt_idmap, inode, mask); } static const struct inode_operations proc_tid_comm_inode_operations = { .setattr = proc_setattr, .permission = proc_tid_comm_permission, }; /* * Tasks */ static const struct pid_entry tid_base_stuff[] = { DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations), #ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), #endif REG("environ", S_IRUSR, proc_environ_operations), REG("auxv", S_IRUSR, proc_auxv_operations), ONE("status", S_IRUGO, proc_pid_status), ONE("personality", S_IRUSR, proc_pid_personality), ONE("limits", S_IRUGO, proc_pid_limits), REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations), NOD("comm", S_IFREG|S_IRUGO|S_IWUSR, &proc_tid_comm_inode_operations, &proc_pid_set_comm_operations, {}), #ifdef CONFIG_HAVE_ARCH_TRACEHOOK ONE("syscall", S_IRUSR, proc_pid_syscall), #endif REG("cmdline", S_IRUGO, proc_pid_cmdline_ops), ONE("stat", S_IRUGO, proc_tid_stat), ONE("statm", S_IRUGO, proc_pid_statm), REG("maps", S_IRUGO, proc_pid_maps_operations), #ifdef CONFIG_PROC_CHILDREN REG("children", S_IRUGO, proc_tid_children_operations), #endif #ifdef CONFIG_NUMA REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations), #endif REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations), LNK("cwd", proc_cwd_link), LNK("root", proc_root_link), LNK("exe", proc_exe_link), REG("mounts", S_IRUGO, proc_mounts_operations), REG("mountinfo", S_IRUGO, proc_mountinfo_operations), #ifdef CONFIG_PROC_PAGE_MONITOR REG("clear_refs", S_IWUSR, proc_clear_refs_operations), REG("smaps", S_IRUGO, proc_pid_smaps_operations), REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations), REG("pagemap", S_IRUSR, proc_pagemap_operations), #endif #ifdef CONFIG_SECURITY DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations), #endif #ifdef CONFIG_KALLSYMS ONE("wchan", S_IRUGO, proc_pid_wchan), #endif #ifdef CONFIG_STACKTRACE ONE("stack", S_IRUSR, proc_pid_stack), #endif #ifdef CONFIG_SCHED_INFO ONE("schedstat", S_IRUGO, proc_pid_schedstat), #endif #ifdef CONFIG_LATENCYTOP REG("latency", S_IRUGO, proc_lstats_operations), #endif #ifdef CONFIG_PROC_PID_CPUSET ONE("cpuset", S_IRUGO, proc_cpuset_show), #endif #ifdef CONFIG_CGROUPS ONE("cgroup", S_IRUGO, proc_cgroup_show), #endif #ifdef CONFIG_PROC_CPU_RESCTRL ONE("cpu_resctrl_groups", S_IRUGO, proc_resctrl_show), #endif ONE("oom_score", S_IRUGO, proc_oom_score), REG("oom_adj", S_IRUGO|S_IWUSR, proc_oom_adj_operations), REG("oom_score_adj", 
S_IRUGO|S_IWUSR, proc_oom_score_adj_operations), #ifdef CONFIG_AUDIT REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations), REG("sessionid", S_IRUGO, proc_sessionid_operations), #endif #ifdef CONFIG_FAULT_INJECTION REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations), REG("fail-nth", 0644, proc_fail_nth_operations), #endif #ifdef CONFIG_TASK_IO_ACCOUNTING ONE("io", S_IRUSR, proc_tid_io_accounting), #endif #ifdef CONFIG_USER_NS REG("uid_map", S_IRUGO|S_IWUSR, proc_uid_map_operations), REG("gid_map", S_IRUGO|S_IWUSR, proc_gid_map_operations), REG("projid_map", S_IRUGO|S_IWUSR, proc_projid_map_operations), REG("setgroups", S_IRUGO|S_IWUSR, proc_setgroups_operations), #endif #ifdef CONFIG_LIVEPATCH ONE("patch_state", S_IRUSR, proc_pid_patch_state), #endif #ifdef CONFIG_PROC_PID_ARCH_STATUS ONE("arch_status", S_IRUGO, proc_pid_arch_status), #endif #ifdef CONFIG_SECCOMP_CACHE_DEBUG ONE("seccomp_cache", S_IRUSR, proc_pid_seccomp_cache), #endif #ifdef CONFIG_KSM ONE("ksm_merging_pages", S_IRUSR, proc_pid_ksm_merging_pages), ONE("ksm_stat", S_IRUSR, proc_pid_ksm_stat), #endif }; static int proc_tid_base_readdir(struct file *file, struct dir_context *ctx) { return proc_pident_readdir(file, ctx, tid_base_stuff, ARRAY_SIZE(tid_base_stuff)); } static struct dentry *proc_tid_base_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { return proc_pident_lookup(dir, dentry, tid_base_stuff, tid_base_stuff + ARRAY_SIZE(tid_base_stuff)); } static const struct file_operations proc_tid_base_operations = { .read = generic_read_dir, .iterate_shared = proc_tid_base_readdir, .llseek = generic_file_llseek, }; static const struct inode_operations proc_tid_base_inode_operations = { .lookup = proc_tid_base_lookup, .getattr = pid_getattr, .setattr = proc_setattr, }; static struct dentry *proc_task_instantiate(struct dentry *dentry, struct task_struct *task, const void *ptr) { struct inode *inode; inode = proc_pid_make_base_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO); if (!inode) return ERR_PTR(-ENOENT); inode->i_op = &proc_tid_base_inode_operations; inode->i_fop = &proc_tid_base_operations; inode->i_flags |= S_IMMUTABLE; set_nlink(inode, nlink_tid); pid_update_inode(task, inode); d_set_d_op(dentry, &pid_dentry_operations); return d_splice_alias(inode, dentry); } static struct dentry *proc_task_lookup(struct inode *dir, struct dentry * dentry, unsigned int flags) { struct task_struct *task; struct task_struct *leader = get_proc_task(dir); unsigned tid; struct proc_fs_info *fs_info; struct pid_namespace *ns; struct dentry *result = ERR_PTR(-ENOENT); if (!leader) goto out_no_task; tid = name_to_int(&dentry->d_name); if (tid == ~0U) goto out; fs_info = proc_sb_info(dentry->d_sb); ns = fs_info->pid_ns; rcu_read_lock(); task = find_task_by_pid_ns(tid, ns); if (task) get_task_struct(task); rcu_read_unlock(); if (!task) goto out; if (!same_thread_group(leader, task)) goto out_drop_task; result = proc_task_instantiate(dentry, task, NULL); out_drop_task: put_task_struct(task); out: put_task_struct(leader); out_no_task: return result; } /* * Find the first tid of a thread group to return to user space. * * Usually this is just the thread group leader, but if the users * buffer was too small or there was a seek into the middle of the * directory we have more work todo. * * In the case of a short read we start with find_task_by_pid. * * In the case of a seek we start with the leader and walk nr * threads past it. 
*/ static struct task_struct *first_tid(struct pid *pid, int tid, loff_t f_pos, struct pid_namespace *ns) { struct task_struct *pos, *task; unsigned long nr = f_pos; if (nr != f_pos) /* 32bit overflow? */ return NULL; rcu_read_lock(); task = pid_task(pid, PIDTYPE_PID); if (!task) goto fail; /* Attempt to start with the tid of a thread */ if (tid && nr) { pos = find_task_by_pid_ns(tid, ns); if (pos && same_thread_group(pos, task)) goto found; } /* If nr exceeds the number of threads there is nothing to do */ if (nr >= get_nr_threads(task)) goto fail; /* If we haven't found our starting place yet start * with the leader and walk nr threads forward. */ for_each_thread(task, pos) { if (!nr--) goto found; } fail: pos = NULL; goto out; found: get_task_struct(pos); out: rcu_read_unlock(); return pos; } /* * Find the next thread in the thread list. * Return NULL if there is an error or no next thread. * * The reference to the input task_struct is released. */ static struct task_struct *next_tid(struct task_struct *start) { struct task_struct *pos = NULL; rcu_read_lock(); if (pid_alive(start)) { pos = __next_thread(start); if (pos) get_task_struct(pos); } rcu_read_unlock(); put_task_struct(start); return pos; } /* for the /proc/TGID/task/ directories */ static int proc_task_readdir(struct file *file, struct dir_context *ctx) { struct inode *inode = file_inode(file); struct task_struct *task; struct pid_namespace *ns; int tid; if (proc_inode_is_dead(inode)) return -ENOENT; if (!dir_emit_dots(file, ctx)) return 0; /* We cache the tid value that the last readdir call couldn't * return and lseek resets it to 0. */ ns = proc_pid_ns(inode->i_sb); tid = (int)(intptr_t)file->private_data; file->private_data = NULL; for (task = first_tid(proc_pid(inode), tid, ctx->pos - 2, ns); task; task = next_tid(task), ctx->pos++) { char name[10 + 1]; unsigned int len; tid = task_pid_nr_ns(task, ns); if (!tid) continue; /* The task has just exited. */ len = snprintf(name, sizeof(name), "%u", tid); if (!proc_fill_cache(file, ctx, name, len, proc_task_instantiate, task, NULL)) { /* returning this tid failed, save it as the first * tid for the next readdir call */ file->private_data = (void *)(intptr_t)tid; put_task_struct(task); break; } } return 0; } static int proc_task_getattr(struct mnt_idmap *idmap, const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags) { struct inode *inode = d_inode(path->dentry); struct task_struct *p = get_proc_task(inode); generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat); if (p) { stat->nlink += get_nr_threads(p); put_task_struct(p); } return 0; } /* * proc_task_readdir() sets @file->private_data to a positive integer * value, so casting that to u64 is safe. generic_llseek_cookie() will * set @cookie to 0, so casting to an int is safe. The WARN_ON_ONCE() is * here to catch any unexpected change in behavior either in * proc_task_readdir() or generic_llseek_cookie().
*/ static loff_t proc_dir_llseek(struct file *file, loff_t offset, int whence) { u64 cookie = (u64)(intptr_t)file->private_data; loff_t off; off = generic_llseek_cookie(file, offset, whence, &cookie); WARN_ON_ONCE(cookie > INT_MAX); file->private_data = (void *)(intptr_t)cookie; /* serialized by f_pos_lock */ return off; } static const struct inode_operations proc_task_inode_operations = { .lookup = proc_task_lookup, .getattr = proc_task_getattr, .setattr = proc_setattr, .permission = proc_pid_permission, }; static const struct file_operations proc_task_operations = { .read = generic_read_dir, .iterate_shared = proc_task_readdir, .llseek = proc_dir_llseek, }; void __init set_proc_pid_nlink(void) { nlink_tid = pid_entry_nlink(tid_base_stuff, ARRAY_SIZE(tid_base_stuff)); nlink_tgid = pid_entry_nlink(tgid_base_stuff, ARRAY_SIZE(tgid_base_stuff)); }
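The procfs handlers above are easiest to understand from the consumer side. The following is a minimal userspace sketch (not part of the kernel sources in this document; paths and error handling are illustrative) that exercises two of them: writing /proc/self/timerslack_ns, which lands in timerslack_ns_write(), and listing /proc/self/task, whose entries are produced by proc_task_readdir() via first_tid()/next_tid().

#include <dirent.h>
#include <stdio.h>

int main(void)
{
	FILE *f;
	DIR *d;
	struct dirent *de;
	char line[64];

	/* Writing "0" asks timerslack_ns_write() to restore the task's
	 * default_timer_slack_ns; the read-back goes through
	 * timerslack_ns_show(). */
	f = fopen("/proc/self/timerslack_ns", "r+");
	if (f) {
		fputs("0\n", f);
		fflush(f);
		rewind(f);
		if (fgets(line, sizeof(line), f))
			printf("timer slack: %s", line);
		fclose(f);
	}

	/* Each numeric entry here is emitted by proc_task_readdir(). */
	d = opendir("/proc/self/task");
	if (d) {
		while ((de = readdir(d)))
			if (de->d_name[0] != '.')
				printf("tid %s\n", de->d_name);
		closedir(d);
	}
	return 0;
}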
// SPDX-License-Identifier: GPL-2.0-or-later /* * INET An implementation of the TCP/IP protocol suite for the LINUX * operating system. INET is implemented using the BSD Socket * interface as the means of communication with the user level. * * Generic INET6 transport hashtables * * Authors: Lotsa people, from code originally in tcp, generalised here * by Arnaldo Carvalho de Melo <acme@mandriva.com> */ #include <linux/module.h> #include <linux/random.h> #include <net/addrconf.h> #include <net/hotdata.h> #include <net/inet_connection_sock.h> #include <net/inet_hashtables.h> #include <net/inet6_hashtables.h> #include <net/secure_seq.h> #include <net/ip.h> #include <net/sock_reuseport.h> #include <net/tcp.h> u32 inet6_ehashfn(const struct net *net, const struct in6_addr *laddr, const u16 lport, const struct in6_addr *faddr, const __be16 fport) { u32 lhash, fhash; net_get_random_once(&inet6_ehash_secret, sizeof(inet6_ehash_secret)); net_get_random_once(&tcp_ipv6_hash_secret, sizeof(tcp_ipv6_hash_secret)); lhash = (__force u32)laddr->s6_addr32[3]; fhash = __ipv6_addr_jhash(faddr, tcp_ipv6_hash_secret); return lport + __inet6_ehashfn(lhash, 0, fhash, fport, inet6_ehash_secret + net_hash_mix(net)); } EXPORT_SYMBOL_GPL(inet6_ehashfn); /* * Sockets in TCP_CLOSE state are _always_ taken out of the hash, so * we need not check it for TCP lookups anymore, thanks Alexey. -DaveM * * The sockhash lock must be held as a reader here. */ struct sock *__inet6_lookup_established(const struct net *net, struct inet_hashinfo *hashinfo, const struct in6_addr *saddr, const __be16 sport, const struct in6_addr *daddr, const u16 hnum, const int dif, const int sdif) { struct sock *sk; const struct hlist_nulls_node *node; const __portpair ports = INET_COMBINED_PORTS(sport, hnum); /* Optimize here for direct hit, only listening connections can * have wildcards anyways.
*/ unsigned int hash = inet6_ehashfn(net, daddr, hnum, saddr, sport); unsigned int slot = hash & hashinfo->ehash_mask; struct inet_ehash_bucket *head = &hashinfo->ehash[slot]; begin: sk_nulls_for_each_rcu(sk, node, &head->chain) { if (sk->sk_hash != hash) continue; if (!inet6_match(net, sk, saddr, daddr, ports, dif, sdif)) continue; if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt))) goto out; if (unlikely(!inet6_match(net, sk, saddr, daddr, ports, dif, sdif))) { sock_gen_put(sk); goto begin; } goto found; } if (get_nulls_value(node) != slot) goto begin; out: sk = NULL; found: return sk; } EXPORT_SYMBOL(__inet6_lookup_established); static inline int compute_score(struct sock *sk, const struct net *net, const unsigned short hnum, const struct in6_addr *daddr, const int dif, const int sdif) { int score = -1; if (net_eq(sock_net(sk), net) && inet_sk(sk)->inet_num == hnum && sk->sk_family == PF_INET6) { if (!ipv6_addr_equal(&sk->sk_v6_rcv_saddr, daddr)) return -1; if (!inet_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif)) return -1; score = sk->sk_bound_dev_if ? 2 : 1; if (READ_ONCE(sk->sk_incoming_cpu) == raw_smp_processor_id()) score++; } return score; } /** * inet6_lookup_reuseport() - execute reuseport logic on AF_INET6 socket if necessary. * @net: network namespace. * @sk: AF_INET6 socket, must be in TCP_LISTEN state for TCP or TCP_CLOSE for UDP. * @skb: context for a potential SK_REUSEPORT program. * @doff: header offset. * @saddr: source address. * @sport: source port. * @daddr: destination address. * @hnum: destination port in host byte order. * @ehashfn: hash function used to generate the fallback hash. * * Return: NULL if sk doesn't have SO_REUSEPORT set, otherwise a pointer to * the selected sock or an error. */ struct sock *inet6_lookup_reuseport(const struct net *net, struct sock *sk, struct sk_buff *skb, int doff, const struct in6_addr *saddr, __be16 sport, const struct in6_addr *daddr, unsigned short hnum, inet6_ehashfn_t *ehashfn) { struct sock *reuse_sk = NULL; u32 phash; if (sk->sk_reuseport) { phash = INDIRECT_CALL_INET(ehashfn, udp6_ehashfn, inet6_ehashfn, net, daddr, hnum, saddr, sport); reuse_sk = reuseport_select_sock(sk, phash, skb, doff); } return reuse_sk; } EXPORT_SYMBOL_GPL(inet6_lookup_reuseport); /* called with rcu_read_lock() */ static struct sock *inet6_lhash2_lookup(const struct net *net, struct inet_listen_hashbucket *ilb2, struct sk_buff *skb, int doff, const struct in6_addr *saddr, const __be16 sport, const struct in6_addr *daddr, const unsigned short hnum, const int dif, const int sdif) { struct sock *sk, *result = NULL; struct hlist_nulls_node *node; int score, hiscore = 0; sk_nulls_for_each_rcu(sk, node, &ilb2->nulls_head) { score = compute_score(sk, net, hnum, daddr, dif, sdif); if (score > hiscore) { result = inet6_lookup_reuseport(net, sk, skb, doff, saddr, sport, daddr, hnum, inet6_ehashfn); if (result) return result; result = sk; hiscore = score; } } return result; } struct sock *inet6_lookup_run_sk_lookup(const struct net *net, int protocol, struct sk_buff *skb, int doff, const struct in6_addr *saddr, const __be16 sport, const struct in6_addr *daddr, const u16 hnum, const int dif, inet6_ehashfn_t *ehashfn) { struct sock *sk, *reuse_sk; bool no_reuseport; no_reuseport = bpf_sk_lookup_run_v6(net, protocol, saddr, sport, daddr, hnum, dif, &sk); if (no_reuseport || IS_ERR_OR_NULL(sk)) return sk; reuse_sk = inet6_lookup_reuseport(net, sk, skb, doff, saddr, sport, daddr, hnum, ehashfn); if (reuse_sk) sk = reuse_sk; return sk; } 
EXPORT_SYMBOL_GPL(inet6_lookup_run_sk_lookup); struct sock *inet6_lookup_listener(const struct net *net, struct inet_hashinfo *hashinfo, struct sk_buff *skb, int doff, const struct in6_addr *saddr, const __be16 sport, const struct in6_addr *daddr, const unsigned short hnum, const int dif, const int sdif) { struct inet_listen_hashbucket *ilb2; struct sock *result = NULL; unsigned int hash2; /* Lookup redirect from BPF */ if (static_branch_unlikely(&bpf_sk_lookup_enabled) && hashinfo == net->ipv4.tcp_death_row.hashinfo) { result = inet6_lookup_run_sk_lookup(net, IPPROTO_TCP, skb, doff, saddr, sport, daddr, hnum, dif, inet6_ehashfn); if (result) goto done; } hash2 = ipv6_portaddr_hash(net, daddr, hnum); ilb2 = inet_lhash2_bucket(hashinfo, hash2); result = inet6_lhash2_lookup(net, ilb2, skb, doff, saddr, sport, daddr, hnum, dif, sdif); if (result) goto done; /* Lookup lhash2 with in6addr_any */ hash2 = ipv6_portaddr_hash(net, &in6addr_any, hnum); ilb2 = inet_lhash2_bucket(hashinfo, hash2); result = inet6_lhash2_lookup(net, ilb2, skb, doff, saddr, sport, &in6addr_any, hnum, dif, sdif); done: if (IS_ERR(result)) return NULL; return result; } EXPORT_SYMBOL_GPL(inet6_lookup_listener); struct sock *inet6_lookup(const struct net *net, struct inet_hashinfo *hashinfo, struct sk_buff *skb, int doff, const struct in6_addr *saddr, const __be16 sport, const struct in6_addr *daddr, const __be16 dport, const int dif) { struct sock *sk; bool refcounted; sk = __inet6_lookup(net, hashinfo, skb, doff, saddr, sport, daddr, ntohs(dport), dif, 0, &refcounted); if (sk && !refcounted && !refcount_inc_not_zero(&sk->sk_refcnt)) sk = NULL; return sk; } EXPORT_SYMBOL_GPL(inet6_lookup); static int __inet6_check_established(struct inet_timewait_death_row *death_row, struct sock *sk, const __u16 lport, struct inet_timewait_sock **twp, bool rcu_lookup, u32 hash) { struct inet_hashinfo *hinfo = death_row->hashinfo; struct inet_sock *inet = inet_sk(sk); const struct in6_addr *daddr = &sk->sk_v6_rcv_saddr; const struct in6_addr *saddr = &sk->sk_v6_daddr; const int dif = sk->sk_bound_dev_if; struct net *net = sock_net(sk); const int sdif = l3mdev_master_ifindex_by_index(net, dif); const __portpair ports = INET_COMBINED_PORTS(inet->inet_dport, lport); struct inet_ehash_bucket *head = inet_ehash_bucket(hinfo, hash); struct inet_timewait_sock *tw = NULL; const struct hlist_nulls_node *node; struct sock *sk2; spinlock_t *lock; if (rcu_lookup) { sk_nulls_for_each(sk2, node, &head->chain) { if (sk2->sk_hash != hash || !inet6_match(net, sk2, saddr, daddr, ports, dif, sdif)) continue; if (sk2->sk_state == TCP_TIME_WAIT) break; return -EADDRNOTAVAIL; } return 0; } lock = inet_ehash_lockp(hinfo, hash); spin_lock(lock); sk_nulls_for_each(sk2, node, &head->chain) { if (sk2->sk_hash != hash) continue; if (likely(inet6_match(net, sk2, saddr, daddr, ports, dif, sdif))) { if (sk2->sk_state == TCP_TIME_WAIT) { tw = inet_twsk(sk2); if (sk->sk_protocol == IPPROTO_TCP && tcp_twsk_unique(sk, sk2, twp)) break; } goto not_unique; } } /* Must record num and sport now. Otherwise we will see * in hash table socket with a funny identity. */ inet->inet_num = lport; inet->inet_sport = htons(lport); sk->sk_hash = hash; WARN_ON(!sk_unhashed(sk)); __sk_nulls_add_node_rcu(sk, &head->chain); if (tw) { sk_nulls_del_node_init_rcu((struct sock *)tw); __NET_INC_STATS(net, LINUX_MIB_TIMEWAITRECYCLED); } spin_unlock(lock); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); if (twp) { *twp = tw; } else if (tw) { /* Silly. Should hash-dance instead... 
*/ inet_twsk_deschedule_put(tw); } return 0; not_unique: spin_unlock(lock); return -EADDRNOTAVAIL; } static u64 inet6_sk_port_offset(const struct sock *sk) { const struct inet_sock *inet = inet_sk(sk); return secure_ipv6_port_ephemeral(sk->sk_v6_rcv_saddr.s6_addr32, sk->sk_v6_daddr.s6_addr32, inet->inet_dport); } int inet6_hash_connect(struct inet_timewait_death_row *death_row, struct sock *sk) { const struct in6_addr *daddr = &sk->sk_v6_rcv_saddr; const struct in6_addr *saddr = &sk->sk_v6_daddr; const struct inet_sock *inet = inet_sk(sk); const struct net *net = sock_net(sk); u64 port_offset = 0; u32 hash_port0; if (!inet_sk(sk)->inet_num) port_offset = inet6_sk_port_offset(sk); hash_port0 = inet6_ehashfn(net, daddr, 0, saddr, inet->inet_dport); return __inet_hash_connect(death_row, sk, port_offset, hash_port0, __inet6_check_established); } EXPORT_SYMBOL_GPL(inet6_hash_connect); int inet6_hash(struct sock *sk) { int err = 0; if (sk->sk_state != TCP_CLOSE) err = __inet_hash(sk, NULL); return err; } EXPORT_SYMBOL_GPL(inet6_hash);
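Nothing in userspace calls these lookup routines directly; they run underneath ordinary socket calls. The sketch below (address, port, and error handling are illustrative) shows the path that exercises inet6_hash_connect(): a plain connect() on an unbound AF_INET6 TCP socket, during which the kernel picks an ephemeral source port using __inet6_check_established() as the uniqueness test and hashes the socket into the established table that __inet6_lookup_established() later searches for incoming segments.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct sockaddr_in6 dst;
	int fd = socket(AF_INET6, SOCK_STREAM, 0);

	if (fd < 0)
		return 1;
	memset(&dst, 0, sizeof(dst));
	dst.sin6_family = AF_INET6;
	dst.sin6_port = htons(8080);	/* assumed listener port */
	inet_pton(AF_INET6, "::1", &dst.sin6_addr);

	/* Kernel side: tcp_v6_connect() -> inet6_hash_connect() picks a
	 * source port whose (saddr, sport, daddr, dport) 4-tuple passes
	 * __inet6_check_established(), then hashes the socket into ehash. */
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0)
		printf("connected; the 4-tuple now lives in the ehash table\n");
	close(fd);
	return 0;
}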
// SPDX-License-Identifier: GPL-2.0-or-later /* RxRPC key management * * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
* Written by David Howells (dhowells@redhat.com) * * RxRPC keys should have a description describing their purpose: * "afs@example.com" */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <crypto/skcipher.h> #include <linux/module.h> #include <linux/net.h> #include <linux/skbuff.h> #include <linux/key-type.h> #include <linux/ctype.h> #include <linux/slab.h> #include <net/sock.h> #include <net/af_rxrpc.h> #include <keys/rxrpc-type.h> #include <keys/user-type.h> #include "ar-internal.h" static int rxrpc_preparse(struct key_preparsed_payload *); static void rxrpc_free_preparse(struct key_preparsed_payload *); static void rxrpc_destroy(struct key *); static void rxrpc_describe(const struct key *, struct seq_file *); static long rxrpc_read(const struct key *, char *, size_t); /* * rxrpc defined keys take an arbitrary string as the description and an * arbitrary blob of data as the payload */ struct key_type key_type_rxrpc = { .name = "rxrpc", .flags = KEY_TYPE_NET_DOMAIN, .preparse = rxrpc_preparse, .free_preparse = rxrpc_free_preparse, .instantiate = generic_key_instantiate, .destroy = rxrpc_destroy, .describe = rxrpc_describe, .read = rxrpc_read, }; EXPORT_SYMBOL(key_type_rxrpc); /* * parse an RxKAD type XDR format token * - the caller guarantees we have at least 4 words */ static int rxrpc_preparse_xdr_rxkad(struct key_preparsed_payload *prep, size_t datalen, const __be32 *xdr, unsigned int toklen) { struct rxrpc_key_token *token, **pptoken; time64_t expiry; size_t plen; u32 tktlen; _enter(",{%x,%x,%x,%x},%u", ntohl(xdr[0]), ntohl(xdr[1]), ntohl(xdr[2]), ntohl(xdr[3]), toklen); if (toklen <= 8 * 4) return -EKEYREJECTED; tktlen = ntohl(xdr[7]); _debug("tktlen: %x", tktlen); if (tktlen > AFSTOKEN_RK_TIX_MAX) return -EKEYREJECTED; if (toklen < 8 * 4 + tktlen) return -EKEYREJECTED; plen = sizeof(*token) + sizeof(*token->kad) + tktlen; prep->quotalen = datalen + plen; plen -= sizeof(*token); token = kzalloc(sizeof(*token), GFP_KERNEL); if (!token) return -ENOMEM; token->kad = kzalloc(plen, GFP_KERNEL); if (!token->kad) { kfree(token); return -ENOMEM; } token->security_index = RXRPC_SECURITY_RXKAD; token->kad->ticket_len = tktlen; token->kad->vice_id = ntohl(xdr[0]); token->kad->kvno = ntohl(xdr[1]); token->kad->start = ntohl(xdr[4]); token->kad->expiry = ntohl(xdr[5]); token->kad->primary_flag = ntohl(xdr[6]); memcpy(&token->kad->session_key, &xdr[2], 8); memcpy(&token->kad->ticket, &xdr[8], tktlen); _debug("SCIX: %u", token->security_index); _debug("TLEN: %u", token->kad->ticket_len); _debug("EXPY: %x", token->kad->expiry); _debug("KVNO: %u", token->kad->kvno); _debug("PRIM: %u", token->kad->primary_flag); _debug("SKEY: %02x%02x%02x%02x%02x%02x%02x%02x", token->kad->session_key[0], token->kad->session_key[1], token->kad->session_key[2], token->kad->session_key[3], token->kad->session_key[4], token->kad->session_key[5], token->kad->session_key[6], token->kad->session_key[7]); if (token->kad->ticket_len >= 8) _debug("TCKT: %02x%02x%02x%02x%02x%02x%02x%02x", token->kad->ticket[0], token->kad->ticket[1], token->kad->ticket[2], token->kad->ticket[3], token->kad->ticket[4], token->kad->ticket[5], token->kad->ticket[6], token->kad->ticket[7]); /* count the number of tokens attached */ prep->payload.data[1] = (void *)((unsigned long)prep->payload.data[1] + 1); /* attach the data */ for (pptoken = (struct rxrpc_key_token **)&prep->payload.data[0]; *pptoken; pptoken = &(*pptoken)->next) continue; *pptoken = token; expiry = rxrpc_u32_to_time64(token->kad->expiry); if (expiry < prep->expiry)
prep->expiry = expiry; _leave(" = 0"); return 0; } /* * attempt to parse the data as the XDR format * - the caller guarantees we have more than 7 words */ static int rxrpc_preparse_xdr(struct key_preparsed_payload *prep) { const __be32 *xdr = prep->data, *token, *p; const char *cp; unsigned int len, paddedlen, loop, ntoken, toklen, sec_ix; size_t datalen = prep->datalen; int ret, ret2; _enter(",{%x,%x,%x,%x},%zu", ntohl(xdr[0]), ntohl(xdr[1]), ntohl(xdr[2]), ntohl(xdr[3]), prep->datalen); if (datalen > AFSTOKEN_LENGTH_MAX) goto not_xdr; /* XDR is an array of __be32's */ if (datalen & 3) goto not_xdr; /* the flags should be 0 (the setpag bit must be handled by * userspace) */ if (ntohl(*xdr++) != 0) goto not_xdr; datalen -= 4; /* check the cell name */ len = ntohl(*xdr++); if (len < 1 || len > AFSTOKEN_CELL_MAX) goto not_xdr; datalen -= 4; paddedlen = (len + 3) & ~3; if (paddedlen > datalen) goto not_xdr; cp = (const char *) xdr; for (loop = 0; loop < len; loop++) if (!isprint(cp[loop])) goto not_xdr; for (; loop < paddedlen; loop++) if (cp[loop]) goto not_xdr; _debug("cellname: [%u/%u] '%*.*s'", len, paddedlen, len, len, (const char *) xdr); datalen -= paddedlen; xdr += paddedlen >> 2; /* get the token count */ if (datalen < 12) goto not_xdr; ntoken = ntohl(*xdr++); datalen -= 4; _debug("ntoken: %x", ntoken); if (ntoken < 1 || ntoken > AFSTOKEN_MAX) goto not_xdr; /* check each token wrapper */ p = xdr; loop = ntoken; do { if (datalen < 8) goto not_xdr; toklen = ntohl(*p++); sec_ix = ntohl(*p); datalen -= 4; _debug("token: [%x/%zx] %x", toklen, datalen, sec_ix); paddedlen = (toklen + 3) & ~3; if (toklen < 20 || toklen > datalen || paddedlen > datalen) goto not_xdr; datalen -= paddedlen; p += paddedlen >> 2; } while (--loop > 0); _debug("remainder: %zu", datalen); if (datalen != 0) goto not_xdr; /* okay: we're going to assume it's valid XDR format * - we ignore the cellname, relying on the key to be correctly named */ ret = -EPROTONOSUPPORT; do { toklen = ntohl(*xdr++); token = xdr; xdr += (toklen + 3) / 4; sec_ix = ntohl(*token++); toklen -= 4; _debug("TOKEN type=%x len=%x", sec_ix, toklen); switch (sec_ix) { case RXRPC_SECURITY_RXKAD: ret2 = rxrpc_preparse_xdr_rxkad(prep, datalen, token, toklen); break; default: ret2 = -EPROTONOSUPPORT; break; } switch (ret2) { case 0: ret = 0; break; case -EPROTONOSUPPORT: break; case -ENOPKG: if (ret != 0) ret = -ENOPKG; break; default: ret = ret2; goto error; } } while (--ntoken > 0); error: _leave(" = %d", ret); return ret; not_xdr: _leave(" = -EPROTO"); return -EPROTO; } /* * Preparse an rxrpc defined key. 
* * Data should be of the form: * OFFSET LEN CONTENT * 0 4 key interface version number * 4 2 security index (type) * 6 2 ticket length * 8 4 key expiry time (time_t) * 12 4 kvno * 16 8 session key * 24 [len] ticket * * if no data is provided, then a no-security key is made */ static int rxrpc_preparse(struct key_preparsed_payload *prep) { const struct rxrpc_key_data_v1 *v1; struct rxrpc_key_token *token, **pp; time64_t expiry; size_t plen; u32 kver; int ret; _enter("%zu", prep->datalen); /* handle a no-security key */ if (!prep->data && prep->datalen == 0) return 0; /* determine if the XDR payload format is being used */ if (prep->datalen > 7 * 4) { ret = rxrpc_preparse_xdr(prep); if (ret != -EPROTO) return ret; } /* get the key interface version number */ ret = -EINVAL; if (prep->datalen <= 4 || !prep->data) goto error; memcpy(&kver, prep->data, sizeof(kver)); prep->data += sizeof(kver); prep->datalen -= sizeof(kver); _debug("KEY I/F VERSION: %u", kver); ret = -EKEYREJECTED; if (kver != 1) goto error; /* deal with a version 1 key */ ret = -EINVAL; if (prep->datalen < sizeof(*v1)) goto error; v1 = prep->data; if (prep->datalen != sizeof(*v1) + v1->ticket_length) goto error; _debug("SCIX: %u", v1->security_index); _debug("TLEN: %u", v1->ticket_length); _debug("EXPY: %x", v1->expiry); _debug("KVNO: %u", v1->kvno); _debug("SKEY: %02x%02x%02x%02x%02x%02x%02x%02x", v1->session_key[0], v1->session_key[1], v1->session_key[2], v1->session_key[3], v1->session_key[4], v1->session_key[5], v1->session_key[6], v1->session_key[7]); if (v1->ticket_length >= 8) _debug("TCKT: %02x%02x%02x%02x%02x%02x%02x%02x", v1->ticket[0], v1->ticket[1], v1->ticket[2], v1->ticket[3], v1->ticket[4], v1->ticket[5], v1->ticket[6], v1->ticket[7]); ret = -EPROTONOSUPPORT; if (v1->security_index != RXRPC_SECURITY_RXKAD) goto error; plen = sizeof(*token->kad) + v1->ticket_length; prep->quotalen = plen + sizeof(*token); ret = -ENOMEM; token = kzalloc(sizeof(*token), GFP_KERNEL); if (!token) goto error; token->kad = kzalloc(plen, GFP_KERNEL); if (!token->kad) goto error_free; token->security_index = RXRPC_SECURITY_RXKAD; token->kad->ticket_len = v1->ticket_length; token->kad->expiry = v1->expiry; token->kad->kvno = v1->kvno; memcpy(&token->kad->session_key, &v1->session_key, 8); memcpy(&token->kad->ticket, v1->ticket, v1->ticket_length); /* count the number of tokens attached */ prep->payload.data[1] = (void *)((unsigned long)prep->payload.data[1] + 1); /* attach the data */ pp = (struct rxrpc_key_token **)&prep->payload.data[0]; while (*pp) pp = &(*pp)->next; *pp = token; expiry = rxrpc_u32_to_time64(token->kad->expiry); if (expiry < prep->expiry) prep->expiry = expiry; token = NULL; ret = 0; error_free: kfree(token); error: return ret; } /* * Free token list. */ static void rxrpc_free_token_list(struct rxrpc_key_token *token) { struct rxrpc_key_token *next; for (; token; token = next) { next = token->next; switch (token->security_index) { case RXRPC_SECURITY_RXKAD: kfree(token->kad); break; default: pr_err("Unknown token type %x on rxrpc key\n", token->security_index); BUG(); } kfree(token); } } /* * Clean up preparse data. 
*/ static void rxrpc_free_preparse(struct key_preparsed_payload *prep) { rxrpc_free_token_list(prep->payload.data[0]); } /* * dispose of the data dangling from the corpse of a rxrpc key */ static void rxrpc_destroy(struct key *key) { rxrpc_free_token_list(key->payload.data[0]); } /* * describe the rxrpc key */ static void rxrpc_describe(const struct key *key, struct seq_file *m) { const struct rxrpc_key_token *token; const char *sep = ": "; seq_puts(m, key->description); for (token = key->payload.data[0]; token; token = token->next) { seq_puts(m, sep); switch (token->security_index) { case RXRPC_SECURITY_RXKAD: seq_puts(m, "ka"); break; default: /* we have a ticket we can't encode */ seq_printf(m, "%u", token->security_index); break; } sep = " "; } } /* * grab the security key for a socket */ int rxrpc_request_key(struct rxrpc_sock *rx, sockptr_t optval, int optlen) { struct key *key; char *description; _enter(""); if (optlen <= 0 || optlen > PAGE_SIZE - 1 || rx->securities) return -EINVAL; description = memdup_sockptr_nul(optval, optlen); if (IS_ERR(description)) return PTR_ERR(description); key = request_key_net(&key_type_rxrpc, description, sock_net(&rx->sk), NULL); if (IS_ERR(key)) { kfree(description); _leave(" = %ld", PTR_ERR(key)); return PTR_ERR(key); } rx->key = key; kfree(description); _leave(" = 0 [key %x]", key->serial); return 0; } /* * generate a server data key */ int rxrpc_get_server_data_key(struct rxrpc_connection *conn, const void *session_key, time64_t expiry, u32 kvno) { const struct cred *cred = current_cred(); struct key *key; int ret; struct { u32 kver; struct rxrpc_key_data_v1 v1; } data; _enter(""); key = key_alloc(&key_type_rxrpc, "x", GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, cred, 0, KEY_ALLOC_NOT_IN_QUOTA, NULL); if (IS_ERR(key)) { _leave(" = -ENOMEM [alloc %ld]", PTR_ERR(key)); return -ENOMEM; } _debug("key %d", key_serial(key)); data.kver = 1; data.v1.security_index = RXRPC_SECURITY_RXKAD; data.v1.ticket_length = 0; data.v1.expiry = rxrpc_time64_to_u32(expiry); data.v1.kvno = 0; memcpy(&data.v1.session_key, session_key, sizeof(data.v1.session_key)); ret = key_instantiate_and_link(key, &data, sizeof(data), NULL, NULL); if (ret < 0) goto error; conn->key = key; _leave(" = 0 [%d]", key_serial(key)); return 0; error: key_revoke(key); key_put(key); _leave(" = -ENOMEM [ins %d]", ret); return -ENOMEM; } EXPORT_SYMBOL(rxrpc_get_server_data_key); /** * rxrpc_get_null_key - Generate a null RxRPC key * @keyname: The name to give the key. * * Generate a null RxRPC key that can be used to indicate anonymous security is * required for a particular domain. 
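 * (An illustrative aside follows; this comment resumes below it.)
 */

/*
 * Illustrative sketch, not part of the original file: rxrpc_request_key()
 * above receives only a key *description* through setsockopt() and looks
 * the key up itself, so a client typically attaches security like this
 * (the cell name is a placeholder; error handling elided):
 */
#include <string.h>
#include <sys/socket.h>
#include <linux/rxrpc.h>

#ifndef SOL_RXRPC
#define SOL_RXRPC 272		/* value from the kernel's socket.h */
#endif

static int example_attach_security(int rxrpc_fd)
{
	static const char desc[] = "afs@EXAMPLE.COM";

	/* The kernel duplicates the string and runs request_key_net(),
	 * which may upcall to userspace to construct the key.
	 */
	return setsockopt(rxrpc_fd, SOL_RXRPC, RXRPC_SECURITY_KEY,
			  desc, sizeof(desc) - 1);
}

/* rxrpc_get_null_key() itself (kernel-doc above):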
*/ struct key *rxrpc_get_null_key(const char *keyname) { const struct cred *cred = current_cred(); struct key *key; int ret; key = key_alloc(&key_type_rxrpc, keyname, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, cred, KEY_POS_SEARCH, KEY_ALLOC_NOT_IN_QUOTA, NULL); if (IS_ERR(key)) return key; ret = key_instantiate_and_link(key, NULL, 0, NULL, NULL); if (ret < 0) { key_revoke(key); key_put(key); return ERR_PTR(ret); } return key; } EXPORT_SYMBOL(rxrpc_get_null_key); /* * read the contents of an rxrpc key * - this returns the result in XDR form */ static long rxrpc_read(const struct key *key, char *buffer, size_t buflen) { const struct rxrpc_key_token *token; size_t size; __be32 *xdr, *oldxdr; u32 cnlen, toksize, ntoks, tok, zero; u16 toksizes[AFSTOKEN_MAX]; _enter(""); /* we don't know what form we should return non-AFS keys in */ if (memcmp(key->description, "afs@", 4) != 0) return -EOPNOTSUPP; cnlen = strlen(key->description + 4); #define RND(X) (((X) + 3) & ~3) /* AFS keys we return in XDR form, so we need to work out the size of * the XDR */ size = 2 * 4; /* flags, cellname len */ size += RND(cnlen); /* cellname */ size += 1 * 4; /* token count */ ntoks = 0; for (token = key->payload.data[0]; token; token = token->next) { toksize = 4; /* sec index */ switch (token->security_index) { case RXRPC_SECURITY_RXKAD: toksize += 8 * 4; /* viceid, kvno, key*2, begin, * end, primary, tktlen */ if (!token->no_leak_key) toksize += RND(token->kad->ticket_len); break; default: /* we have a ticket we can't encode */ pr_err("Unsupported key token type (%u)\n", token->security_index); return -ENOPKG; } _debug("token[%u]: toksize=%u", ntoks, toksize); if (WARN_ON(toksize > AFSTOKEN_LENGTH_MAX)) return -EIO; toksizes[ntoks++] = toksize; size += toksize + 4; /* each token has a length word */ } #undef RND if (!buffer || buflen < size) return size; xdr = (__be32 *)buffer; zero = 0; #define ENCODE(x) \ do { \ *xdr++ = htonl(x); \ } while(0) #define ENCODE_DATA(l, s) \ do { \ u32 _l = (l); \ ENCODE(l); \ memcpy(xdr, (s), _l); \ if (_l & 3) \ memcpy((u8 *)xdr + _l, &zero, 4 - (_l & 3)); \ xdr += (_l + 3) >> 2; \ } while(0) #define ENCODE_BYTES(l, s) \ do { \ u32 _l = (l); \ memcpy(xdr, (s), _l); \ if (_l & 3) \ memcpy((u8 *)xdr + _l, &zero, 4 - (_l & 3)); \ xdr += (_l + 3) >> 2; \ } while(0) #define ENCODE64(x) \ do { \ __be64 y = cpu_to_be64(x); \ memcpy(xdr, &y, 8); \ xdr += 8 >> 2; \ } while(0) #define ENCODE_STR(s) \ do { \ const char *_s = (s); \ ENCODE_DATA(strlen(_s), _s); \ } while(0) ENCODE(0); /* flags */ ENCODE_DATA(cnlen, key->description + 4); /* cellname */ ENCODE(ntoks); tok = 0; for (token = key->payload.data[0]; token; token = token->next) { toksize = toksizes[tok++]; ENCODE(toksize); oldxdr = xdr; ENCODE(token->security_index); switch (token->security_index) { case RXRPC_SECURITY_RXKAD: ENCODE(token->kad->vice_id); ENCODE(token->kad->kvno); ENCODE_BYTES(8, token->kad->session_key); ENCODE(token->kad->start); ENCODE(token->kad->expiry); ENCODE(token->kad->primary_flag); if (token->no_leak_key) ENCODE(0); else ENCODE_DATA(token->kad->ticket_len, token->kad->ticket); break; default: pr_err("Unsupported key token type (%u)\n", token->security_index); return -ENOPKG; } if (WARN_ON((unsigned long)xdr - (unsigned long)oldxdr != toksize)) return -EIO; } #undef ENCODE_STR #undef ENCODE_DATA #undef ENCODE64 #undef ENCODE if (WARN_ON(tok != ntoks)) return -EIO; if (WARN_ON((unsigned long)xdr - (unsigned long)buffer != size)) return -EIO; _leave(" = %zu", size); return size; } |
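/*
 * Illustrative sketch, not part of the original file: rxrpc_read() above
 * sizes the XDR blob before encoding it.  This standalone helper redoes
 * that arithmetic for a payload carrying a single rxkad token -- fixed
 * header (flags word, cell-name length plus padded cell name, token
 * count), one length word for the token, then the token body.
 */
#include <stddef.h>
#include <string.h>

#define EX_RND(x) (((x) + 3) & ~(size_t)3)

static size_t example_rxkad_xdr_size(const char *cell, size_t ticket_len)
{
	size_t size = 2 * 4;			/* flags, cellname len */

	size += EX_RND(strlen(cell));		/* padded cellname */
	size += 1 * 4;				/* token count */
	size += 4;				/* per-token length word */
	size += 4 + 8 * 4 + EX_RND(ticket_len);	/* sec index + rxkad body */
	return size;
}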
/* BlueZ - Bluetooth protocol stack for Linux Copyright (C) 2000-2001 Qualcomm Incorporated Copyright (C) 2011 ProFUSION Embedded Systems Written 2000,2001 by Maxim Krasnyansky <maxk@qualcomm.com> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) AND AUTHOR(S) BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. ALL LIABILITY, INCLUDING LIABILITY FOR INFRINGEMENT OF ANY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS, RELATING TO USE OF THIS SOFTWARE IS DISCLAIMED. */ /* Bluetooth HCI core.
*/ #include <linux/export.h> #include <linux/rfkill.h> #include <linux/debugfs.h> #include <linux/crypto.h> #include <linux/kcov.h> #include <linux/property.h> #include <linux/suspend.h> #include <linux/wait.h> #include <linux/unaligned.h> #include <net/bluetooth/bluetooth.h> #include <net/bluetooth/hci_core.h> #include <net/bluetooth/l2cap.h> #include <net/bluetooth/mgmt.h> #include "hci_debugfs.h" #include "smp.h" #include "leds.h" #include "msft.h" #include "aosp.h" #include "hci_codec.h" static void hci_rx_work(struct work_struct *work); static void hci_cmd_work(struct work_struct *work); static void hci_tx_work(struct work_struct *work); /* HCI device list */ LIST_HEAD(hci_dev_list); DEFINE_RWLOCK(hci_dev_list_lock); /* HCI callback list */ LIST_HEAD(hci_cb_list); DEFINE_MUTEX(hci_cb_list_lock); /* HCI ID Numbering */ static DEFINE_IDA(hci_index_ida); /* Get HCI device by index. * Device is held on return. */ struct hci_dev *hci_dev_get(int index) { struct hci_dev *hdev = NULL, *d; BT_DBG("%d", index); if (index < 0) return NULL; read_lock(&hci_dev_list_lock); list_for_each_entry(d, &hci_dev_list, list) { if (d->id == index) { hdev = hci_dev_hold(d); break; } } read_unlock(&hci_dev_list_lock); return hdev; } /* ---- Inquiry support ---- */ bool hci_discovery_active(struct hci_dev *hdev) { struct discovery_state *discov = &hdev->discovery; switch (discov->state) { case DISCOVERY_FINDING: case DISCOVERY_RESOLVING: return true; default: return false; } } void hci_discovery_set_state(struct hci_dev *hdev, int state) { int old_state = hdev->discovery.state; if (old_state == state) return; hdev->discovery.state = state; switch (state) { case DISCOVERY_STOPPED: hci_update_passive_scan(hdev); if (old_state != DISCOVERY_STARTING) mgmt_discovering(hdev, 0); break; case DISCOVERY_STARTING: break; case DISCOVERY_FINDING: mgmt_discovering(hdev, 1); break; case DISCOVERY_RESOLVING: break; case DISCOVERY_STOPPING: break; } bt_dev_dbg(hdev, "state %u -> %u", old_state, state); } void hci_inquiry_cache_flush(struct hci_dev *hdev) { struct discovery_state *cache = &hdev->discovery; struct inquiry_entry *p, *n; list_for_each_entry_safe(p, n, &cache->all, all) { list_del(&p->all); kfree(p); } INIT_LIST_HEAD(&cache->unknown); INIT_LIST_HEAD(&cache->resolve); } struct inquiry_entry *hci_inquiry_cache_lookup(struct hci_dev *hdev, bdaddr_t *bdaddr) { struct discovery_state *cache = &hdev->discovery; struct inquiry_entry *e; BT_DBG("cache %p, %pMR", cache, bdaddr); list_for_each_entry(e, &cache->all, all) { if (!bacmp(&e->data.bdaddr, bdaddr)) return e; } return NULL; } struct inquiry_entry *hci_inquiry_cache_lookup_unknown(struct hci_dev *hdev, bdaddr_t *bdaddr) { struct discovery_state *cache = &hdev->discovery; struct inquiry_entry *e; BT_DBG("cache %p, %pMR", cache, bdaddr); list_for_each_entry(e, &cache->unknown, list) { if (!bacmp(&e->data.bdaddr, bdaddr)) return e; } return NULL; } struct inquiry_entry *hci_inquiry_cache_lookup_resolve(struct hci_dev *hdev, bdaddr_t *bdaddr, int state) { struct discovery_state *cache = &hdev->discovery; struct inquiry_entry *e; BT_DBG("cache %p bdaddr %pMR state %d", cache, bdaddr, state); list_for_each_entry(e, &cache->resolve, list) { if (!bacmp(bdaddr, BDADDR_ANY) && e->name_state == state) return e; if (!bacmp(&e->data.bdaddr, bdaddr)) return e; } return NULL; } void hci_inquiry_cache_update_resolve(struct hci_dev *hdev, struct inquiry_entry *ie) { struct discovery_state *cache = &hdev->discovery; struct list_head *pos = &cache->resolve; struct inquiry_entry *p; 
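	/* Keep the resolve list sorted by ascending |RSSI| (strongest
	 * signal first) so names are resolved for the best candidates
	 * first; entries whose resolution is already pending are skipped
	 * in the comparison.  Unlink, find the spot, re-link.
	 */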
list_del(&ie->list); list_for_each_entry(p, &cache->resolve, list) { if (p->name_state != NAME_PENDING && abs(p->data.rssi) >= abs(ie->data.rssi)) break; pos = &p->list; } list_add(&ie->list, pos); } u32 hci_inquiry_cache_update(struct hci_dev *hdev, struct inquiry_data *data, bool name_known) { struct discovery_state *cache = &hdev->discovery; struct inquiry_entry *ie; u32 flags = 0; BT_DBG("cache %p, %pMR", cache, &data->bdaddr); hci_remove_remote_oob_data(hdev, &data->bdaddr, BDADDR_BREDR); if (!data->ssp_mode) flags |= MGMT_DEV_FOUND_LEGACY_PAIRING; ie = hci_inquiry_cache_lookup(hdev, &data->bdaddr); if (ie) { if (!ie->data.ssp_mode) flags |= MGMT_DEV_FOUND_LEGACY_PAIRING; if (ie->name_state == NAME_NEEDED && data->rssi != ie->data.rssi) { ie->data.rssi = data->rssi; hci_inquiry_cache_update_resolve(hdev, ie); } goto update; } /* Entry not in the cache. Add new one. */ ie = kzalloc(sizeof(*ie), GFP_KERNEL); if (!ie) { flags |= MGMT_DEV_FOUND_CONFIRM_NAME; goto done; } list_add(&ie->all, &cache->all); if (name_known) { ie->name_state = NAME_KNOWN; } else { ie->name_state = NAME_NOT_KNOWN; list_add(&ie->list, &cache->unknown); } update: if (name_known && ie->name_state != NAME_KNOWN && ie->name_state != NAME_PENDING) { ie->name_state = NAME_KNOWN; list_del(&ie->list); } memcpy(&ie->data, data, sizeof(*data)); ie->timestamp = jiffies; cache->timestamp = jiffies; if (ie->name_state == NAME_NOT_KNOWN) flags |= MGMT_DEV_FOUND_CONFIRM_NAME; done: return flags; } static int inquiry_cache_dump(struct hci_dev *hdev, int num, __u8 *buf) { struct discovery_state *cache = &hdev->discovery; struct inquiry_info *info = (struct inquiry_info *) buf; struct inquiry_entry *e; int copied = 0; list_for_each_entry(e, &cache->all, all) { struct inquiry_data *data = &e->data; if (copied >= num) break; bacpy(&info->bdaddr, &data->bdaddr); info->pscan_rep_mode = data->pscan_rep_mode; info->pscan_period_mode = data->pscan_period_mode; info->pscan_mode = data->pscan_mode; memcpy(info->dev_class, data->dev_class, 3); info->clock_offset = data->clock_offset; info++; copied++; } BT_DBG("cache %p, copied %d", cache, copied); return copied; } int hci_inquiry(void __user *arg) { __u8 __user *ptr = arg; struct hci_inquiry_req ir; struct hci_dev *hdev; int err = 0, do_inquiry = 0, max_rsp; __u8 *buf; if (copy_from_user(&ir, ptr, sizeof(ir))) return -EFAULT; hdev = hci_dev_get(ir.dev_id); if (!hdev) return -ENODEV; if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) { err = -EBUSY; goto done; } if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) { err = -EOPNOTSUPP; goto done; } if (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED)) { err = -EOPNOTSUPP; goto done; } /* Restrict maximum inquiry length to 60 seconds */ if (ir.length > 60) { err = -EINVAL; goto done; } hci_dev_lock(hdev); if (inquiry_cache_age(hdev) > INQUIRY_CACHE_AGE_MAX || inquiry_cache_empty(hdev) || ir.flags & IREQ_CACHE_FLUSH) { hci_inquiry_cache_flush(hdev); do_inquiry = 1; } hci_dev_unlock(hdev); if (do_inquiry) { hci_req_sync_lock(hdev); err = hci_inquiry_sync(hdev, ir.length, ir.num_rsp); hci_req_sync_unlock(hdev); if (err < 0) goto done; /* Wait until Inquiry procedure finishes (HCI_INQUIRY flag is * cleared). If it is interrupted by a signal, return -EINTR. */ if (wait_on_bit(&hdev->flags, HCI_INQUIRY, TASK_INTERRUPTIBLE)) { err = -EINTR; goto done; } } /* for unlimited number of responses we will use buffer with * 255 entries */ max_rsp = (ir.num_rsp == 0) ? 255 : ir.num_rsp; /* cache_dump can't sleep. 
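 * (hci_dev_lock() is held across the dump and copy_to_user() can fault.)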
Therefore we allocate temp buffer and then * copy it to the user space. */ buf = kmalloc_array(max_rsp, sizeof(struct inquiry_info), GFP_KERNEL); if (!buf) { err = -ENOMEM; goto done; } hci_dev_lock(hdev); ir.num_rsp = inquiry_cache_dump(hdev, max_rsp, buf); hci_dev_unlock(hdev); BT_DBG("num_rsp %d", ir.num_rsp); if (!copy_to_user(ptr, &ir, sizeof(ir))) { ptr += sizeof(ir); if (copy_to_user(ptr, buf, sizeof(struct inquiry_info) * ir.num_rsp)) err = -EFAULT; } else err = -EFAULT; kfree(buf); done: hci_dev_put(hdev); return err; } static int hci_dev_do_open(struct hci_dev *hdev) { int ret = 0; BT_DBG("%s %p", hdev->name, hdev); hci_req_sync_lock(hdev); ret = hci_dev_open_sync(hdev); hci_req_sync_unlock(hdev); return ret; } /* ---- HCI ioctl helpers ---- */ int hci_dev_open(__u16 dev) { struct hci_dev *hdev; int err; hdev = hci_dev_get(dev); if (!hdev) return -ENODEV; /* Devices that are marked as unconfigured can only be powered * up as user channel. Trying to bring them up as normal devices * will result into a failure. Only user channel operation is * possible. * * When this function is called for a user channel, the flag * HCI_USER_CHANNEL will be set first before attempting to * open the device. */ if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED) && !hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) { err = -EOPNOTSUPP; goto done; } /* We need to ensure that no other power on/off work is pending * before proceeding to call hci_dev_do_open. This is * particularly important if the setup procedure has not yet * completed. */ if (hci_dev_test_and_clear_flag(hdev, HCI_AUTO_OFF)) cancel_delayed_work(&hdev->power_off); /* After this call it is guaranteed that the setup procedure * has finished. This means that error conditions like RFKILL * or no valid public or static random address apply. */ flush_workqueue(hdev->req_workqueue); /* For controllers not using the management interface and that * are brought up using legacy ioctl, set the HCI_BONDABLE bit * so that pairing works for them. Once the management interface * is in use this bit will be cleared again and userspace has * to explicitly enable it. */ if (!hci_dev_test_flag(hdev, HCI_USER_CHANNEL) && !hci_dev_test_flag(hdev, HCI_MGMT)) hci_dev_set_flag(hdev, HCI_BONDABLE); err = hci_dev_do_open(hdev); done: hci_dev_put(hdev); return err; } int hci_dev_do_close(struct hci_dev *hdev) { int err; BT_DBG("%s %p", hdev->name, hdev); hci_req_sync_lock(hdev); err = hci_dev_close_sync(hdev); hci_req_sync_unlock(hdev); return err; } int hci_dev_close(__u16 dev) { struct hci_dev *hdev; int err; hdev = hci_dev_get(dev); if (!hdev) return -ENODEV; if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) { err = -EBUSY; goto done; } cancel_work_sync(&hdev->power_on); if (hci_dev_test_and_clear_flag(hdev, HCI_AUTO_OFF)) cancel_delayed_work(&hdev->power_off); err = hci_dev_do_close(hdev); done: hci_dev_put(hdev); return err; } static int hci_dev_do_reset(struct hci_dev *hdev) { int ret; BT_DBG("%s %p", hdev->name, hdev); hci_req_sync_lock(hdev); /* Drop queues */ skb_queue_purge(&hdev->rx_q); skb_queue_purge(&hdev->cmd_q); /* Cancel these to avoid queueing non-chained pending work */ hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE); /* Wait for * * if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE)) * queue_delayed_work(&hdev->{cmd,ncmd}_timer) * * inside RCU section to see the flag or complete scheduling. */ synchronize_rcu(); /* Explicitly cancel works in case scheduled after setting the flag. 
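 * (The ordering is flag -> synchronize_rcu() -> cancel: any timer
 * handler that raced past the flag test has finished queueing by the
 * time the cancellations below run.)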
*/ cancel_delayed_work(&hdev->cmd_timer); cancel_delayed_work(&hdev->ncmd_timer); /* Avoid potential lockdep warnings from the *_flush() calls by * ensuring the workqueue is empty up front. */ drain_workqueue(hdev->workqueue); hci_dev_lock(hdev); hci_inquiry_cache_flush(hdev); hci_conn_hash_flush(hdev); hci_dev_unlock(hdev); if (hdev->flush) hdev->flush(hdev); hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE); atomic_set(&hdev->cmd_cnt, 1); hdev->acl_cnt = 0; hdev->sco_cnt = 0; hdev->le_cnt = 0; hdev->iso_cnt = 0; ret = hci_reset_sync(hdev); hci_req_sync_unlock(hdev); return ret; } int hci_dev_reset(__u16 dev) { struct hci_dev *hdev; int err; hdev = hci_dev_get(dev); if (!hdev) return -ENODEV; if (!test_bit(HCI_UP, &hdev->flags)) { err = -ENETDOWN; goto done; } if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) { err = -EBUSY; goto done; } if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) { err = -EOPNOTSUPP; goto done; } err = hci_dev_do_reset(hdev); done: hci_dev_put(hdev); return err; } int hci_dev_reset_stat(__u16 dev) { struct hci_dev *hdev; int ret = 0; hdev = hci_dev_get(dev); if (!hdev) return -ENODEV; if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) { ret = -EBUSY; goto done; } if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) { ret = -EOPNOTSUPP; goto done; } memset(&hdev->stat, 0, sizeof(struct hci_dev_stats)); done: hci_dev_put(hdev); return ret; } static void hci_update_passive_scan_state(struct hci_dev *hdev, u8 scan) { bool conn_changed, discov_changed; BT_DBG("%s scan 0x%02x", hdev->name, scan); if ((scan & SCAN_PAGE)) conn_changed = !hci_dev_test_and_set_flag(hdev, HCI_CONNECTABLE); else conn_changed = hci_dev_test_and_clear_flag(hdev, HCI_CONNECTABLE); if ((scan & SCAN_INQUIRY)) { discov_changed = !hci_dev_test_and_set_flag(hdev, HCI_DISCOVERABLE); } else { hci_dev_clear_flag(hdev, HCI_LIMITED_DISCOVERABLE); discov_changed = hci_dev_test_and_clear_flag(hdev, HCI_DISCOVERABLE); } if (!hci_dev_test_flag(hdev, HCI_MGMT)) return; if (conn_changed || discov_changed) { /* In case this was disabled through mgmt */ hci_dev_set_flag(hdev, HCI_BREDR_ENABLED); if (hci_dev_test_flag(hdev, HCI_LE_ENABLED)) hci_update_adv_data(hdev, hdev->cur_adv_instance); mgmt_new_settings(hdev); } } int hci_dev_cmd(unsigned int cmd, void __user *arg) { struct hci_dev *hdev; struct hci_dev_req dr; __le16 policy; int err = 0; if (copy_from_user(&dr, arg, sizeof(dr))) return -EFAULT; hdev = hci_dev_get(dr.dev_id); if (!hdev) return -ENODEV; if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) { err = -EBUSY; goto done; } if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) { err = -EOPNOTSUPP; goto done; } if (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED)) { err = -EOPNOTSUPP; goto done; } switch (cmd) { case HCISETAUTH: err = hci_cmd_sync_status(hdev, HCI_OP_WRITE_AUTH_ENABLE, 1, &dr.dev_opt, HCI_CMD_TIMEOUT); break; case HCISETENCRYPT: if (!lmp_encrypt_capable(hdev)) { err = -EOPNOTSUPP; break; } if (!test_bit(HCI_AUTH, &hdev->flags)) { /* Auth must be enabled first */ err = hci_cmd_sync_status(hdev, HCI_OP_WRITE_AUTH_ENABLE, 1, &dr.dev_opt, HCI_CMD_TIMEOUT); if (err) break; } err = hci_cmd_sync_status(hdev, HCI_OP_WRITE_ENCRYPT_MODE, 1, &dr.dev_opt, HCI_CMD_TIMEOUT); break; case HCISETSCAN: err = hci_cmd_sync_status(hdev, HCI_OP_WRITE_SCAN_ENABLE, 1, &dr.dev_opt, HCI_CMD_TIMEOUT); /* Ensure that the connectable and discoverable states * get correctly modified as this was a non-mgmt change. 
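 * (hci_update_passive_scan_state() maps SCAN_PAGE onto the
 * HCI_CONNECTABLE flag and SCAN_INQUIRY onto HCI_DISCOVERABLE, then
 * emits mgmt_new_settings() so management clients observe the
 * ioctl-driven change.)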
*/ if (!err) hci_update_passive_scan_state(hdev, dr.dev_opt); break; case HCISETLINKPOL: policy = cpu_to_le16(dr.dev_opt); err = hci_cmd_sync_status(hdev, HCI_OP_WRITE_DEF_LINK_POLICY, 2, &policy, HCI_CMD_TIMEOUT); break; case HCISETLINKMODE: hdev->link_mode = ((__u16) dr.dev_opt) & (HCI_LM_MASTER | HCI_LM_ACCEPT); break; case HCISETPTYPE: if (hdev->pkt_type == (__u16) dr.dev_opt) break; hdev->pkt_type = (__u16) dr.dev_opt; mgmt_phy_configuration_changed(hdev, NULL); break; case HCISETACLMTU: hdev->acl_mtu = *((__u16 *) &dr.dev_opt + 1); hdev->acl_pkts = *((__u16 *) &dr.dev_opt + 0); break; case HCISETSCOMTU: hdev->sco_mtu = *((__u16 *) &dr.dev_opt + 1); hdev->sco_pkts = *((__u16 *) &dr.dev_opt + 0); break; default: err = -EINVAL; break; } done: hci_dev_put(hdev); return err; } int hci_get_dev_list(void __user *arg) { struct hci_dev *hdev; struct hci_dev_list_req *dl; struct hci_dev_req *dr; int n = 0, err; __u16 dev_num; if (get_user(dev_num, (__u16 __user *) arg)) return -EFAULT; if (!dev_num || dev_num > (PAGE_SIZE * 2) / sizeof(*dr)) return -EINVAL; dl = kzalloc(struct_size(dl, dev_req, dev_num), GFP_KERNEL); if (!dl) return -ENOMEM; dl->dev_num = dev_num; dr = dl->dev_req; read_lock(&hci_dev_list_lock); list_for_each_entry(hdev, &hci_dev_list, list) { unsigned long flags = hdev->flags; /* When the auto-off is configured it means the transport * is running, but in that case still indicate that the * device is actually down. */ if (hci_dev_test_flag(hdev, HCI_AUTO_OFF)) flags &= ~BIT(HCI_UP); dr[n].dev_id = hdev->id; dr[n].dev_opt = flags; if (++n >= dev_num) break; } read_unlock(&hci_dev_list_lock); dl->dev_num = n; err = copy_to_user(arg, dl, struct_size(dl, dev_req, n)); kfree(dl); return err ? -EFAULT : 0; } int hci_get_dev_info(void __user *arg) { struct hci_dev *hdev; struct hci_dev_info di; unsigned long flags; int err = 0; if (copy_from_user(&di, arg, sizeof(di))) return -EFAULT; hdev = hci_dev_get(di.dev_id); if (!hdev) return -ENODEV; /* When the auto-off is configured it means the transport * is running, but in that case still indicate that the * device is actually down. 
*/ if (hci_dev_test_flag(hdev, HCI_AUTO_OFF)) flags = hdev->flags & ~BIT(HCI_UP); else flags = hdev->flags; strscpy(di.name, hdev->name, sizeof(di.name)); di.bdaddr = hdev->bdaddr; di.type = (hdev->bus & 0x0f); di.flags = flags; di.pkt_type = hdev->pkt_type; if (lmp_bredr_capable(hdev)) { di.acl_mtu = hdev->acl_mtu; di.acl_pkts = hdev->acl_pkts; di.sco_mtu = hdev->sco_mtu; di.sco_pkts = hdev->sco_pkts; } else { di.acl_mtu = hdev->le_mtu; di.acl_pkts = hdev->le_pkts; di.sco_mtu = 0; di.sco_pkts = 0; } di.link_policy = hdev->link_policy; di.link_mode = hdev->link_mode; memcpy(&di.stat, &hdev->stat, sizeof(di.stat)); memcpy(&di.features, &hdev->features, sizeof(di.features)); if (copy_to_user(arg, &di, sizeof(di))) err = -EFAULT; hci_dev_put(hdev); return err; } /* ---- Interface to HCI drivers ---- */ static int hci_dev_do_poweroff(struct hci_dev *hdev) { int err; BT_DBG("%s %p", hdev->name, hdev); hci_req_sync_lock(hdev); err = hci_set_powered_sync(hdev, false); hci_req_sync_unlock(hdev); return err; } static int hci_rfkill_set_block(void *data, bool blocked) { struct hci_dev *hdev = data; int err; BT_DBG("%p name %s blocked %d", hdev, hdev->name, blocked); if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) return -EBUSY; if (blocked == hci_dev_test_flag(hdev, HCI_RFKILLED)) return 0; if (blocked) { hci_dev_set_flag(hdev, HCI_RFKILLED); if (!hci_dev_test_flag(hdev, HCI_SETUP) && !hci_dev_test_flag(hdev, HCI_CONFIG)) { err = hci_dev_do_poweroff(hdev); if (err) { bt_dev_err(hdev, "Error when powering off device on rfkill (%d)", err); /* Make sure the device is still closed even if * anything during power off sequence (eg. * disconnecting devices) failed. */ hci_dev_do_close(hdev); } } } else { hci_dev_clear_flag(hdev, HCI_RFKILLED); } return 0; } static const struct rfkill_ops hci_rfkill_ops = { .set_block = hci_rfkill_set_block, }; static void hci_power_on(struct work_struct *work) { struct hci_dev *hdev = container_of(work, struct hci_dev, power_on); int err; BT_DBG("%s", hdev->name); if (test_bit(HCI_UP, &hdev->flags) && hci_dev_test_flag(hdev, HCI_MGMT) && hci_dev_test_and_clear_flag(hdev, HCI_AUTO_OFF)) { cancel_delayed_work(&hdev->power_off); err = hci_powered_update_sync(hdev); mgmt_power_on(hdev, err); return; } err = hci_dev_do_open(hdev); if (err < 0) { hci_dev_lock(hdev); mgmt_set_powered_failed(hdev, err); hci_dev_unlock(hdev); return; } /* During the HCI setup phase, a few error conditions are * ignored and they need to be checked now. If they are still * valid, it is important to turn the device back off. */ if (hci_dev_test_flag(hdev, HCI_RFKILLED) || hci_dev_test_flag(hdev, HCI_UNCONFIGURED) || (!bacmp(&hdev->bdaddr, BDADDR_ANY) && !bacmp(&hdev->static_addr, BDADDR_ANY))) { hci_dev_clear_flag(hdev, HCI_AUTO_OFF); hci_dev_do_close(hdev); } else if (hci_dev_test_flag(hdev, HCI_AUTO_OFF)) { queue_delayed_work(hdev->req_workqueue, &hdev->power_off, HCI_AUTO_OFF_TIMEOUT); } if (hci_dev_test_and_clear_flag(hdev, HCI_SETUP)) { /* For unconfigured devices, set the HCI_RAW flag * so that userspace can easily identify them. */ if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) set_bit(HCI_RAW, &hdev->flags); /* For fully configured devices, this will send * the Index Added event. For unconfigured devices, * it will send Unconfigued Index Added event. * * Devices with HCI_QUIRK_RAW_DEVICE are ignored * and no event will be send. 
*/ mgmt_index_added(hdev); } else if (hci_dev_test_and_clear_flag(hdev, HCI_CONFIG)) { /* When the controller is now configured, then it * is important to clear the HCI_RAW flag. */ if (!hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) clear_bit(HCI_RAW, &hdev->flags); /* Powering on the controller with HCI_CONFIG set only * happens with the transition from unconfigured to * configured. This will send the Index Added event. */ mgmt_index_added(hdev); } } static void hci_power_off(struct work_struct *work) { struct hci_dev *hdev = container_of(work, struct hci_dev, power_off.work); BT_DBG("%s", hdev->name); hci_dev_do_close(hdev); } static void hci_error_reset(struct work_struct *work) { struct hci_dev *hdev = container_of(work, struct hci_dev, error_reset); hci_dev_hold(hdev); BT_DBG("%s", hdev->name); if (hdev->hw_error) hdev->hw_error(hdev, hdev->hw_error_code); else bt_dev_err(hdev, "hardware error 0x%2.2x", hdev->hw_error_code); if (!hci_dev_do_close(hdev)) hci_dev_do_open(hdev); hci_dev_put(hdev); } void hci_uuids_clear(struct hci_dev *hdev) { struct bt_uuid *uuid, *tmp; list_for_each_entry_safe(uuid, tmp, &hdev->uuids, list) { list_del(&uuid->list); kfree(uuid); } } void hci_link_keys_clear(struct hci_dev *hdev) { struct link_key *key, *tmp; list_for_each_entry_safe(key, tmp, &hdev->link_keys, list) { list_del_rcu(&key->list); kfree_rcu(key, rcu); } } void hci_smp_ltks_clear(struct hci_dev *hdev) { struct smp_ltk *k, *tmp; list_for_each_entry_safe(k, tmp, &hdev->long_term_keys, list) { list_del_rcu(&k->list); kfree_rcu(k, rcu); } } void hci_smp_irks_clear(struct hci_dev *hdev) { struct smp_irk *k, *tmp; list_for_each_entry_safe(k, tmp, &hdev->identity_resolving_keys, list) { list_del_rcu(&k->list); kfree_rcu(k, rcu); } } void hci_blocked_keys_clear(struct hci_dev *hdev) { struct blocked_key *b, *tmp; list_for_each_entry_safe(b, tmp, &hdev->blocked_keys, list) { list_del_rcu(&b->list); kfree_rcu(b, rcu); } } bool hci_is_blocked_key(struct hci_dev *hdev, u8 type, u8 val[16]) { bool blocked = false; struct blocked_key *b; rcu_read_lock(); list_for_each_entry_rcu(b, &hdev->blocked_keys, list) { if (b->type == type && !memcmp(b->val, val, sizeof(b->val))) { blocked = true; break; } } rcu_read_unlock(); return blocked; } struct link_key *hci_find_link_key(struct hci_dev *hdev, bdaddr_t *bdaddr) { struct link_key *k; rcu_read_lock(); list_for_each_entry_rcu(k, &hdev->link_keys, list) { if (bacmp(bdaddr, &k->bdaddr) == 0) { rcu_read_unlock(); if (hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_LINKKEY, k->val)) { bt_dev_warn_ratelimited(hdev, "Link key blocked for %pMR", &k->bdaddr); return NULL; } return k; } } rcu_read_unlock(); return NULL; } static bool hci_persistent_key(struct hci_dev *hdev, struct hci_conn *conn, u8 key_type, u8 old_key_type) { /* Legacy key */ if (key_type < 0x03) return true; /* Debug keys are insecure so don't store them persistently */ if (key_type == HCI_LK_DEBUG_COMBINATION) return false; /* Changed combination key and there's no previous one */ if (key_type == HCI_LK_CHANGED_COMBINATION && old_key_type == 0xff) return false; /* Security mode 3 case */ if (!conn) return true; /* BR/EDR key derived using SC from an LE link */ if (conn->type == LE_LINK) return true; /* Neither local nor remote side had no-bonding as requirement */ if (conn->auth_type > 0x01 && conn->remote_auth > 0x01) return true; /* Local side had dedicated bonding as requirement */ if (conn->auth_type == 0x02 || conn->auth_type == 0x03) return true; /* Remote side had dedicated bonding as requirement 
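 * (In the HCI authentication-requirements encoding tested here,
 * 0x00/0x01 mean no bonding, 0x02/0x03 dedicated bonding and
 * 0x04/0x05 general bonding; odd values additionally request MITM
 * protection.)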
*/ if (conn->remote_auth == 0x02 || conn->remote_auth == 0x03) return true; /* If none of the above criteria match, then don't store the key * persistently */ return false; } static u8 ltk_role(u8 type) { if (type == SMP_LTK) return HCI_ROLE_MASTER; return HCI_ROLE_SLAVE; } struct smp_ltk *hci_find_ltk(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type, u8 role) { struct smp_ltk *k; rcu_read_lock(); list_for_each_entry_rcu(k, &hdev->long_term_keys, list) { if (addr_type != k->bdaddr_type || bacmp(bdaddr, &k->bdaddr)) continue; if (smp_ltk_is_sc(k) || ltk_role(k->type) == role) { rcu_read_unlock(); if (hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_LTK, k->val)) { bt_dev_warn_ratelimited(hdev, "LTK blocked for %pMR", &k->bdaddr); return NULL; } return k; } } rcu_read_unlock(); return NULL; } struct smp_irk *hci_find_irk_by_rpa(struct hci_dev *hdev, bdaddr_t *rpa) { struct smp_irk *irk_to_return = NULL; struct smp_irk *irk; rcu_read_lock(); list_for_each_entry_rcu(irk, &hdev->identity_resolving_keys, list) { if (!bacmp(&irk->rpa, rpa)) { irk_to_return = irk; goto done; } } list_for_each_entry_rcu(irk, &hdev->identity_resolving_keys, list) { if (smp_irk_matches(hdev, irk->val, rpa)) { bacpy(&irk->rpa, rpa); irk_to_return = irk; goto done; } } done: if (irk_to_return && hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_IRK, irk_to_return->val)) { bt_dev_warn_ratelimited(hdev, "Identity key blocked for %pMR", &irk_to_return->bdaddr); irk_to_return = NULL; } rcu_read_unlock(); return irk_to_return; } struct smp_irk *hci_find_irk_by_addr(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type) { struct smp_irk *irk_to_return = NULL; struct smp_irk *irk; /* Identity Address must be public or static random */ if (addr_type == ADDR_LE_DEV_RANDOM && (bdaddr->b[5] & 0xc0) != 0xc0) return NULL; rcu_read_lock(); list_for_each_entry_rcu(irk, &hdev->identity_resolving_keys, list) { if (addr_type == irk->addr_type && bacmp(bdaddr, &irk->bdaddr) == 0) { irk_to_return = irk; goto done; } } done: if (irk_to_return && hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_IRK, irk_to_return->val)) { bt_dev_warn_ratelimited(hdev, "Identity key blocked for %pMR", &irk_to_return->bdaddr); irk_to_return = NULL; } rcu_read_unlock(); return irk_to_return; } struct link_key *hci_add_link_key(struct hci_dev *hdev, struct hci_conn *conn, bdaddr_t *bdaddr, u8 *val, u8 type, u8 pin_len, bool *persistent) { struct link_key *key, *old_key; u8 old_key_type; old_key = hci_find_link_key(hdev, bdaddr); if (old_key) { old_key_type = old_key->type; key = old_key; } else { old_key_type = conn ? 
conn->key_type : 0xff; key = kzalloc(sizeof(*key), GFP_KERNEL); if (!key) return NULL; list_add_rcu(&key->list, &hdev->link_keys); } BT_DBG("%s key for %pMR type %u", hdev->name, bdaddr, type); /* Some buggy controller combinations generate a changed * combination key for legacy pairing even when there's no * previous key */ if (type == HCI_LK_CHANGED_COMBINATION && (!conn || conn->remote_auth == 0xff) && old_key_type == 0xff) { type = HCI_LK_COMBINATION; if (conn) conn->key_type = type; } bacpy(&key->bdaddr, bdaddr); memcpy(key->val, val, HCI_LINK_KEY_SIZE); key->pin_len = pin_len; if (type == HCI_LK_CHANGED_COMBINATION) key->type = old_key_type; else key->type = type; if (persistent) *persistent = hci_persistent_key(hdev, conn, type, old_key_type); return key; } struct smp_ltk *hci_add_ltk(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type, u8 type, u8 authenticated, u8 tk[16], u8 enc_size, __le16 ediv, __le64 rand) { struct smp_ltk *key, *old_key; u8 role = ltk_role(type); old_key = hci_find_ltk(hdev, bdaddr, addr_type, role); if (old_key) key = old_key; else { key = kzalloc(sizeof(*key), GFP_KERNEL); if (!key) return NULL; list_add_rcu(&key->list, &hdev->long_term_keys); } bacpy(&key->bdaddr, bdaddr); key->bdaddr_type = addr_type; memcpy(key->val, tk, sizeof(key->val)); key->authenticated = authenticated; key->ediv = ediv; key->rand = rand; key->enc_size = enc_size; key->type = type; return key; } struct smp_irk *hci_add_irk(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type, u8 val[16], bdaddr_t *rpa) { struct smp_irk *irk; irk = hci_find_irk_by_addr(hdev, bdaddr, addr_type); if (!irk) { irk = kzalloc(sizeof(*irk), GFP_KERNEL); if (!irk) return NULL; bacpy(&irk->bdaddr, bdaddr); irk->addr_type = addr_type; list_add_rcu(&irk->list, &hdev->identity_resolving_keys); } memcpy(irk->val, val, 16); bacpy(&irk->rpa, rpa); return irk; } int hci_remove_link_key(struct hci_dev *hdev, bdaddr_t *bdaddr) { struct link_key *key; key = hci_find_link_key(hdev, bdaddr); if (!key) return -ENOENT; BT_DBG("%s removing %pMR", hdev->name, bdaddr); list_del_rcu(&key->list); kfree_rcu(key, rcu); return 0; } int hci_remove_ltk(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 bdaddr_type) { struct smp_ltk *k, *tmp; int removed = 0; list_for_each_entry_safe(k, tmp, &hdev->long_term_keys, list) { if (bacmp(bdaddr, &k->bdaddr) || k->bdaddr_type != bdaddr_type) continue; BT_DBG("%s removing %pMR", hdev->name, bdaddr); list_del_rcu(&k->list); kfree_rcu(k, rcu); removed++; } return removed ? 
0 : -ENOENT; } void hci_remove_irk(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type) { struct smp_irk *k, *tmp; list_for_each_entry_safe(k, tmp, &hdev->identity_resolving_keys, list) { if (bacmp(bdaddr, &k->bdaddr) || k->addr_type != addr_type) continue; BT_DBG("%s removing %pMR", hdev->name, bdaddr); list_del_rcu(&k->list); kfree_rcu(k, rcu); } } bool hci_bdaddr_is_paired(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 type) { struct smp_ltk *k; struct smp_irk *irk; u8 addr_type; if (type == BDADDR_BREDR) { if (hci_find_link_key(hdev, bdaddr)) return true; return false; } /* Convert to HCI addr type which struct smp_ltk uses */ if (type == BDADDR_LE_PUBLIC) addr_type = ADDR_LE_DEV_PUBLIC; else addr_type = ADDR_LE_DEV_RANDOM; irk = hci_get_irk(hdev, bdaddr, addr_type); if (irk) { bdaddr = &irk->bdaddr; addr_type = irk->addr_type; } rcu_read_lock(); list_for_each_entry_rcu(k, &hdev->long_term_keys, list) { if (k->bdaddr_type == addr_type && !bacmp(bdaddr, &k->bdaddr)) { rcu_read_unlock(); return true; } } rcu_read_unlock(); return false; } /* HCI command timer function */ static void hci_cmd_timeout(struct work_struct *work) { struct hci_dev *hdev = container_of(work, struct hci_dev, cmd_timer.work); if (hdev->req_skb) { u16 opcode = hci_skb_opcode(hdev->req_skb); bt_dev_err(hdev, "command 0x%4.4x tx timeout", opcode); hci_cmd_sync_cancel_sync(hdev, ETIMEDOUT); } else { bt_dev_err(hdev, "command tx timeout"); } if (hdev->reset) hdev->reset(hdev); atomic_set(&hdev->cmd_cnt, 1); queue_work(hdev->workqueue, &hdev->cmd_work); } /* HCI ncmd timer function */ static void hci_ncmd_timeout(struct work_struct *work) { struct hci_dev *hdev = container_of(work, struct hci_dev, ncmd_timer.work); bt_dev_err(hdev, "Controller not accepting commands anymore: ncmd = 0"); /* During HCI_INIT phase no events can be injected if the ncmd timer * triggers since the procedure has its own timeout handling. 
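 * (hci_reset_dev() synthesizes a hardware-error event and feeds it
 * back through the receive path so recovery follows the normal
 * error-reset flow.)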

struct oob_data *hci_find_remote_oob_data(struct hci_dev *hdev,
					  bdaddr_t *bdaddr, u8 bdaddr_type)
{
	struct oob_data *data;

	list_for_each_entry(data, &hdev->remote_oob_data, list) {
		if (bacmp(bdaddr, &data->bdaddr) != 0)
			continue;
		if (data->bdaddr_type != bdaddr_type)
			continue;
		return data;
	}

	return NULL;
}

int hci_remove_remote_oob_data(struct hci_dev *hdev, bdaddr_t *bdaddr,
			       u8 bdaddr_type)
{
	struct oob_data *data;

	data = hci_find_remote_oob_data(hdev, bdaddr, bdaddr_type);
	if (!data)
		return -ENOENT;

	BT_DBG("%s removing %pMR (%u)", hdev->name, bdaddr, bdaddr_type);

	list_del(&data->list);
	kfree(data);

	return 0;
}

void hci_remote_oob_data_clear(struct hci_dev *hdev)
{
	struct oob_data *data, *n;

	list_for_each_entry_safe(data, n, &hdev->remote_oob_data, list) {
		list_del(&data->list);
		kfree(data);
	}
}

int hci_add_remote_oob_data(struct hci_dev *hdev, bdaddr_t *bdaddr,
			    u8 bdaddr_type, u8 *hash192, u8 *rand192,
			    u8 *hash256, u8 *rand256)
{
	struct oob_data *data;

	data = hci_find_remote_oob_data(hdev, bdaddr, bdaddr_type);
	if (!data) {
		data = kmalloc(sizeof(*data), GFP_KERNEL);
		if (!data)
			return -ENOMEM;

		bacpy(&data->bdaddr, bdaddr);
		data->bdaddr_type = bdaddr_type;

		list_add(&data->list, &hdev->remote_oob_data);
	}

	if (hash192 && rand192) {
		memcpy(data->hash192, hash192, sizeof(data->hash192));
		memcpy(data->rand192, rand192, sizeof(data->rand192));
		if (hash256 && rand256)
			data->present = 0x03;
	} else {
		memset(data->hash192, 0, sizeof(data->hash192));
		memset(data->rand192, 0, sizeof(data->rand192));
		if (hash256 && rand256)
			data->present = 0x02;
		else
			data->present = 0x00;
	}

	if (hash256 && rand256) {
		memcpy(data->hash256, hash256, sizeof(data->hash256));
		memcpy(data->rand256, rand256, sizeof(data->rand256));
	} else {
		memset(data->hash256, 0, sizeof(data->hash256));
		memset(data->rand256, 0, sizeof(data->rand256));
		if (hash192 && rand192)
			data->present = 0x01;
	}

	BT_DBG("%s for %pMR", hdev->name, bdaddr);

	return 0;
}

/* This function requires the caller holds hdev->lock */
struct adv_info *hci_find_adv_instance(struct hci_dev *hdev, u8 instance)
{
	struct adv_info *adv_instance;

	list_for_each_entry(adv_instance, &hdev->adv_instances, list) {
		if (adv_instance->instance == instance)
			return adv_instance;
	}

	return NULL;
}

/* This function requires the caller holds hdev->lock */
struct adv_info *hci_get_next_instance(struct hci_dev *hdev, u8 instance)
{
	struct adv_info *cur_instance;

	cur_instance = hci_find_adv_instance(hdev, instance);
	if (!cur_instance)
		return NULL;

	if (cur_instance == list_last_entry(&hdev->adv_instances,
					    struct adv_info, list))
		return list_first_entry(&hdev->adv_instances,
					struct adv_info, list);
	else
		return list_next_entry(cur_instance, list);
}

/* This function requires the caller holds hdev->lock */
int hci_remove_adv_instance(struct hci_dev *hdev, u8 instance)
{
	struct adv_info *adv_instance;

	adv_instance = hci_find_adv_instance(hdev, instance);
	if (!adv_instance)
		return -ENOENT;

	BT_DBG("%s removing %d", hdev->name, instance);

	if (hdev->cur_adv_instance == instance) {
		if (hdev->adv_instance_timeout) {
			cancel_delayed_work(&hdev->adv_instance_expire);
			hdev->adv_instance_timeout = 0;
		}
		hdev->cur_adv_instance = 0x00;
	}

	cancel_delayed_work_sync(&adv_instance->rpa_expired_cb);

	list_del(&adv_instance->list);
	kfree(adv_instance);

	hdev->adv_instance_cnt--;

	return 0;
}

void hci_adv_instances_set_rpa_expired(struct hci_dev *hdev, bool rpa_expired)
{
	struct adv_info *adv_instance, *n;

	list_for_each_entry_safe(adv_instance, n, &hdev->adv_instances, list)
		adv_instance->rpa_expired = rpa_expired;
}

/* This function requires the caller holds hdev->lock */
void hci_adv_instances_clear(struct hci_dev *hdev)
{
	struct adv_info *adv_instance, *n;

	if (hdev->adv_instance_timeout) {
		disable_delayed_work(&hdev->adv_instance_expire);
		hdev->adv_instance_timeout = 0;
	}

	list_for_each_entry_safe(adv_instance, n, &hdev->adv_instances, list) {
		disable_delayed_work_sync(&adv_instance->rpa_expired_cb);
		list_del(&adv_instance->list);
		kfree(adv_instance);
	}

	hdev->adv_instance_cnt = 0;
	hdev->cur_adv_instance = 0x00;
}

static void adv_instance_rpa_expired(struct work_struct *work)
{
	struct adv_info *adv_instance = container_of(work, struct adv_info,
						     rpa_expired_cb.work);

	BT_DBG("");

	adv_instance->rpa_expired = true;
}
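
/* A quick map of hci_add_adv_instance() parameters, distilled from the
 * code below: timeout bounds the instance's total lifetime, duration is
 * one rotation slot (0 selects def_multi_adv_rotation_duration), and a
 * non-zero mesh_handle switches the timing units from seconds to
 * milliseconds and ties the instance to the requested mesh_tx queue.
 * On a controller with a single advertising set, instance 1 maps to
 * handle 0x00; otherwise the instance number doubles as the handle.
 */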

/* This function requires the caller holds hdev->lock */
struct adv_info *hci_add_adv_instance(struct hci_dev *hdev, u8 instance,
				      u32 flags, u16 adv_data_len, u8 *adv_data,
				      u16 scan_rsp_len, u8 *scan_rsp_data,
				      u16 timeout, u16 duration, s8 tx_power,
				      u32 min_interval, u32 max_interval,
				      u8 mesh_handle)
{
	struct adv_info *adv;

	adv = hci_find_adv_instance(hdev, instance);
	if (adv) {
		memset(adv->adv_data, 0, sizeof(adv->adv_data));
		memset(adv->scan_rsp_data, 0, sizeof(adv->scan_rsp_data));
		memset(adv->per_adv_data, 0, sizeof(adv->per_adv_data));
	} else {
		if (hdev->adv_instance_cnt >= hdev->le_num_of_adv_sets ||
		    instance < 1 || instance > hdev->le_num_of_adv_sets + 1)
			return ERR_PTR(-EOVERFLOW);

		adv = kzalloc(sizeof(*adv), GFP_KERNEL);
		if (!adv)
			return ERR_PTR(-ENOMEM);

		adv->pending = true;
		adv->instance = instance;

		/* If controller supports only one set and the instance is
		 * set to 1 then there is no option other than using handle
		 * 0x00.
		 */
		if (hdev->le_num_of_adv_sets == 1 && instance == 1)
			adv->handle = 0x00;
		else
			adv->handle = instance;

		list_add(&adv->list, &hdev->adv_instances);
		hdev->adv_instance_cnt++;
	}

	adv->flags = flags;
	adv->min_interval = min_interval;
	adv->max_interval = max_interval;
	adv->tx_power = tx_power;
	/* Defining a mesh_handle changes the timing units to ms,
	 * rather than seconds, and ties the instance to the requested
	 * mesh_tx queue.
	 */
	adv->mesh = mesh_handle;

	hci_set_adv_instance_data(hdev, instance, adv_data_len, adv_data,
				  scan_rsp_len, scan_rsp_data);

	adv->timeout = timeout;
	adv->remaining_time = timeout;

	if (duration == 0)
		adv->duration = hdev->def_multi_adv_rotation_duration;
	else
		adv->duration = duration;

	INIT_DELAYED_WORK(&adv->rpa_expired_cb, adv_instance_rpa_expired);

	BT_DBG("%s for %d", hdev->name, instance);

	return adv;
}

/* This function requires the caller holds hdev->lock */
struct adv_info *hci_add_per_instance(struct hci_dev *hdev, u8 instance,
				      u32 flags, u8 data_len, u8 *data,
				      u32 min_interval, u32 max_interval)
{
	struct adv_info *adv;

	adv = hci_add_adv_instance(hdev, instance, flags, 0, NULL, 0, NULL,
				   0, 0, HCI_ADV_TX_POWER_NO_PREFERENCE,
				   min_interval, max_interval, 0);
	if (IS_ERR(adv))
		return adv;

	adv->periodic = true;
	adv->per_adv_data_len = data_len;

	if (data)
		memcpy(adv->per_adv_data, data, data_len);

	return adv;
}

/* This function requires the caller holds hdev->lock */
int hci_set_adv_instance_data(struct hci_dev *hdev, u8 instance,
			      u16 adv_data_len, u8 *adv_data,
			      u16 scan_rsp_len, u8 *scan_rsp_data)
{
	struct adv_info *adv;

	adv = hci_find_adv_instance(hdev, instance);

	/* If advertisement doesn't exist, we can't modify its data */
	if (!adv)
		return -ENOENT;

	if (adv_data_len && ADV_DATA_CMP(adv, adv_data, adv_data_len)) {
		memset(adv->adv_data, 0, sizeof(adv->adv_data));
		memcpy(adv->adv_data, adv_data, adv_data_len);
		adv->adv_data_len = adv_data_len;
		adv->adv_data_changed = true;
	}

	if (scan_rsp_len && SCAN_RSP_CMP(adv, scan_rsp_data, scan_rsp_len)) {
		memset(adv->scan_rsp_data, 0, sizeof(adv->scan_rsp_data));
		memcpy(adv->scan_rsp_data, scan_rsp_data, scan_rsp_len);
		adv->scan_rsp_len = scan_rsp_len;
		adv->scan_rsp_changed = true;
	}

	/* Mark as changed if there are flags which would affect it */
	if (((adv->flags & MGMT_ADV_FLAG_APPEARANCE) && hdev->appearance) ||
	    adv->flags & MGMT_ADV_FLAG_LOCAL_NAME)
		adv->scan_rsp_changed = true;

	return 0;
}

/* This function requires the caller holds hdev->lock */
u32 hci_adv_instance_flags(struct hci_dev *hdev, u8 instance)
{
	u32 flags;
	struct adv_info *adv;

	if (instance == 0x00) {
		/* Instance 0 always manages the "Tx Power" and "Flags"
		 * fields
		 */
		flags = MGMT_ADV_FLAG_TX_POWER | MGMT_ADV_FLAG_MANAGED_FLAGS;

		/* For instance 0, the HCI_ADVERTISING_CONNECTABLE setting
		 * corresponds to the "connectable" instance flag.
		 */
		if (hci_dev_test_flag(hdev, HCI_ADVERTISING_CONNECTABLE))
			flags |= MGMT_ADV_FLAG_CONNECTABLE;

		if (hci_dev_test_flag(hdev, HCI_LIMITED_DISCOVERABLE))
			flags |= MGMT_ADV_FLAG_LIMITED_DISCOV;
		else if (hci_dev_test_flag(hdev, HCI_DISCOVERABLE))
			flags |= MGMT_ADV_FLAG_DISCOV;

		return flags;
	}

	adv = hci_find_adv_instance(hdev, instance);

	/* Return 0 when we got an invalid instance identifier. */
	if (!adv)
		return 0;

	return adv->flags;
}

bool hci_adv_instance_is_scannable(struct hci_dev *hdev, u8 instance)
{
	struct adv_info *adv;

	/* Instance 0x00 always sets the local name */
	if (instance == 0x00)
		return true;

	adv = hci_find_adv_instance(hdev, instance);
	if (!adv)
		return false;

	if (adv->flags & MGMT_ADV_FLAG_APPEARANCE ||
	    adv->flags & MGMT_ADV_FLAG_LOCAL_NAME)
		return true;

	return adv->scan_rsp_len ? true : false;
}

/* This function requires the caller holds hdev->lock */
void hci_adv_monitors_clear(struct hci_dev *hdev)
{
	struct adv_monitor *monitor;
	int handle;

	idr_for_each_entry(&hdev->adv_monitors_idr, monitor, handle)
		hci_free_adv_monitor(hdev, monitor);

	idr_destroy(&hdev->adv_monitors_idr);
}

/* Frees the monitor structure and does some bookkeeping.
 * This function requires the caller holds hdev->lock.
 */
void hci_free_adv_monitor(struct hci_dev *hdev, struct adv_monitor *monitor)
{
	struct adv_pattern *pattern;
	struct adv_pattern *tmp;

	if (!monitor)
		return;

	list_for_each_entry_safe(pattern, tmp, &monitor->patterns, list) {
		list_del(&pattern->list);
		kfree(pattern);
	}

	if (monitor->handle)
		idr_remove(&hdev->adv_monitors_idr, monitor->handle);

	if (monitor->state != ADV_MONITOR_STATE_NOT_REGISTERED) {
		hdev->adv_monitors_cnt--;
		mgmt_adv_monitor_removed(hdev, monitor->handle);
	}

	kfree(monitor);
}

/* Assigns handle to a monitor, and if offloading is supported and power is on,
 * also attempts to forward the request to the controller.
 * This function requires the caller holds hci_req_sync_lock.
 */
int hci_add_adv_monitor(struct hci_dev *hdev, struct adv_monitor *monitor)
{
	int min, max, handle;
	int status = 0;

	if (!monitor)
		return -EINVAL;

	hci_dev_lock(hdev);

	min = HCI_MIN_ADV_MONITOR_HANDLE;
	max = HCI_MIN_ADV_MONITOR_HANDLE + HCI_MAX_ADV_MONITOR_NUM_HANDLES;
	handle = idr_alloc(&hdev->adv_monitors_idr, monitor, min, max,
			   GFP_KERNEL);

	hci_dev_unlock(hdev);

	if (handle < 0)
		return handle;

	monitor->handle = handle;

	if (!hdev_is_powered(hdev))
		return status;

	switch (hci_get_adv_monitor_offload_ext(hdev)) {
	case HCI_ADV_MONITOR_EXT_NONE:
		bt_dev_dbg(hdev, "add monitor %d status %d",
			   monitor->handle, status);
		/* Message was not forwarded to controller - not an error */
		break;

	case HCI_ADV_MONITOR_EXT_MSFT:
		status = msft_add_monitor_pattern(hdev, monitor);
		bt_dev_dbg(hdev, "add monitor %d msft status %d",
			   handle, status);
		break;
	}

	return status;
}

/* Attempts to tell the controller and free the monitor. If somehow the
 * controller doesn't have a corresponding handle, remove anyway.
 * This function requires the caller holds hci_req_sync_lock.
 */
static int hci_remove_adv_monitor(struct hci_dev *hdev,
				  struct adv_monitor *monitor)
{
	int status = 0;
	int handle;

	switch (hci_get_adv_monitor_offload_ext(hdev)) {
	case HCI_ADV_MONITOR_EXT_NONE: /* also goes here when powered off */
		bt_dev_dbg(hdev, "remove monitor %d status %d",
			   monitor->handle, status);
		goto free_monitor;

	case HCI_ADV_MONITOR_EXT_MSFT:
		handle = monitor->handle;
		status = msft_remove_monitor(hdev, monitor);
		bt_dev_dbg(hdev, "remove monitor %d msft status %d",
			   handle, status);
		break;
	}

	/* In case no matching handle registered, just free the monitor */
	if (status == -ENOENT)
		goto free_monitor;

	return status;

free_monitor:
	if (status == -ENOENT)
		bt_dev_warn(hdev, "Removing monitor with no matching handle %d",
			    monitor->handle);
	hci_free_adv_monitor(hdev, monitor);

	return status;
}

/* This function requires the caller holds hci_req_sync_lock */
int hci_remove_single_adv_monitor(struct hci_dev *hdev, u16 handle)
{
	struct adv_monitor *monitor = idr_find(&hdev->adv_monitors_idr, handle);

	if (!monitor)
		return -EINVAL;

	return hci_remove_adv_monitor(hdev, monitor);
}

/* This function requires the caller holds hci_req_sync_lock */
int hci_remove_all_adv_monitor(struct hci_dev *hdev)
{
	struct adv_monitor *monitor;
	int idr_next_id = 0;
	int status = 0;

	while (1) {
		monitor = idr_get_next(&hdev->adv_monitors_idr, &idr_next_id);
		if (!monitor)
			break;

		status = hci_remove_adv_monitor(hdev, monitor);
		if (status)
			return status;

		idr_next_id++;
	}

	return status;
}

/* This function requires the caller holds hdev->lock */
bool hci_is_adv_monitoring(struct hci_dev *hdev)
{
	return !idr_is_empty(&hdev->adv_monitors_idr);
}

int hci_get_adv_monitor_offload_ext(struct hci_dev *hdev)
{
	if (msft_monitor_supported(hdev))
		return HCI_ADV_MONITOR_EXT_MSFT;

	return HCI_ADV_MONITOR_EXT_NONE;
}

struct bdaddr_list *hci_bdaddr_list_lookup(struct list_head *bdaddr_list,
					   bdaddr_t *bdaddr, u8 type)
{
	struct bdaddr_list *b;

	list_for_each_entry(b, bdaddr_list, list) {
		if (!bacmp(&b->bdaddr, bdaddr) && b->bdaddr_type == type)
			return b;
	}

	return NULL;
}

struct bdaddr_list_with_irk *hci_bdaddr_list_lookup_with_irk(
				struct list_head *bdaddr_list,
				bdaddr_t *bdaddr, u8 type)
{
	struct bdaddr_list_with_irk *b;

	list_for_each_entry(b, bdaddr_list, list) {
		if (!bacmp(&b->bdaddr, bdaddr) && b->bdaddr_type == type)
			return b;
	}

	return NULL;
}

struct bdaddr_list_with_flags *
hci_bdaddr_list_lookup_with_flags(struct list_head *bdaddr_list,
				  bdaddr_t *bdaddr, u8 type)
{
	struct bdaddr_list_with_flags *b;

	list_for_each_entry(b, bdaddr_list, list) {
		if (!bacmp(&b->bdaddr, bdaddr) && b->bdaddr_type == type)
			return b;
	}

	return NULL;
}

void hci_bdaddr_list_clear(struct list_head *bdaddr_list)
{
	struct bdaddr_list *b, *n;

	list_for_each_entry_safe(b, n, bdaddr_list, list) {
		list_del(&b->list);
		kfree(b);
	}
}

int hci_bdaddr_list_add(struct list_head *list, bdaddr_t *bdaddr, u8 type)
{
	struct bdaddr_list *entry;

	if (!bacmp(bdaddr, BDADDR_ANY))
		return -EBADF;

	if (hci_bdaddr_list_lookup(list, bdaddr, type))
		return -EEXIST;

	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
	if (!entry)
		return -ENOMEM;

	bacpy(&entry->bdaddr, bdaddr);
	entry->bdaddr_type = type;

	list_add(&entry->list, list);

	return 0;
}
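
/* Convention shared by the hci_bdaddr_list_add*() and _del*() helpers
 * around this point: BDADDR_ANY is rejected with -EBADF on add,
 * duplicates return -EEXIST, and passing BDADDR_ANY to a del helper
 * clears the whole list. A minimal caller sketch:
 *
 *	err = hci_bdaddr_list_add(&hdev->reject_list, &bdaddr, BDADDR_BREDR);
 *	if (err == -EEXIST)
 *		err = 0;	(already present; often treated as success)
 */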

int hci_bdaddr_list_add_with_irk(struct list_head *list, bdaddr_t *bdaddr,
				 u8 type, u8 *peer_irk, u8 *local_irk)
{
	struct bdaddr_list_with_irk *entry;

	if (!bacmp(bdaddr, BDADDR_ANY))
		return -EBADF;

	if (hci_bdaddr_list_lookup(list, bdaddr, type))
		return -EEXIST;

	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
	if (!entry)
		return -ENOMEM;

	bacpy(&entry->bdaddr, bdaddr);
	entry->bdaddr_type = type;

	if (peer_irk)
		memcpy(entry->peer_irk, peer_irk, 16);

	if (local_irk)
		memcpy(entry->local_irk, local_irk, 16);

	list_add(&entry->list, list);

	return 0;
}

int hci_bdaddr_list_add_with_flags(struct list_head *list, bdaddr_t *bdaddr,
				   u8 type, u32 flags)
{
	struct bdaddr_list_with_flags *entry;

	if (!bacmp(bdaddr, BDADDR_ANY))
		return -EBADF;

	if (hci_bdaddr_list_lookup(list, bdaddr, type))
		return -EEXIST;

	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
	if (!entry)
		return -ENOMEM;

	bacpy(&entry->bdaddr, bdaddr);
	entry->bdaddr_type = type;
	entry->flags = flags;

	list_add(&entry->list, list);

	return 0;
}

int hci_bdaddr_list_del(struct list_head *list, bdaddr_t *bdaddr, u8 type)
{
	struct bdaddr_list *entry;

	if (!bacmp(bdaddr, BDADDR_ANY)) {
		hci_bdaddr_list_clear(list);
		return 0;
	}

	entry = hci_bdaddr_list_lookup(list, bdaddr, type);
	if (!entry)
		return -ENOENT;

	list_del(&entry->list);
	kfree(entry);

	return 0;
}

int hci_bdaddr_list_del_with_irk(struct list_head *list, bdaddr_t *bdaddr,
				 u8 type)
{
	struct bdaddr_list_with_irk *entry;

	if (!bacmp(bdaddr, BDADDR_ANY)) {
		hci_bdaddr_list_clear(list);
		return 0;
	}

	entry = hci_bdaddr_list_lookup_with_irk(list, bdaddr, type);
	if (!entry)
		return -ENOENT;

	list_del(&entry->list);
	kfree(entry);

	return 0;
}

/* This function requires the caller holds hdev->lock */
struct hci_conn_params *hci_conn_params_lookup(struct hci_dev *hdev,
					       bdaddr_t *addr, u8 addr_type)
{
	struct hci_conn_params *params;

	list_for_each_entry(params, &hdev->le_conn_params, list) {
		if (bacmp(&params->addr, addr) == 0 &&
		    params->addr_type == addr_type) {
			return params;
		}
	}

	return NULL;
}

/* This function requires the caller holds hdev->lock or rcu_read_lock */
struct hci_conn_params *hci_pend_le_action_lookup(struct list_head *list,
						  bdaddr_t *addr, u8 addr_type)
{
	struct hci_conn_params *param;

	rcu_read_lock();

	list_for_each_entry_rcu(param, list, action) {
		if (bacmp(&param->addr, addr) == 0 &&
		    param->addr_type == addr_type) {
			rcu_read_unlock();
			return param;
		}
	}

	rcu_read_unlock();

	return NULL;
}

/* This function requires the caller holds hdev->lock */
void hci_pend_le_list_del_init(struct hci_conn_params *param)
{
	if (list_empty(&param->action))
		return;

	list_del_rcu(&param->action);
	synchronize_rcu();
	INIT_LIST_HEAD(&param->action);
}

/* This function requires the caller holds hdev->lock */
void hci_pend_le_list_add(struct hci_conn_params *param,
			  struct list_head *list)
{
	list_add_rcu(&param->action, list);
}

/* This function requires the caller holds hdev->lock */
struct hci_conn_params *hci_conn_params_add(struct hci_dev *hdev,
					    bdaddr_t *addr, u8 addr_type)
{
	struct hci_conn_params *params;

	params = hci_conn_params_lookup(hdev, addr, addr_type);
	if (params)
		return params;

	params = kzalloc(sizeof(*params), GFP_KERNEL);
	if (!params) {
		bt_dev_err(hdev, "out of memory");
		return NULL;
	}

	bacpy(&params->addr, addr);
	params->addr_type = addr_type;

	list_add(&params->list, &hdev->le_conn_params);
	INIT_LIST_HEAD(&params->action);

	params->conn_min_interval = hdev->le_conn_min_interval;
	params->conn_max_interval = hdev->le_conn_max_interval;
	params->conn_latency = hdev->le_conn_latency;
	params->supervision_timeout = hdev->le_supv_timeout;
	params->auto_connect = HCI_AUTO_CONN_DISABLED;

	BT_DBG("addr %pMR (type %u)", addr, addr_type);

	return params;
}

void hci_conn_params_free(struct hci_conn_params *params)
{
	hci_pend_le_list_del_init(params);

	if (params->conn) {
		hci_conn_drop(params->conn);
		hci_conn_put(params->conn);
	}

	list_del(&params->list);
	kfree(params);
}

/* This function requires the caller holds hdev->lock */
void hci_conn_params_del(struct hci_dev *hdev, bdaddr_t *addr, u8 addr_type)
{
	struct hci_conn_params *params;

	params = hci_conn_params_lookup(hdev, addr, addr_type);
	if (!params)
		return;

	hci_conn_params_free(params);

	hci_update_passive_scan(hdev);

	BT_DBG("addr %pMR (type %u)", addr, addr_type);
}

/* This function requires the caller holds hdev->lock */
void hci_conn_params_clear_disabled(struct hci_dev *hdev)
{
	struct hci_conn_params *params, *tmp;

	list_for_each_entry_safe(params, tmp, &hdev->le_conn_params, list) {
		if (params->auto_connect != HCI_AUTO_CONN_DISABLED)
			continue;

		/* If trying to establish one time connection to disabled
		 * device, leave the params, but mark them as just once.
		 */
		if (params->explicit_connect) {
			params->auto_connect = HCI_AUTO_CONN_EXPLICIT;
			continue;
		}

		hci_conn_params_free(params);
	}

	BT_DBG("All LE disabled connection parameters were removed");
}

/* This function requires the caller holds hdev->lock */
static void hci_conn_params_clear_all(struct hci_dev *hdev)
{
	struct hci_conn_params *params, *tmp;

	list_for_each_entry_safe(params, tmp, &hdev->le_conn_params, list)
		hci_conn_params_free(params);

	BT_DBG("All LE connection parameters were removed");
}

/* Copy the Identity Address of the controller.
 *
 * If the controller has a public BD_ADDR, then by default use that one.
 * If this is a LE only controller without a public address, default to
 * the static random address.
 *
 * For debugging purposes it is possible to force controllers with a
 * public address to use the static random address instead.
 *
 * In case BR/EDR has been disabled on a dual-mode controller and
 * userspace has configured a static address, then that address
 * becomes the identity address instead of the public BR/EDR address.
 */
void hci_copy_identity_address(struct hci_dev *hdev, bdaddr_t *bdaddr,
			       u8 *bdaddr_type)
{
	if (hci_dev_test_flag(hdev, HCI_FORCE_STATIC_ADDR) ||
	    !bacmp(&hdev->bdaddr, BDADDR_ANY) ||
	    (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED) &&
	     bacmp(&hdev->static_addr, BDADDR_ANY))) {
		bacpy(bdaddr, &hdev->static_addr);
		*bdaddr_type = ADDR_LE_DEV_RANDOM;
	} else {
		bacpy(bdaddr, &hdev->bdaddr);
		*bdaddr_type = ADDR_LE_DEV_PUBLIC;
	}
}

static void hci_clear_wake_reason(struct hci_dev *hdev)
{
	hci_dev_lock(hdev);

	hdev->wake_reason = 0;
	bacpy(&hdev->wake_addr, BDADDR_ANY);
	hdev->wake_addr_type = 0;

	hci_dev_unlock(hdev);
}

static int hci_suspend_notifier(struct notifier_block *nb, unsigned long action,
				void *data)
{
	struct hci_dev *hdev =
		container_of(nb, struct hci_dev, suspend_notifier);
	int ret = 0;

	/* Userspace has full control of this device. Do nothing. */
	if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL))
		return NOTIFY_DONE;

	/* To avoid a potential race with hci_unregister_dev. */
	hci_dev_hold(hdev);

	switch (action) {
	case PM_HIBERNATION_PREPARE:
	case PM_SUSPEND_PREPARE:
		ret = hci_suspend_dev(hdev);
		break;
	case PM_POST_HIBERNATION:
	case PM_POST_SUSPEND:
		ret = hci_resume_dev(hdev);
		break;
	}

	if (ret)
		bt_dev_err(hdev, "Suspend notifier action (%lu) failed: %d",
			   action, ret);

	hci_dev_put(hdev);
	return NOTIFY_DONE;
}
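
/* Unit note for the defaults below: BR/EDR and LE advertising/scan
 * interval fields are in 0.625 ms slots, so def_page_scan_int = 0x0800
 * = 2048 slots = 1.28 s (matching the "1.28 sec page scan" comment) and
 * le_adv_min_interval = 0x0800 is likewise 1.28 s. LE connection
 * intervals use 1.25 ms units (0x0018 = 30 ms) and the supervision
 * timeout uses 10 ms units (0x002a = 420 ms).
 */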

/* Alloc HCI device */
struct hci_dev *hci_alloc_dev_priv(int sizeof_priv)
{
	struct hci_dev *hdev;
	unsigned int alloc_size;

	alloc_size = sizeof(*hdev);
	if (sizeof_priv) {
		/* Fixme: May need ALIGN-ment? */
		alloc_size += sizeof_priv;
	}

	hdev = kzalloc(alloc_size, GFP_KERNEL);
	if (!hdev)
		return NULL;

	hdev->pkt_type  = (HCI_DM1 | HCI_DH1 | HCI_HV1);
	hdev->esco_type = (ESCO_HV1);
	hdev->link_mode = (HCI_LM_ACCEPT);
	hdev->num_iac = 0x01;		/* One IAC support is mandatory */
	hdev->io_capability = 0x03;	/* No Input No Output */
	hdev->manufacturer = 0xffff;	/* Default to internal use */
	hdev->inq_tx_power = HCI_TX_POWER_INVALID;
	hdev->adv_tx_power = HCI_TX_POWER_INVALID;
	hdev->adv_instance_cnt = 0;
	hdev->cur_adv_instance = 0x00;
	hdev->adv_instance_timeout = 0;

	hdev->advmon_allowlist_duration = 300;
	hdev->advmon_no_filter_duration = 500;
	hdev->enable_advmon_interleave_scan = 0x00;	/* Default to disable */

	hdev->sniff_max_interval = 800;
	hdev->sniff_min_interval = 80;

	hdev->le_adv_channel_map = 0x07;
	hdev->le_adv_min_interval = 0x0800;
	hdev->le_adv_max_interval = 0x0800;
	hdev->le_scan_interval = DISCOV_LE_SCAN_INT_FAST;
	hdev->le_scan_window = DISCOV_LE_SCAN_WIN_FAST;
	hdev->le_scan_int_suspend = DISCOV_LE_SCAN_INT_SLOW1;
	hdev->le_scan_window_suspend = DISCOV_LE_SCAN_WIN_SLOW1;
	hdev->le_scan_int_discovery = DISCOV_LE_SCAN_INT;
	hdev->le_scan_window_discovery = DISCOV_LE_SCAN_WIN;
	hdev->le_scan_int_adv_monitor = DISCOV_LE_SCAN_INT_FAST;
	hdev->le_scan_window_adv_monitor = DISCOV_LE_SCAN_WIN_FAST;
	hdev->le_scan_int_connect = DISCOV_LE_SCAN_INT_CONN;
	hdev->le_scan_window_connect = DISCOV_LE_SCAN_WIN_CONN;
	hdev->le_conn_min_interval = 0x0018;
	hdev->le_conn_max_interval = 0x0028;
	hdev->le_conn_latency = 0x0000;
	hdev->le_supv_timeout = 0x002a;
	hdev->le_def_tx_len = 0x001b;
	hdev->le_def_tx_time = 0x0148;
	hdev->le_max_tx_len = 0x001b;
	hdev->le_max_tx_time = 0x0148;
	hdev->le_max_rx_len = 0x001b;
	hdev->le_max_rx_time = 0x0148;
	hdev->le_max_key_size = SMP_MAX_ENC_KEY_SIZE;
	hdev->le_min_key_size = SMP_MIN_ENC_KEY_SIZE;
	hdev->le_tx_def_phys = HCI_LE_SET_PHY_1M;
	hdev->le_rx_def_phys = HCI_LE_SET_PHY_1M;
	hdev->le_num_of_adv_sets = HCI_MAX_ADV_INSTANCES;
	hdev->def_multi_adv_rotation_duration = HCI_DEFAULT_ADV_DURATION;
	hdev->def_le_autoconnect_timeout = HCI_LE_CONN_TIMEOUT;
	hdev->min_le_tx_power = HCI_TX_POWER_INVALID;
	hdev->max_le_tx_power = HCI_TX_POWER_INVALID;

	hdev->rpa_timeout = HCI_DEFAULT_RPA_TIMEOUT;
	hdev->discov_interleaved_timeout = DISCOV_INTERLEAVED_TIMEOUT;
	hdev->conn_info_min_age = DEFAULT_CONN_INFO_MIN_AGE;
	hdev->conn_info_max_age = DEFAULT_CONN_INFO_MAX_AGE;
	hdev->auth_payload_timeout = DEFAULT_AUTH_PAYLOAD_TIMEOUT;
	hdev->min_enc_key_size = HCI_MIN_ENC_KEY_SIZE;

	/* default 1.28 sec page scan */
	hdev->def_page_scan_type = PAGE_SCAN_TYPE_STANDARD;
	hdev->def_page_scan_int = 0x0800;
	hdev->def_page_scan_window = 0x0012;

	mutex_init(&hdev->lock);
	mutex_init(&hdev->req_lock);

	ida_init(&hdev->unset_handle_ida);

	INIT_LIST_HEAD(&hdev->mesh_pending);
	INIT_LIST_HEAD(&hdev->mgmt_pending);
	INIT_LIST_HEAD(&hdev->reject_list);
	INIT_LIST_HEAD(&hdev->accept_list);
	INIT_LIST_HEAD(&hdev->uuids);
	INIT_LIST_HEAD(&hdev->link_keys);
	INIT_LIST_HEAD(&hdev->long_term_keys);
	INIT_LIST_HEAD(&hdev->identity_resolving_keys);
	INIT_LIST_HEAD(&hdev->remote_oob_data);
	INIT_LIST_HEAD(&hdev->le_accept_list);
	INIT_LIST_HEAD(&hdev->le_resolv_list);
	INIT_LIST_HEAD(&hdev->le_conn_params);
	INIT_LIST_HEAD(&hdev->pend_le_conns);
	INIT_LIST_HEAD(&hdev->pend_le_reports);
	INIT_LIST_HEAD(&hdev->conn_hash.list);
	INIT_LIST_HEAD(&hdev->adv_instances);
	INIT_LIST_HEAD(&hdev->blocked_keys);
	INIT_LIST_HEAD(&hdev->monitored_devices);

	INIT_LIST_HEAD(&hdev->local_codecs);

	INIT_WORK(&hdev->rx_work, hci_rx_work);
	INIT_WORK(&hdev->cmd_work, hci_cmd_work);
	INIT_WORK(&hdev->tx_work, hci_tx_work);
	INIT_WORK(&hdev->power_on, hci_power_on);
	INIT_WORK(&hdev->error_reset, hci_error_reset);

	hci_cmd_sync_init(hdev);

	INIT_DELAYED_WORK(&hdev->power_off, hci_power_off);

	skb_queue_head_init(&hdev->rx_q);
	skb_queue_head_init(&hdev->cmd_q);
	skb_queue_head_init(&hdev->raw_q);

	init_waitqueue_head(&hdev->req_wait_q);

	INIT_DELAYED_WORK(&hdev->cmd_timer, hci_cmd_timeout);
	INIT_DELAYED_WORK(&hdev->ncmd_timer, hci_ncmd_timeout);

	hci_devcd_setup(hdev);

	hci_init_sysfs(hdev);
	discovery_init(hdev);

	return hdev;
}
EXPORT_SYMBOL(hci_alloc_dev_priv);

/* Free HCI device */
void hci_free_dev(struct hci_dev *hdev)
{
	/* will free via device release */
	put_device(&hdev->dev);
}
EXPORT_SYMBOL(hci_free_dev);

/* Register HCI device */
int hci_register_dev(struct hci_dev *hdev)
{
	int id, error;

	if (!hdev->open || !hdev->close || !hdev->send)
		return -EINVAL;

	id = ida_alloc_max(&hci_index_ida, HCI_MAX_ID - 1, GFP_KERNEL);
	if (id < 0)
		return id;

	error = dev_set_name(&hdev->dev, "hci%u", id);
	if (error)
		return error;

	hdev->name = dev_name(&hdev->dev);
	hdev->id = id;

	BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);

	hdev->workqueue = alloc_ordered_workqueue("%s", WQ_HIGHPRI, hdev->name);
	if (!hdev->workqueue) {
		error = -ENOMEM;
		goto err;
	}

	hdev->req_workqueue = alloc_ordered_workqueue("%s", WQ_HIGHPRI,
						      hdev->name);
	if (!hdev->req_workqueue) {
		destroy_workqueue(hdev->workqueue);
		error = -ENOMEM;
		goto err;
	}

	if (!IS_ERR_OR_NULL(bt_debugfs))
		hdev->debugfs = debugfs_create_dir(hdev->name, bt_debugfs);

	error = device_add(&hdev->dev);
	if (error < 0)
		goto err_wqueue;

	hci_leds_init(hdev);

	hdev->rfkill = rfkill_alloc(hdev->name, &hdev->dev,
				    RFKILL_TYPE_BLUETOOTH, &hci_rfkill_ops,
				    hdev);
	if (hdev->rfkill) {
		if (rfkill_register(hdev->rfkill) < 0) {
			rfkill_destroy(hdev->rfkill);
			hdev->rfkill = NULL;
		}
	}

	if (hdev->rfkill && rfkill_blocked(hdev->rfkill))
		hci_dev_set_flag(hdev, HCI_RFKILLED);

	hci_dev_set_flag(hdev, HCI_SETUP);
	hci_dev_set_flag(hdev, HCI_AUTO_OFF);

	/* Assume BR/EDR support until proven otherwise (such as
	 * through reading supported features during init).
	 */
	hci_dev_set_flag(hdev, HCI_BREDR_ENABLED);

	write_lock(&hci_dev_list_lock);
	list_add(&hdev->list, &hci_dev_list);
	write_unlock(&hci_dev_list_lock);

	/* Devices that are marked for raw-only usage are unconfigured
	 * and should not be included in normal operation.
	 */
	if (test_bit(HCI_QUIRK_RAW_DEVICE, &hdev->quirks))
		hci_dev_set_flag(hdev, HCI_UNCONFIGURED);

	/* Mark Remote Wakeup connection flag as supported if driver has wakeup
	 * callback.
	 */
	if (hdev->wakeup)
		hdev->conn_flags |= HCI_CONN_FLAG_REMOTE_WAKEUP;

	hci_sock_dev_event(hdev, HCI_DEV_REG);
	hci_dev_hold(hdev);

	error = hci_register_suspend_notifier(hdev);
	if (error)
		BT_WARN("register suspend notifier failed error:%d\n", error);

	queue_work(hdev->req_workqueue, &hdev->power_on);

	idr_init(&hdev->adv_monitors_idr);
	msft_register(hdev);

	return id;

err_wqueue:
	debugfs_remove_recursive(hdev->debugfs);
	destroy_workqueue(hdev->workqueue);
	destroy_workqueue(hdev->req_workqueue);
err:
	ida_free(&hci_index_ida, hdev->id);

	return error;
}
EXPORT_SYMBOL(hci_register_dev);

/* Unregister HCI device */
void hci_unregister_dev(struct hci_dev *hdev)
{
	BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);

	mutex_lock(&hdev->unregister_lock);
	hci_dev_set_flag(hdev, HCI_UNREGISTER);
	mutex_unlock(&hdev->unregister_lock);

	write_lock(&hci_dev_list_lock);
	list_del(&hdev->list);
	write_unlock(&hci_dev_list_lock);

	disable_work_sync(&hdev->rx_work);
	disable_work_sync(&hdev->cmd_work);
	disable_work_sync(&hdev->tx_work);
	disable_work_sync(&hdev->power_on);
	disable_work_sync(&hdev->error_reset);

	hci_cmd_sync_clear(hdev);

	hci_unregister_suspend_notifier(hdev);

	hci_dev_do_close(hdev);

	if (!test_bit(HCI_INIT, &hdev->flags) &&
	    !hci_dev_test_flag(hdev, HCI_SETUP) &&
	    !hci_dev_test_flag(hdev, HCI_CONFIG)) {
		hci_dev_lock(hdev);
		mgmt_index_removed(hdev);
		hci_dev_unlock(hdev);
	}

	/* mgmt_index_removed should take care of emptying the
	 * pending list
	 */
	BUG_ON(!list_empty(&hdev->mgmt_pending));

	hci_sock_dev_event(hdev, HCI_DEV_UNREG);

	if (hdev->rfkill) {
		rfkill_unregister(hdev->rfkill);
		rfkill_destroy(hdev->rfkill);
	}

	device_del(&hdev->dev);
	/* Actual cleanup is deferred until hci_release_dev(). */
	hci_dev_put(hdev);
}
EXPORT_SYMBOL(hci_unregister_dev);

/* Release HCI device */
void hci_release_dev(struct hci_dev *hdev)
{
	debugfs_remove_recursive(hdev->debugfs);
	kfree_const(hdev->hw_info);
	kfree_const(hdev->fw_info);

	destroy_workqueue(hdev->workqueue);
	destroy_workqueue(hdev->req_workqueue);

	hci_dev_lock(hdev);
	hci_bdaddr_list_clear(&hdev->reject_list);
	hci_bdaddr_list_clear(&hdev->accept_list);
	hci_uuids_clear(hdev);
	hci_link_keys_clear(hdev);
	hci_smp_ltks_clear(hdev);
	hci_smp_irks_clear(hdev);
	hci_remote_oob_data_clear(hdev);
	hci_adv_instances_clear(hdev);
	hci_adv_monitors_clear(hdev);
	hci_bdaddr_list_clear(&hdev->le_accept_list);
	hci_bdaddr_list_clear(&hdev->le_resolv_list);
	hci_conn_params_clear_all(hdev);
	hci_discovery_filter_clear(hdev);
	hci_blocked_keys_clear(hdev);
	hci_codec_list_clear(&hdev->local_codecs);
	msft_release(hdev);
	hci_dev_unlock(hdev);

	ida_destroy(&hdev->unset_handle_ida);
	ida_free(&hci_index_ida, hdev->id);
	kfree_skb(hdev->sent_cmd);
	kfree_skb(hdev->req_skb);
	kfree_skb(hdev->recv_event);
	kfree(hdev);
}
EXPORT_SYMBOL(hci_release_dev);

int hci_register_suspend_notifier(struct hci_dev *hdev)
{
	int ret = 0;

	if (!hdev->suspend_notifier.notifier_call &&
	    !test_bit(HCI_QUIRK_NO_SUSPEND_NOTIFIER, &hdev->quirks)) {
		hdev->suspend_notifier.notifier_call = hci_suspend_notifier;
		ret = register_pm_notifier(&hdev->suspend_notifier);
	}

	return ret;
}

int hci_unregister_suspend_notifier(struct hci_dev *hdev)
{
	int ret = 0;

	if (hdev->suspend_notifier.notifier_call) {
		ret = unregister_pm_notifier(&hdev->suspend_notifier);
		if (!ret)
			hdev->suspend_notifier.notifier_call = NULL;
	}

	return ret;
}
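
/* Lifecycle note: hci_unregister_dev() only detaches the device and
 * drops the registration reference; the actual teardown in
 * hci_release_dev() runs once the last hci_dev_put() brings the
 * refcount to zero, which is why the allocations above are freed there
 * and not in unregister.
 */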

/* Cancel ongoing command synchronously:
 *
 * - Cancel command timer
 * - Reset command counter
 * - Cancel command request
 */
static void hci_cancel_cmd_sync(struct hci_dev *hdev, int err)
{
	bt_dev_dbg(hdev, "err 0x%2.2x", err);

	if (hci_dev_test_flag(hdev, HCI_UNREGISTER)) {
		disable_delayed_work_sync(&hdev->cmd_timer);
		disable_delayed_work_sync(&hdev->ncmd_timer);
	} else {
		cancel_delayed_work_sync(&hdev->cmd_timer);
		cancel_delayed_work_sync(&hdev->ncmd_timer);
	}

	atomic_set(&hdev->cmd_cnt, 1);
	hci_cmd_sync_cancel_sync(hdev, err);
}

/* Suspend HCI device */
int hci_suspend_dev(struct hci_dev *hdev)
{
	int ret;

	bt_dev_dbg(hdev, "");

	/* Suspend should only act when powered. */
	if (!hdev_is_powered(hdev) ||
	    hci_dev_test_flag(hdev, HCI_UNREGISTER))
		return 0;

	/* If powering down don't attempt to suspend */
	if (mgmt_powering_down(hdev))
		return 0;

	/* Cancel potentially blocking sync operation before suspend */
	hci_cancel_cmd_sync(hdev, EHOSTDOWN);

	hci_req_sync_lock(hdev);
	ret = hci_suspend_sync(hdev);
	hci_req_sync_unlock(hdev);

	hci_clear_wake_reason(hdev);
	mgmt_suspending(hdev, hdev->suspend_state);

	hci_sock_dev_event(hdev, HCI_DEV_SUSPEND);
	return ret;
}
EXPORT_SYMBOL(hci_suspend_dev);

/* Resume HCI device */
int hci_resume_dev(struct hci_dev *hdev)
{
	int ret;

	bt_dev_dbg(hdev, "");

	/* Resume should only act when powered. */
	if (!hdev_is_powered(hdev) ||
	    hci_dev_test_flag(hdev, HCI_UNREGISTER))
		return 0;

	/* If powering down don't attempt to resume */
	if (mgmt_powering_down(hdev))
		return 0;

	hci_req_sync_lock(hdev);
	ret = hci_resume_sync(hdev);
	hci_req_sync_unlock(hdev);

	mgmt_resuming(hdev, hdev->wake_reason, &hdev->wake_addr,
		      hdev->wake_addr_type);

	hci_sock_dev_event(hdev, HCI_DEV_RESUME);
	return ret;
}
EXPORT_SYMBOL(hci_resume_dev);

/* Reset HCI device */
int hci_reset_dev(struct hci_dev *hdev)
{
	static const u8 hw_err[] = { HCI_EV_HARDWARE_ERROR, 0x01, 0x00 };
	struct sk_buff *skb;

	skb = bt_skb_alloc(3, GFP_ATOMIC);
	if (!skb)
		return -ENOMEM;

	hci_skb_pkt_type(skb) = HCI_EVENT_PKT;
	skb_put_data(skb, hw_err, 3);

	bt_dev_err(hdev, "Injecting HCI hardware error event");

	/* Send Hardware Error to upper stack */
	return hci_recv_frame(hdev, skb);
}
EXPORT_SYMBOL(hci_reset_dev);

static u8 hci_dev_classify_pkt_type(struct hci_dev *hdev, struct sk_buff *skb)
{
	if (hdev->classify_pkt_type)
		return hdev->classify_pkt_type(hdev, skb);

	return hci_skb_pkt_type(skb);
}

/* Receive frame from HCI drivers */
int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb)
{
	u8 dev_pkt_type;

	if (!hdev || (!test_bit(HCI_UP, &hdev->flags) &&
		      !test_bit(HCI_INIT, &hdev->flags))) {
		kfree_skb(skb);
		return -ENXIO;
	}

	/* Check if the driver agrees with the packet type classification */
	dev_pkt_type = hci_dev_classify_pkt_type(hdev, skb);
	if (hci_skb_pkt_type(skb) != dev_pkt_type)
		hci_skb_pkt_type(skb) = dev_pkt_type;

	switch (hci_skb_pkt_type(skb)) {
	case HCI_EVENT_PKT:
		break;
	case HCI_ACLDATA_PKT:
		/* Detect if ISO packet has been sent as ACL */
		if (hci_conn_num(hdev, ISO_LINK)) {
			__u16 handle = __le16_to_cpu(hci_acl_hdr(skb)->handle);
			__u8 type;

			type = hci_conn_lookup_type(hdev, hci_handle(handle));
			if (type == ISO_LINK)
				hci_skb_pkt_type(skb) = HCI_ISODATA_PKT;
		}
		break;
	case HCI_SCODATA_PKT:
		break;
	case HCI_ISODATA_PKT:
		break;
	default:
		kfree_skb(skb);
		return -EINVAL;
	}

	/* Incoming skb */
	bt_cb(skb)->incoming = 1;

	/* Time stamp */
	__net_timestamp(skb);

	skb_queue_tail(&hdev->rx_q, skb);
	queue_work(hdev->workqueue, &hdev->rx_work);

	return 0;
}
EXPORT_SYMBOL(hci_recv_frame);
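
/* Typical driver-side usage of hci_recv_frame(), as a minimal sketch
 * (the buffer source is hypothetical; only the skb setup and the call
 * itself follow this API, mirroring hci_reset_dev() above):
 *
 *	skb = bt_skb_alloc(len, GFP_ATOMIC);
 *	if (!skb)
 *		return -ENOMEM;
 *	hci_skb_pkt_type(skb) = HCI_EVENT_PKT;
 *	skb_put_data(skb, buf, len);
 *	return hci_recv_frame(hdev, skb);
 */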

/* Receive diagnostic message from HCI drivers */
int hci_recv_diag(struct hci_dev *hdev, struct sk_buff *skb)
{
	/* Mark as diagnostic packet */
	hci_skb_pkt_type(skb) = HCI_DIAG_PKT;

	/* Time stamp */
	__net_timestamp(skb);

	skb_queue_tail(&hdev->rx_q, skb);
	queue_work(hdev->workqueue, &hdev->rx_work);

	return 0;
}
EXPORT_SYMBOL(hci_recv_diag);

void hci_set_hw_info(struct hci_dev *hdev, const char *fmt, ...)
{
	va_list vargs;

	va_start(vargs, fmt);
	kfree_const(hdev->hw_info);
	hdev->hw_info = kvasprintf_const(GFP_KERNEL, fmt, vargs);
	va_end(vargs);
}
EXPORT_SYMBOL(hci_set_hw_info);

void hci_set_fw_info(struct hci_dev *hdev, const char *fmt, ...)
{
	va_list vargs;

	va_start(vargs, fmt);
	kfree_const(hdev->fw_info);
	hdev->fw_info = kvasprintf_const(GFP_KERNEL, fmt, vargs);
	va_end(vargs);
}
EXPORT_SYMBOL(hci_set_fw_info);

/* ---- Interface to upper protocols ---- */

int hci_register_cb(struct hci_cb *cb)
{
	BT_DBG("%p name %s", cb, cb->name);

	mutex_lock(&hci_cb_list_lock);
	list_add_tail(&cb->list, &hci_cb_list);
	mutex_unlock(&hci_cb_list_lock);

	return 0;
}
EXPORT_SYMBOL(hci_register_cb);

int hci_unregister_cb(struct hci_cb *cb)
{
	BT_DBG("%p name %s", cb, cb->name);

	mutex_lock(&hci_cb_list_lock);
	list_del(&cb->list);
	mutex_unlock(&hci_cb_list_lock);

	return 0;
}
EXPORT_SYMBOL(hci_unregister_cb);

static int hci_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
{
	int err;

	BT_DBG("%s type %d len %d", hdev->name, hci_skb_pkt_type(skb),
	       skb->len);

	/* Time stamp */
	__net_timestamp(skb);

	/* Send copy to monitor */
	hci_send_to_monitor(hdev, skb);

	if (atomic_read(&hdev->promisc)) {
		/* Send copy to the sockets */
		hci_send_to_sock(hdev, skb);
	}

	/* Get rid of skb owner, prior to sending to the driver. */
	skb_orphan(skb);

	if (!test_bit(HCI_RUNNING, &hdev->flags)) {
		kfree_skb(skb);
		return -EINVAL;
	}

	err = hdev->send(hdev, skb);
	if (err < 0) {
		bt_dev_err(hdev, "sending frame failed (%d)", err);
		kfree_skb(skb);
		return err;
	}

	return 0;
}

static int hci_send_conn_frame(struct hci_dev *hdev, struct hci_conn *conn,
			       struct sk_buff *skb)
{
	hci_conn_tx_queue(conn, skb);
	return hci_send_frame(hdev, skb);
}

/* Send HCI command */
int hci_send_cmd(struct hci_dev *hdev, __u16 opcode, __u32 plen,
		 const void *param)
{
	struct sk_buff *skb;

	BT_DBG("%s opcode 0x%4.4x plen %d", hdev->name, opcode, plen);

	skb = hci_cmd_sync_alloc(hdev, opcode, plen, param, NULL);
	if (!skb) {
		bt_dev_err(hdev, "no memory for command");
		return -ENOMEM;
	}

	/* Stand-alone HCI commands must be flagged as
	 * single-command requests.
	 */
	bt_cb(skb)->hci.req_flags |= HCI_REQ_START;

	skb_queue_tail(&hdev->cmd_q, skb);
	queue_work(hdev->workqueue, &hdev->cmd_work);

	return 0;
}

int __hci_cmd_send(struct hci_dev *hdev, u16 opcode, u32 plen,
		   const void *param)
{
	struct sk_buff *skb;

	if (hci_opcode_ogf(opcode) != 0x3f) {
		/* A controller receiving a command shall respond with either
		 * a Command Status Event or a Command Complete Event.
		 * Therefore, all standard HCI commands must be sent via the
		 * standard API, using hci_send_cmd or hci_cmd_sync helpers.
		 * Some vendors do not comply with this rule for vendor-specific
		 * commands and do not return any event. We want to support
		 * unresponded commands for such cases only.
		 */
		bt_dev_err(hdev, "unresponded command not supported");
		return -EINVAL;
	}

	skb = hci_cmd_sync_alloc(hdev, opcode, plen, param, NULL);
	if (!skb) {
		bt_dev_err(hdev, "no memory for command (opcode 0x%4.4x)",
			   opcode);
		return -ENOMEM;
	}

	hci_send_frame(hdev, skb);

	return 0;
}
EXPORT_SYMBOL(__hci_cmd_send);

/* Get data from the previously sent command */
static void *hci_cmd_data(struct sk_buff *skb, __u16 opcode)
{
	struct hci_command_hdr *hdr;

	if (!skb || skb->len < HCI_COMMAND_HDR_SIZE)
		return NULL;

	hdr = (void *)skb->data;

	if (hdr->opcode != cpu_to_le16(opcode))
		return NULL;

	return skb->data + HCI_COMMAND_HDR_SIZE;
}

/* Get data from the previously sent command */
void *hci_sent_cmd_data(struct hci_dev *hdev, __u16 opcode)
{
	void *data;

	/* Check if opcode matches last sent command */
	data = hci_cmd_data(hdev->sent_cmd, opcode);
	if (!data)
		/* Check if opcode matches last request */
		data = hci_cmd_data(hdev->req_skb, opcode);

	return data;
}

/* Get data from last received event */
void *hci_recv_event_data(struct hci_dev *hdev, __u8 event)
{
	struct hci_event_hdr *hdr;
	int offset;

	if (!hdev->recv_event)
		return NULL;

	hdr = (void *)hdev->recv_event->data;
	offset = sizeof(*hdr);

	if (hdr->evt != event) {
		/* In case of LE metaevent check the subevent match */
		if (hdr->evt == HCI_EV_LE_META) {
			struct hci_ev_le_meta *ev;

			ev = (void *)hdev->recv_event->data + offset;
			offset += sizeof(*ev);
			if (ev->subevent == event)
				goto found;
		}
		return NULL;
	}

found:
	bt_dev_dbg(hdev, "event 0x%2.2x", event);

	return hdev->recv_event->data + offset;
}

/* Send ACL data */
static void hci_add_acl_hdr(struct sk_buff *skb, __u16 handle, __u16 flags)
{
	struct hci_acl_hdr *hdr;
	int len = skb->len;

	skb_push(skb, HCI_ACL_HDR_SIZE);
	skb_reset_transport_header(skb);
	hdr = (struct hci_acl_hdr *)skb_transport_header(skb);
	hdr->handle = cpu_to_le16(hci_handle_pack(handle, flags));
	hdr->dlen   = cpu_to_le16(len);
}

static void hci_queue_acl(struct hci_chan *chan, struct sk_buff_head *queue,
			  struct sk_buff *skb, __u16 flags)
{
	struct hci_conn *conn = chan->conn;
	struct hci_dev *hdev = conn->hdev;
	struct sk_buff *list;

	skb->len = skb_headlen(skb);
	skb->data_len = 0;

	hci_skb_pkt_type(skb) = HCI_ACLDATA_PKT;

	hci_add_acl_hdr(skb, conn->handle, flags);

	list = skb_shinfo(skb)->frag_list;
	if (!list) {
		/* Non fragmented */
		BT_DBG("%s nonfrag skb %p len %d", hdev->name, skb, skb->len);

		skb_queue_tail(queue, skb);
	} else {
		/* Fragmented */
		BT_DBG("%s frag %p len %d", hdev->name, skb, skb->len);

		skb_shinfo(skb)->frag_list = NULL;

		/* Queue all fragments atomically. We need to use spin_lock_bh
		 * here because of 6LoWPAN links, as there this function is
		 * called from softirq and using normal spin lock could cause
		 * deadlocks.
		 */
		spin_lock_bh(&queue->lock);

		__skb_queue_tail(queue, skb);

		flags &= ~ACL_START;
		flags |= ACL_CONT;
		do {
			skb = list;
			list = list->next;

			hci_skb_pkt_type(skb) = HCI_ACLDATA_PKT;
			hci_add_acl_hdr(skb, conn->handle, flags);

			BT_DBG("%s frag %p len %d", hdev->name, skb, skb->len);

			__skb_queue_tail(queue, skb);
		} while (list);

		spin_unlock_bh(&queue->lock);
	}
}

void hci_send_acl(struct hci_chan *chan, struct sk_buff *skb, __u16 flags)
{
	struct hci_dev *hdev = chan->conn->hdev;

	BT_DBG("%s chan %p flags 0x%4.4x", hdev->name, chan, flags);

	hci_queue_acl(chan, &chan->data_q, skb, flags);

	queue_work(hdev->workqueue, &hdev->tx_work);
}

/* Send SCO data */
void hci_send_sco(struct hci_conn *conn, struct sk_buff *skb)
{
	struct hci_dev *hdev = conn->hdev;
	struct hci_sco_hdr hdr;

	BT_DBG("%s len %d", hdev->name, skb->len);

	hdr.handle = cpu_to_le16(conn->handle);
	hdr.dlen   = skb->len;

	skb_push(skb, HCI_SCO_HDR_SIZE);
	skb_reset_transport_header(skb);
	memcpy(skb_transport_header(skb), &hdr, HCI_SCO_HDR_SIZE);

	hci_skb_pkt_type(skb) = HCI_SCODATA_PKT;

	skb_queue_tail(&conn->data_q, skb);
	queue_work(hdev->workqueue, &hdev->tx_work);
}

/* Send ISO data */
static void hci_add_iso_hdr(struct sk_buff *skb, __u16 handle, __u8 flags)
{
	struct hci_iso_hdr *hdr;
	int len = skb->len;

	skb_push(skb, HCI_ISO_HDR_SIZE);
	skb_reset_transport_header(skb);
	hdr = (struct hci_iso_hdr *)skb_transport_header(skb);
	hdr->handle = cpu_to_le16(hci_handle_pack(handle, flags));
	hdr->dlen   = cpu_to_le16(len);
}

static void hci_queue_iso(struct hci_conn *conn, struct sk_buff_head *queue,
			  struct sk_buff *skb)
{
	struct hci_dev *hdev = conn->hdev;
	struct sk_buff *list;
	__u16 flags;

	skb->len = skb_headlen(skb);
	skb->data_len = 0;

	hci_skb_pkt_type(skb) = HCI_ISODATA_PKT;

	list = skb_shinfo(skb)->frag_list;

	flags = hci_iso_flags_pack(list ? ISO_START : ISO_SINGLE, 0x00);
	hci_add_iso_hdr(skb, conn->handle, flags);

	if (!list) {
		/* Non fragmented */
		BT_DBG("%s nonfrag skb %p len %d", hdev->name, skb, skb->len);

		skb_queue_tail(queue, skb);
	} else {
		/* Fragmented */
		BT_DBG("%s frag %p len %d", hdev->name, skb, skb->len);

		skb_shinfo(skb)->frag_list = NULL;

		__skb_queue_tail(queue, skb);

		do {
			skb = list;
			list = list->next;

			hci_skb_pkt_type(skb) = HCI_ISODATA_PKT;
			flags = hci_iso_flags_pack(list ? ISO_CONT : ISO_END,
						   0x00);
			hci_add_iso_hdr(skb, conn->handle, flags);

			BT_DBG("%s frag %p len %d", hdev->name, skb, skb->len);

			__skb_queue_tail(queue, skb);
		} while (list);
	}
}

void hci_send_iso(struct hci_conn *conn, struct sk_buff *skb)
{
	struct hci_dev *hdev = conn->hdev;

	BT_DBG("%s len %d", hdev->name, skb->len);

	hci_queue_iso(conn, &conn->data_q, skb);

	queue_work(hdev->workqueue, &hdev->tx_work);
}

/* ---- HCI TX task (outgoing data) ---- */

/* HCI Connection scheduler */
static inline void hci_quote_sent(struct hci_conn *conn, int num, int *quote)
{
	struct hci_dev *hdev;
	int cnt, q;

	if (!conn) {
		*quote = 0;
		return;
	}

	hdev = conn->hdev;

	switch (conn->type) {
	case ACL_LINK:
		cnt = hdev->acl_cnt;
		break;
	case SCO_LINK:
	case ESCO_LINK:
		cnt = hdev->sco_cnt;
		break;
	case LE_LINK:
		cnt = hdev->le_mtu ? hdev->le_cnt : hdev->acl_cnt;
		break;
	case ISO_LINK:
		cnt = hdev->iso_mtu ? hdev->iso_cnt :
			hdev->le_mtu ? hdev->le_cnt : hdev->acl_cnt;
		break;
	default:
		cnt = 0;
		bt_dev_err(hdev, "unknown link type %d", conn->type);
	}

	q = cnt / num;
	*quote = q ? q : 1;
}
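
/* Worked example of the quota above: with 8 free ACL buffers (cnt == 8)
 * shared by 3 busy ACL connections (num == 3), each connection gets a
 * quote of 8 / 3 = 2 packets per scheduling pass; when cnt / num rounds
 * down to 0 the quote is clamped to 1 so a connection is never starved
 * outright.
 */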

static struct hci_conn *hci_low_sent(struct hci_dev *hdev, __u8 type,
				     int *quote)
{
	struct hci_conn_hash *h = &hdev->conn_hash;
	struct hci_conn *conn = NULL, *c;
	unsigned int num = 0, min = ~0;

	/* We don't have to lock device here. Connections are always
	 * added and removed with TX task disabled.
	 */

	rcu_read_lock();

	list_for_each_entry_rcu(c, &h->list, list) {
		if (c->type != type || skb_queue_empty(&c->data_q))
			continue;

		if (c->state != BT_CONNECTED && c->state != BT_CONFIG)
			continue;

		num++;

		if (c->sent < min) {
			min  = c->sent;
			conn = c;
		}

		if (hci_conn_num(hdev, type) == num)
			break;
	}

	rcu_read_unlock();

	hci_quote_sent(conn, num, quote);

	BT_DBG("conn %p quote %d", conn, *quote);
	return conn;
}

static void hci_link_tx_to(struct hci_dev *hdev, __u8 type)
{
	struct hci_conn_hash *h = &hdev->conn_hash;
	struct hci_conn *c;

	bt_dev_err(hdev, "link tx timeout");

	rcu_read_lock();

	/* Kill stalled connections */
	list_for_each_entry_rcu(c, &h->list, list) {
		if (c->type == type && c->sent) {
			bt_dev_err(hdev, "killing stalled connection %pMR",
				   &c->dst);
			/* hci_disconnect might sleep, so, we have to release
			 * the RCU read lock before calling it.
			 */
			rcu_read_unlock();
			hci_disconnect(c, HCI_ERROR_REMOTE_USER_TERM);
			rcu_read_lock();
		}
	}

	rcu_read_unlock();
}

static struct hci_chan *hci_chan_sent(struct hci_dev *hdev, __u8 type,
				      int *quote)
{
	struct hci_conn_hash *h = &hdev->conn_hash;
	struct hci_chan *chan = NULL;
	unsigned int num = 0, min = ~0, cur_prio = 0;
	struct hci_conn *conn;
	int conn_num = 0;

	BT_DBG("%s", hdev->name);

	rcu_read_lock();

	list_for_each_entry_rcu(conn, &h->list, list) {
		struct hci_chan *tmp;

		if (conn->type != type)
			continue;

		if (conn->state != BT_CONNECTED && conn->state != BT_CONFIG)
			continue;

		conn_num++;

		list_for_each_entry_rcu(tmp, &conn->chan_list, list) {
			struct sk_buff *skb;

			if (skb_queue_empty(&tmp->data_q))
				continue;

			skb = skb_peek(&tmp->data_q);
			if (skb->priority < cur_prio)
				continue;

			if (skb->priority > cur_prio) {
				num = 0;
				min = ~0;
				cur_prio = skb->priority;
			}

			num++;

			if (conn->sent < min) {
				min  = conn->sent;
				chan = tmp;
			}
		}

		if (hci_conn_num(hdev, type) == conn_num)
			break;
	}

	rcu_read_unlock();

	if (!chan)
		return NULL;

	hci_quote_sent(chan->conn, num, quote);

	BT_DBG("chan %p quote %d", chan, *quote);
	return chan;
}

static void hci_prio_recalculate(struct hci_dev *hdev, __u8 type)
{
	struct hci_conn_hash *h = &hdev->conn_hash;
	struct hci_conn *conn;
	int num = 0;

	BT_DBG("%s", hdev->name);

	rcu_read_lock();

	list_for_each_entry_rcu(conn, &h->list, list) {
		struct hci_chan *chan;

		if (conn->type != type)
			continue;

		if (conn->state != BT_CONNECTED && conn->state != BT_CONFIG)
			continue;

		num++;

		list_for_each_entry_rcu(chan, &conn->chan_list, list) {
			struct sk_buff *skb;

			if (chan->sent) {
				chan->sent = 0;
				continue;
			}

			if (skb_queue_empty(&chan->data_q))
				continue;

			skb = skb_peek(&chan->data_q);
			if (skb->priority >= HCI_PRIO_MAX - 1)
				continue;

			skb->priority = HCI_PRIO_MAX - 1;

			BT_DBG("chan %p skb %p promoted to %d", chan, skb,
			       skb->priority);
		}

		if (hci_conn_num(hdev, type) == num)
			break;
	}

	rcu_read_unlock();
}

static void __check_timeout(struct hci_dev *hdev, unsigned int cnt, u8 type)
{
	unsigned long last_tx;

	if (hci_dev_test_flag(hdev, HCI_UNCONFIGURED))
		return;

	switch (type) {
	case LE_LINK:
		last_tx = hdev->le_last_tx;
		break;
	default:
		last_tx = hdev->acl_last_tx;
		break;
	}

	/* tx timeout must be longer than maximum link supervision timeout
	 * (40.9 seconds)
	 */
	if (!cnt && time_after(jiffies, last_tx + HCI_ACL_TX_TIMEOUT))
		hci_link_tx_to(hdev, type);
}

/* Schedule SCO */
static void hci_sched_sco(struct hci_dev *hdev, __u8 type)
{
	struct hci_conn *conn;
	struct sk_buff *skb;
	int quote, *cnt;
	unsigned int pkts = hdev->sco_pkts;

	bt_dev_dbg(hdev, "type %u", type);

	if (!hci_conn_num(hdev, type) || !pkts)
		return;

	/* Use sco_pkts if flow control has not been enabled which will limit
	 * the amount of buffer sent in a row.
	 */
	if (!hci_dev_test_flag(hdev, HCI_SCO_FLOWCTL))
		cnt = &pkts;
	else
		cnt = &hdev->sco_cnt;

	while (*cnt && (conn = hci_low_sent(hdev, type, &quote))) {
		while (quote-- && (skb = skb_dequeue(&conn->data_q))) {
			BT_DBG("skb %p len %d", skb, skb->len);
			hci_send_conn_frame(hdev, conn, skb);

			conn->sent++;
			if (conn->sent == ~0)
				conn->sent = 0;
			(*cnt)--;
		}
	}

	/* Reschedule if all packets were sent and flow control is not enabled
	 * as there could be more packets queued that could not be sent and
	 * since no HCI_EV_NUM_COMP_PKTS event will be generated the reschedule
	 * needs to be forced.
	 */
	if (!pkts && !hci_dev_test_flag(hdev, HCI_SCO_FLOWCTL))
		queue_work(hdev->workqueue, &hdev->tx_work);
}

static void hci_sched_acl_pkt(struct hci_dev *hdev)
{
	unsigned int cnt = hdev->acl_cnt;
	struct hci_chan *chan;
	struct sk_buff *skb;
	int quote;

	__check_timeout(hdev, cnt, ACL_LINK);

	while (hdev->acl_cnt &&
	       (chan = hci_chan_sent(hdev, ACL_LINK, &quote))) {
		u32 priority = (skb_peek(&chan->data_q))->priority;
		while (quote-- && (skb = skb_peek(&chan->data_q))) {
			BT_DBG("chan %p skb %p len %d priority %u", chan, skb,
			       skb->len, skb->priority);

			/* Stop if priority has changed */
			if (skb->priority < priority)
				break;

			skb = skb_dequeue(&chan->data_q);

			hci_conn_enter_active_mode(chan->conn,
						   bt_cb(skb)->force_active);

			hci_send_conn_frame(hdev, chan->conn, skb);
			hdev->acl_last_tx = jiffies;

			hdev->acl_cnt--;
			chan->sent++;
			chan->conn->sent++;

			/* Send pending SCO packets right away */
			hci_sched_sco(hdev, SCO_LINK);
			hci_sched_sco(hdev, ESCO_LINK);
		}
	}

	if (cnt != hdev->acl_cnt)
		hci_prio_recalculate(hdev, ACL_LINK);
}

static void hci_sched_acl(struct hci_dev *hdev)
{
	BT_DBG("%s", hdev->name);

	/* No ACL link over BR/EDR controller */
	if (!hci_conn_num(hdev, ACL_LINK))
		return;

	hci_sched_acl_pkt(hdev);
}

static void hci_sched_le(struct hci_dev *hdev)
{
	struct hci_chan *chan;
	struct sk_buff *skb;
	int quote, *cnt, tmp;

	BT_DBG("%s", hdev->name);

	if (!hci_conn_num(hdev, LE_LINK))
		return;

	cnt = hdev->le_pkts ? &hdev->le_cnt : &hdev->acl_cnt;

	__check_timeout(hdev, *cnt, LE_LINK);

	tmp = *cnt;
	while (*cnt && (chan = hci_chan_sent(hdev, LE_LINK, &quote))) {
		u32 priority = (skb_peek(&chan->data_q))->priority;
		while (quote-- && (skb = skb_peek(&chan->data_q))) {
			BT_DBG("chan %p skb %p len %d priority %u", chan, skb,
			       skb->len, skb->priority);

			/* Stop if priority has changed */
			if (skb->priority < priority)
				break;

			skb = skb_dequeue(&chan->data_q);

			hci_send_conn_frame(hdev, chan->conn, skb);
			hdev->le_last_tx = jiffies;

			(*cnt)--;
			chan->sent++;
			chan->conn->sent++;

			/* Send pending SCO packets right away */
			hci_sched_sco(hdev, SCO_LINK);
			hci_sched_sco(hdev, ESCO_LINK);
		}
	}

	if (*cnt != tmp)
		hci_prio_recalculate(hdev, LE_LINK);
}
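
/* Scheduling order, as driven by hci_tx_work() below: SCO and eSCO
 * first (isochronous voice cannot wait), then ISO, ACL and LE. The ACL
 * and LE inner loops additionally call hci_sched_sco() after every
 * frame, so queued voice packets are never stuck behind bulk data.
 */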

/* Schedule CIS */
static void hci_sched_iso(struct hci_dev *hdev)
{
	struct hci_conn *conn;
	struct sk_buff *skb;
	int quote, *cnt;

	BT_DBG("%s", hdev->name);

	if (!hci_conn_num(hdev, ISO_LINK))
		return;

	cnt = hdev->iso_pkts ? &hdev->iso_cnt :
		hdev->le_pkts ? &hdev->le_cnt : &hdev->acl_cnt;
	while (*cnt && (conn = hci_low_sent(hdev, ISO_LINK, &quote))) {
		while (quote-- && (skb = skb_dequeue(&conn->data_q))) {
			BT_DBG("skb %p len %d", skb, skb->len);
			hci_send_conn_frame(hdev, conn, skb);

			conn->sent++;
			if (conn->sent == ~0)
				conn->sent = 0;
			(*cnt)--;
		}
	}
}

static void hci_tx_work(struct work_struct *work)
{
	struct hci_dev *hdev = container_of(work, struct hci_dev, tx_work);
	struct sk_buff *skb;

	BT_DBG("%s acl %d sco %d le %d iso %d", hdev->name, hdev->acl_cnt,
	       hdev->sco_cnt, hdev->le_cnt, hdev->iso_cnt);

	if (!hci_dev_test_flag(hdev, HCI_USER_CHANNEL)) {
		/* Schedule queues and send stuff to HCI driver */
		hci_sched_sco(hdev, SCO_LINK);
		hci_sched_sco(hdev, ESCO_LINK);
		hci_sched_iso(hdev);
		hci_sched_acl(hdev);
		hci_sched_le(hdev);
	}

	/* Send next queued raw (unknown type) packet */
	while ((skb = skb_dequeue(&hdev->raw_q)))
		hci_send_frame(hdev, skb);
}

/* ----- HCI RX task (incoming data processing) ----- */

/* ACL data packet */
static void hci_acldata_packet(struct hci_dev *hdev, struct sk_buff *skb)
{
	struct hci_acl_hdr *hdr;
	struct hci_conn *conn;
	__u16 handle, flags;

	hdr = skb_pull_data(skb, sizeof(*hdr));
	if (!hdr) {
		bt_dev_err(hdev, "ACL packet too small");
		goto drop;
	}

	handle = __le16_to_cpu(hdr->handle);
	flags  = hci_flags(handle);
	handle = hci_handle(handle);

	bt_dev_dbg(hdev, "len %d handle 0x%4.4x flags 0x%4.4x", skb->len,
		   handle, flags);

	hdev->stat.acl_rx++;

	hci_dev_lock(hdev);
	conn = hci_conn_hash_lookup_handle(hdev, handle);
	hci_dev_unlock(hdev);

	if (conn) {
		hci_conn_enter_active_mode(conn, BT_POWER_FORCE_ACTIVE_OFF);

		/* Send to upper protocol */
		l2cap_recv_acldata(conn, skb, flags);
		return;
	} else {
		bt_dev_err(hdev, "ACL packet for unknown connection handle %d",
			   handle);
	}

drop:
	kfree_skb(skb);
}

/* SCO data packet */
static void hci_scodata_packet(struct hci_dev *hdev, struct sk_buff *skb)
{
	struct hci_sco_hdr *hdr;
	struct hci_conn *conn;
	__u16 handle, flags;

	hdr = skb_pull_data(skb, sizeof(*hdr));
	if (!hdr) {
		bt_dev_err(hdev, "SCO packet too small");
		goto drop;
	}

	handle = __le16_to_cpu(hdr->handle);
	flags  = hci_flags(handle);
	handle = hci_handle(handle);

	bt_dev_dbg(hdev, "len %d handle 0x%4.4x flags 0x%4.4x", skb->len,
		   handle, flags);

	hdev->stat.sco_rx++;

	hci_dev_lock(hdev);
	conn = hci_conn_hash_lookup_handle(hdev, handle);
	hci_dev_unlock(hdev);

	if (conn) {
		/* Send to upper protocol */
		hci_skb_pkt_status(skb) = flags & 0x03;
		sco_recv_scodata(conn, skb);
		return;
	} else {
		bt_dev_err_ratelimited(hdev,
				       "SCO packet for unknown connection handle %d",
				       handle);
	}

drop:
	kfree_skb(skb);
}

static void hci_isodata_packet(struct hci_dev *hdev, struct sk_buff *skb)
{
	struct hci_iso_hdr *hdr;
	struct hci_conn *conn;
	__u16 handle, flags;

	hdr = skb_pull_data(skb, sizeof(*hdr));
	if (!hdr) {
		bt_dev_err(hdev, "ISO packet too small");
		goto drop;
	}

	handle = __le16_to_cpu(hdr->handle);
	flags  = hci_flags(handle);
	handle = hci_handle(handle);

	bt_dev_dbg(hdev, "len %d handle 0x%4.4x flags 0x%4.4x", skb->len,
		   handle, flags);

	hci_dev_lock(hdev);
	conn = hci_conn_hash_lookup_handle(hdev, handle);
	hci_dev_unlock(hdev);

	if (!conn) {
		bt_dev_err(hdev, "ISO packet for unknown connection handle %d",
			   handle);
		goto drop;
	}

	/* Send to upper protocol */
	iso_recv(conn, skb, flags);
	return;

drop:
	kfree_skb(skb);
}

static bool hci_req_is_complete(struct hci_dev *hdev)
{
	struct sk_buff *skb;

	skb = skb_peek(&hdev->cmd_q);
	if (!skb)
		return true;

	return (bt_cb(skb)->hci.req_flags & HCI_REQ_START);
}

static void hci_resend_last(struct hci_dev *hdev)
{
	struct hci_command_hdr *sent;
	struct sk_buff *skb;
	u16 opcode;

	if (!hdev->sent_cmd)
		return;

	sent = (void *)hdev->sent_cmd->data;
	opcode = __le16_to_cpu(sent->opcode);
	if (opcode == HCI_OP_RESET)
		return;

	skb = skb_clone(hdev->sent_cmd, GFP_KERNEL);
	if (!skb)
		return;

	skb_queue_head(&hdev->cmd_q, skb);
	queue_work(hdev->workqueue, &hdev->cmd_work);
}

void hci_req_cmd_complete(struct hci_dev *hdev, u16 opcode, u8 status,
			  hci_req_complete_t *req_complete,
			  hci_req_complete_skb_t *req_complete_skb)
{
	struct sk_buff *skb;
	unsigned long flags;

	BT_DBG("opcode 0x%04x status 0x%02x", opcode, status);

	/* If the completed command doesn't match the last one that was
	 * sent we need to do special handling of it.
	 */
	if (!hci_sent_cmd_data(hdev, opcode)) {
		/* Some CSR based controllers generate a spontaneous
		 * reset complete event during init and any pending
		 * command will never be completed. In such a case we
		 * need to resend whatever was the last sent
		 * command.
		 */
		if (test_bit(HCI_INIT, &hdev->flags) && opcode == HCI_OP_RESET)
			hci_resend_last(hdev);

		return;
	}

	/* If we reach this point this event matches the last command sent */
	hci_dev_clear_flag(hdev, HCI_CMD_PENDING);

	/* If the command succeeded and there's still more commands in
	 * this request the request is not yet complete.
	 */
	if (!status && !hci_req_is_complete(hdev))
		return;

	skb = hdev->req_skb;

	/* If this was the last command in a request the complete
	 * callback would be found in hdev->req_skb instead of the
	 * command queue (hdev->cmd_q).
	 */
	if (skb && bt_cb(skb)->hci.req_flags & HCI_REQ_SKB) {
		*req_complete_skb = bt_cb(skb)->hci.req_complete_skb;
		return;
	}

	if (skb && bt_cb(skb)->hci.req_complete) {
		*req_complete = bt_cb(skb)->hci.req_complete;
		return;
	}

	/* Remove all pending commands belonging to this request */
	spin_lock_irqsave(&hdev->cmd_q.lock, flags);
	while ((skb = __skb_dequeue(&hdev->cmd_q))) {
		if (bt_cb(skb)->hci.req_flags & HCI_REQ_START) {
			__skb_queue_head(&hdev->cmd_q, skb);
			break;
		}

		if (bt_cb(skb)->hci.req_flags & HCI_REQ_SKB)
			*req_complete_skb = bt_cb(skb)->hci.req_complete_skb;
		else
			*req_complete = bt_cb(skb)->hci.req_complete;
		dev_kfree_skb_irq(skb);
	}
	spin_unlock_irqrestore(&hdev->cmd_q.lock, flags);
}
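
/* Matching logic above, in short: hci_sent_cmd_data() checks the
 * completed opcode against both hdev->sent_cmd (the very last command
 * on the wire) and hdev->req_skb (the last command of a synchronous
 * request), so a completion is only treated as ours when one of the
 * two agrees; everything else is a spurious or out-of-order event.
 */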
*/ switch (hci_skb_pkt_type(skb)) { case HCI_ACLDATA_PKT: case HCI_SCODATA_PKT: case HCI_ISODATA_PKT: kfree_skb(skb); continue; } } /* Process frame */ switch (hci_skb_pkt_type(skb)) { case HCI_EVENT_PKT: BT_DBG("%s Event packet", hdev->name); hci_event_packet(hdev, skb); break; case HCI_ACLDATA_PKT: BT_DBG("%s ACL data packet", hdev->name); hci_acldata_packet(hdev, skb); break; case HCI_SCODATA_PKT: BT_DBG("%s SCO data packet", hdev->name); hci_scodata_packet(hdev, skb); break; case HCI_ISODATA_PKT: BT_DBG("%s ISO data packet", hdev->name); hci_isodata_packet(hdev, skb); break; default: kfree_skb(skb); break; } } } static void hci_send_cmd_sync(struct hci_dev *hdev, struct sk_buff *skb) { int err; bt_dev_dbg(hdev, "skb %p", skb); kfree_skb(hdev->sent_cmd); hdev->sent_cmd = skb_clone(skb, GFP_KERNEL); if (!hdev->sent_cmd) { skb_queue_head(&hdev->cmd_q, skb); queue_work(hdev->workqueue, &hdev->cmd_work); return; } err = hci_send_frame(hdev, skb); if (err < 0) { hci_cmd_sync_cancel_sync(hdev, -err); return; } if (hdev->req_status == HCI_REQ_PEND && !hci_dev_test_and_set_flag(hdev, HCI_CMD_PENDING)) { kfree_skb(hdev->req_skb); hdev->req_skb = skb_clone(hdev->sent_cmd, GFP_KERNEL); } atomic_dec(&hdev->cmd_cnt); } static void hci_cmd_work(struct work_struct *work) { struct hci_dev *hdev = container_of(work, struct hci_dev, cmd_work); struct sk_buff *skb; BT_DBG("%s cmd_cnt %d cmd queued %d", hdev->name, atomic_read(&hdev->cmd_cnt), skb_queue_len(&hdev->cmd_q)); /* Send queued commands */ if (atomic_read(&hdev->cmd_cnt)) { skb = skb_dequeue(&hdev->cmd_q); if (!skb) return; hci_send_cmd_sync(hdev, skb); rcu_read_lock(); if (test_bit(HCI_RESET, &hdev->flags) || hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE)) cancel_delayed_work(&hdev->cmd_timer); else queue_delayed_work(hdev->workqueue, &hdev->cmd_timer, HCI_CMD_TIMEOUT); rcu_read_unlock(); } }
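/*
 * Editor's note (illustrative sketch, not part of hci_core.c): each RX
 * handler above splits the first little-endian 16-bit word of a data
 * packet into a 12-bit connection handle and a 4-bit flags field via
 * the hci_handle()/hci_flags() helpers. A minimal self-contained
 * illustration of that packing, assuming the usual definitions
 * hci_handle(h) == ((h) & 0x0fff) and hci_flags(h) == ((h) >> 12):
 */
#include <assert.h>
#include <stdint.h>

#define hci_handle(h) ((h) & 0x0fff)
#define hci_flags(h)  ((h) >> 12)

int main(void)
{
	/* flags 0x2 (e.g. a packet-boundary flag) and handle 0x0042 */
	uint16_t word = (uint16_t)((0x2 << 12) | 0x0042);

	assert(hci_handle(word) == 0x0042);
	assert(hci_flags(word) == 0x2);
	return 0;
}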
// SPDX-License-Identifier: GPL-2.0-only /* * HT handling * * Copyright 2003, Jouni Malinen <jkmaline@cc.hut.fi> * Copyright 2002-2005, Instant802 Networks, Inc. * Copyright 2005-2006, Devicescape Software, Inc. 
* Copyright 2006-2007 Jiri Benc <jbenc@suse.cz> * Copyright 2007, Michael Wu <flamingice@sourmilk.net> * Copyright 2007-2010, Intel Corporation * Copyright 2017 Intel Deutschland GmbH * Copyright(c) 2020-2024 Intel Corporation */ #include <linux/ieee80211.h> #include <linux/export.h> #include <net/mac80211.h> #include "ieee80211_i.h" #include "rate.h" static void __check_htcap_disable(struct ieee80211_ht_cap *ht_capa, struct ieee80211_ht_cap *ht_capa_mask, struct ieee80211_sta_ht_cap *ht_cap, u16 flag) { __le16 le_flag = cpu_to_le16(flag); if (ht_capa_mask->cap_info & le_flag) { if (!(ht_capa->cap_info & le_flag)) ht_cap->cap &= ~flag; } } static void __check_htcap_enable(struct ieee80211_ht_cap *ht_capa, struct ieee80211_ht_cap *ht_capa_mask, struct ieee80211_sta_ht_cap *ht_cap, u16 flag) { __le16 le_flag = cpu_to_le16(flag); if ((ht_capa_mask->cap_info & le_flag) && (ht_capa->cap_info & le_flag)) ht_cap->cap |= flag; } void ieee80211_apply_htcap_overrides(struct ieee80211_sub_if_data *sdata, struct ieee80211_sta_ht_cap *ht_cap) { struct ieee80211_ht_cap *ht_capa, *ht_capa_mask; u8 *scaps, *smask; int i; if (!ht_cap->ht_supported) return; switch (sdata->vif.type) { case NL80211_IFTYPE_STATION: ht_capa = &sdata->u.mgd.ht_capa; ht_capa_mask = &sdata->u.mgd.ht_capa_mask; break; case NL80211_IFTYPE_ADHOC: ht_capa = &sdata->u.ibss.ht_capa; ht_capa_mask = &sdata->u.ibss.ht_capa_mask; break; default: WARN_ON_ONCE(1); return; } scaps = (u8 *)(&ht_capa->mcs.rx_mask); smask = (u8 *)(&ht_capa_mask->mcs.rx_mask); /* NOTE: If you add more over-rides here, update register_hw * ht_capa_mod_mask logic in main.c as well. * And, if this method can ever change ht_cap.ht_supported, fix * the check in ieee80211_add_ht_ie. */ /* check for HT over-rides, MCS rates first. */ for (i = 0; i < IEEE80211_HT_MCS_MASK_LEN; i++) { u8 m = smask[i]; ht_cap->mcs.rx_mask[i] &= ~m; /* turn off all masked bits */ /* Add back rates that are supported */ ht_cap->mcs.rx_mask[i] |= (m & scaps[i]); } /* Force removal of HT-40 capabilities? */ __check_htcap_disable(ht_capa, ht_capa_mask, ht_cap, IEEE80211_HT_CAP_SUP_WIDTH_20_40); __check_htcap_disable(ht_capa, ht_capa_mask, ht_cap, IEEE80211_HT_CAP_SGI_40); /* Allow user to disable SGI-20 (SGI-40 is handled above) */ __check_htcap_disable(ht_capa, ht_capa_mask, ht_cap, IEEE80211_HT_CAP_SGI_20); /* Allow user to disable the max-AMSDU bit. */ __check_htcap_disable(ht_capa, ht_capa_mask, ht_cap, IEEE80211_HT_CAP_MAX_AMSDU); /* Allow user to disable LDPC */ __check_htcap_disable(ht_capa, ht_capa_mask, ht_cap, IEEE80211_HT_CAP_LDPC_CODING); /* Allow user to enable 40 MHz intolerant bit. */ __check_htcap_enable(ht_capa, ht_capa_mask, ht_cap, IEEE80211_HT_CAP_40MHZ_INTOLERANT); /* Allow user to enable TX STBC bit */ __check_htcap_enable(ht_capa, ht_capa_mask, ht_cap, IEEE80211_HT_CAP_TX_STBC); /* Allow user to configure RX STBC bits */ if (ht_capa_mask->cap_info & cpu_to_le16(IEEE80211_HT_CAP_RX_STBC)) ht_cap->cap |= le16_to_cpu(ht_capa->cap_info) & IEEE80211_HT_CAP_RX_STBC; /* Allow user to decrease AMPDU factor */ if (ht_capa_mask->ampdu_params_info & IEEE80211_HT_AMPDU_PARM_FACTOR) { u8 n = ht_capa->ampdu_params_info & IEEE80211_HT_AMPDU_PARM_FACTOR; if (n < ht_cap->ampdu_factor) ht_cap->ampdu_factor = n; } /* Allow the user to increase AMPDU density. 
*/ if (ht_capa_mask->ampdu_params_info & IEEE80211_HT_AMPDU_PARM_DENSITY) { u8 n = (ht_capa->ampdu_params_info & IEEE80211_HT_AMPDU_PARM_DENSITY) >> IEEE80211_HT_AMPDU_PARM_DENSITY_SHIFT; if (n > ht_cap->ampdu_density) ht_cap->ampdu_density = n; } } bool ieee80211_ht_cap_ie_to_sta_ht_cap(struct ieee80211_sub_if_data *sdata, struct ieee80211_supported_band *sband, const struct ieee80211_ht_cap *ht_cap_ie, struct link_sta_info *link_sta) { struct ieee80211_bss_conf *link_conf; struct sta_info *sta = link_sta->sta; struct ieee80211_sta_ht_cap ht_cap, own_cap; u8 ampdu_info, tx_mcs_set_cap; int i, max_tx_streams; bool changed; enum ieee80211_sta_rx_bandwidth bw; enum nl80211_chan_width width; memset(&ht_cap, 0, sizeof(ht_cap)); if (!ht_cap_ie || !sband->ht_cap.ht_supported) goto apply; ht_cap.ht_supported = true; own_cap = sband->ht_cap; /* * If the user has specified capability over-rides, take them into * account if the station we're setting up is the AP or a TDLS peer * that we advertised a restricted capability set to. Override * our own capabilities and then use those below. */ if (sdata->vif.type == NL80211_IFTYPE_STATION || sdata->vif.type == NL80211_IFTYPE_ADHOC) ieee80211_apply_htcap_overrides(sdata, &own_cap); /* * The bits listed in this expression should be * the same for the peer and us; if the station * advertises more than we do, we can't use those * bits, so we mask them out. */ ht_cap.cap = le16_to_cpu(ht_cap_ie->cap_info) & (own_cap.cap | ~(IEEE80211_HT_CAP_LDPC_CODING | IEEE80211_HT_CAP_SUP_WIDTH_20_40 | IEEE80211_HT_CAP_GRN_FLD | IEEE80211_HT_CAP_SGI_20 | IEEE80211_HT_CAP_SGI_40 | IEEE80211_HT_CAP_DSSSCCK40)); /* * The STBC bits are asymmetric -- if we don't have * TX then mask out the peer's RX and vice versa. */ if (!(own_cap.cap & IEEE80211_HT_CAP_TX_STBC)) ht_cap.cap &= ~IEEE80211_HT_CAP_RX_STBC; if (!(own_cap.cap & IEEE80211_HT_CAP_RX_STBC)) ht_cap.cap &= ~IEEE80211_HT_CAP_TX_STBC; ampdu_info = ht_cap_ie->ampdu_params_info; ht_cap.ampdu_factor = ampdu_info & IEEE80211_HT_AMPDU_PARM_FACTOR; ht_cap.ampdu_density = (ampdu_info & IEEE80211_HT_AMPDU_PARM_DENSITY) >> 2; /* own MCS TX capabilities */ tx_mcs_set_cap = own_cap.mcs.tx_params; /* Copy the peer's MCS TX capabilities; the driver might need them. */ ht_cap.mcs.tx_params = ht_cap_ie->mcs.tx_params; /* can we TX with MCS rates? 
*/ if (!(tx_mcs_set_cap & IEEE80211_HT_MCS_TX_DEFINED)) goto apply; /* Counting from 0, therefore +1 */ if (tx_mcs_set_cap & IEEE80211_HT_MCS_TX_RX_DIFF) max_tx_streams = ((tx_mcs_set_cap & IEEE80211_HT_MCS_TX_MAX_STREAMS_MASK) >> IEEE80211_HT_MCS_TX_MAX_STREAMS_SHIFT) + 1; else max_tx_streams = IEEE80211_HT_MCS_TX_MAX_STREAMS; /* * 802.11n-2009 20.3.5 / 20.6 says: * - indices 0 to 7 and 32 are single spatial stream * - 8 to 31 are multiple spatial streams using equal modulation * [8..15 for two streams, 16..23 for three and 24..31 for four] * - remainder are multiple spatial streams using unequal modulation */ for (i = 0; i < max_tx_streams; i++) ht_cap.mcs.rx_mask[i] = own_cap.mcs.rx_mask[i] & ht_cap_ie->mcs.rx_mask[i]; if (tx_mcs_set_cap & IEEE80211_HT_MCS_TX_UNEQUAL_MODULATION) for (i = IEEE80211_HT_MCS_UNEQUAL_MODULATION_START_BYTE; i < IEEE80211_HT_MCS_MASK_LEN; i++) ht_cap.mcs.rx_mask[i] = own_cap.mcs.rx_mask[i] & ht_cap_ie->mcs.rx_mask[i]; /* handle MCS rate 32 too */ if (own_cap.mcs.rx_mask[32/8] & ht_cap_ie->mcs.rx_mask[32/8] & 1) ht_cap.mcs.rx_mask[32/8] |= 1; /* set Rx highest rate */ ht_cap.mcs.rx_highest = ht_cap_ie->mcs.rx_highest; if (ht_cap.cap & IEEE80211_HT_CAP_MAX_AMSDU) link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_7935; else link_sta->pub->agg.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_HT_3839; ieee80211_sta_recalc_aggregates(&sta->sta); apply: changed = memcmp(&link_sta->pub->ht_cap, &ht_cap, sizeof(ht_cap)); memcpy(&link_sta->pub->ht_cap, &ht_cap, sizeof(ht_cap)); rcu_read_lock(); link_conf = rcu_dereference(sdata->vif.link_conf[link_sta->link_id]); if (WARN_ON(!link_conf)) width = NL80211_CHAN_WIDTH_20_NOHT; else width = link_conf->chanreq.oper.width; switch (width) { default: WARN_ON_ONCE(1); fallthrough; case NL80211_CHAN_WIDTH_20_NOHT: case NL80211_CHAN_WIDTH_20: bw = IEEE80211_STA_RX_BW_20; break; case NL80211_CHAN_WIDTH_40: case NL80211_CHAN_WIDTH_80: case NL80211_CHAN_WIDTH_80P80: case NL80211_CHAN_WIDTH_160: case NL80211_CHAN_WIDTH_320: bw = ht_cap.cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40 ? IEEE80211_STA_RX_BW_40 : IEEE80211_STA_RX_BW_20; break; } rcu_read_unlock(); link_sta->pub->bandwidth = bw; link_sta->cur_max_bandwidth = ht_cap.cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40 ? 
IEEE80211_STA_RX_BW_40 : IEEE80211_STA_RX_BW_20; if (sta->sdata->vif.type == NL80211_IFTYPE_AP || sta->sdata->vif.type == NL80211_IFTYPE_AP_VLAN) { enum ieee80211_smps_mode smps_mode; switch ((ht_cap.cap & IEEE80211_HT_CAP_SM_PS) >> IEEE80211_HT_CAP_SM_PS_SHIFT) { case WLAN_HT_CAP_SM_PS_INVALID: case WLAN_HT_CAP_SM_PS_STATIC: smps_mode = IEEE80211_SMPS_STATIC; break; case WLAN_HT_CAP_SM_PS_DYNAMIC: smps_mode = IEEE80211_SMPS_DYNAMIC; break; case WLAN_HT_CAP_SM_PS_DISABLED: smps_mode = IEEE80211_SMPS_OFF; break; } if (smps_mode != link_sta->pub->smps_mode) changed = true; link_sta->pub->smps_mode = smps_mode; } else { link_sta->pub->smps_mode = IEEE80211_SMPS_OFF; } return changed; } void ieee80211_sta_tear_down_BA_sessions(struct sta_info *sta, enum ieee80211_agg_stop_reason reason) { int i; lockdep_assert_wiphy(sta->local->hw.wiphy); for (i = 0; i < IEEE80211_NUM_TIDS; i++) __ieee80211_stop_rx_ba_session(sta, i, WLAN_BACK_RECIPIENT, WLAN_REASON_QSTA_LEAVE_QBSS, reason != AGG_STOP_DESTROY_STA && reason != AGG_STOP_PEER_REQUEST); for (i = 0; i < IEEE80211_NUM_TIDS; i++) __ieee80211_stop_tx_ba_session(sta, i, reason); /* * In case the teardown is part of a reconfiguration due to a HW * restart request, it is possible that the low-level driver requested * to stop the BA session, so handle it to properly clean up the * tid_tx data. */ if (reason == AGG_STOP_DESTROY_STA) { wiphy_work_cancel(sta->local->hw.wiphy, &sta->ampdu_mlme.work); for (i = 0; i < IEEE80211_NUM_TIDS; i++) { struct tid_ampdu_tx *tid_tx = rcu_dereference_protected_tid_tx(sta, i); if (!tid_tx) continue; if (test_and_clear_bit(HT_AGG_STATE_STOP_CB, &tid_tx->state)) ieee80211_stop_tx_ba_cb(sta, i, tid_tx); } } } void ieee80211_ba_session_work(struct wiphy *wiphy, struct wiphy_work *work) { struct sta_info *sta = container_of(work, struct sta_info, ampdu_mlme.work); struct tid_ampdu_tx *tid_tx; bool blocked; int tid; lockdep_assert_wiphy(sta->local->hw.wiphy); /* When this flag is set, new sessions should be blocked. */ blocked = test_sta_flag(sta, WLAN_STA_BLOCK_BA); for (tid = 0; tid < IEEE80211_NUM_TIDS; tid++) { if (test_and_clear_bit(tid, sta->ampdu_mlme.tid_rx_timer_expired)) __ieee80211_stop_rx_ba_session( sta, tid, WLAN_BACK_RECIPIENT, WLAN_REASON_QSTA_TIMEOUT, true); if (test_and_clear_bit(tid, sta->ampdu_mlme.tid_rx_stop_requested)) __ieee80211_stop_rx_ba_session( sta, tid, WLAN_BACK_RECIPIENT, WLAN_REASON_UNSPECIFIED, true); if (!blocked && test_and_clear_bit(tid, sta->ampdu_mlme.tid_rx_manage_offl)) __ieee80211_start_rx_ba_session(sta, 0, 0, 0, 1, tid, IEEE80211_MAX_AMPDU_BUF_HT, false, true, 0); if (test_and_clear_bit(tid + IEEE80211_NUM_TIDS, sta->ampdu_mlme.tid_rx_manage_offl)) __ieee80211_stop_rx_ba_session( sta, tid, WLAN_BACK_RECIPIENT, 0, false); spin_lock_bh(&sta->lock); tid_tx = sta->ampdu_mlme.tid_start_tx[tid]; if (!blocked && tid_tx) { struct txq_info *txqi = to_txq_info(sta->sta.txq[tid]); struct ieee80211_sub_if_data *sdata = vif_to_sdata(txqi->txq.vif); struct fq *fq = &sdata->local->fq; spin_lock_bh(&fq->lock); /* Allow only frags to be dequeued */ set_bit(IEEE80211_TXQ_STOP, &txqi->flags); if (!skb_queue_empty(&txqi->frags)) { /* Fragmented Tx is ongoing, wait for it to * finish. Reschedule the worker to retry later. 
*/ spin_unlock_bh(&fq->lock); spin_unlock_bh(&sta->lock); /* Give the task working on the txq a chance * to send out the queued frags */ synchronize_net(); wiphy_work_queue(sdata->local->hw.wiphy, work); return; } spin_unlock_bh(&fq->lock); /* * Assign it over to the normal tid_tx array * where it "goes live". */ sta->ampdu_mlme.tid_start_tx[tid] = NULL; /* could there be a race? */ if (sta->ampdu_mlme.tid_tx[tid]) kfree(tid_tx); else ieee80211_assign_tid_tx(sta, tid, tid_tx); spin_unlock_bh(&sta->lock); ieee80211_tx_ba_session_handle_start(sta, tid); continue; } spin_unlock_bh(&sta->lock); tid_tx = rcu_dereference_protected_tid_tx(sta, tid); if (!tid_tx) continue; if (!blocked && test_and_clear_bit(HT_AGG_STATE_START_CB, &tid_tx->state)) ieee80211_start_tx_ba_cb(sta, tid, tid_tx); if (test_and_clear_bit(HT_AGG_STATE_WANT_STOP, &tid_tx->state)) __ieee80211_stop_tx_ba_session(sta, tid, AGG_STOP_LOCAL_REQUEST); if (test_and_clear_bit(HT_AGG_STATE_STOP_CB, &tid_tx->state)) ieee80211_stop_tx_ba_cb(sta, tid, tid_tx); } } void ieee80211_send_delba(struct ieee80211_sub_if_data *sdata, const u8 *da, u16 tid, u16 initiator, u16 reason_code) { struct ieee80211_local *local = sdata->local; struct sk_buff *skb; struct ieee80211_mgmt *mgmt; u16 params; skb = dev_alloc_skb(sizeof(*mgmt) + local->hw.extra_tx_headroom); if (!skb) return; skb_reserve(skb, local->hw.extra_tx_headroom); mgmt = ieee80211_mgmt_ba(skb, da, sdata); skb_put(skb, 1 + sizeof(mgmt->u.action.u.delba)); mgmt->u.action.category = WLAN_CATEGORY_BACK; mgmt->u.action.u.delba.action_code = WLAN_ACTION_DELBA; params = (u16)(initiator << 11); /* bit 11 initiator */ params |= (u16)(tid << 12); /* bit 15:12 TID number */ mgmt->u.action.u.delba.params = cpu_to_le16(params); mgmt->u.action.u.delba.reason_code = cpu_to_le16(reason_code); ieee80211_tx_skb(sdata, skb); } void ieee80211_process_delba(struct ieee80211_sub_if_data *sdata, struct sta_info *sta, struct ieee80211_mgmt *mgmt, size_t len) { u16 tid, params; u16 initiator; params = le16_to_cpu(mgmt->u.action.u.delba.params); tid = (params & IEEE80211_DELBA_PARAM_TID_MASK) >> 12; initiator = (params & IEEE80211_DELBA_PARAM_INITIATOR_MASK) >> 11; ht_dbg_ratelimited(sdata, "delba from %pM (%s) tid %d reason code %d\n", mgmt->sa, initiator ? "initiator" : "recipient", tid, le16_to_cpu(mgmt->u.action.u.delba.reason_code)); if (initiator == WLAN_BACK_INITIATOR) __ieee80211_stop_rx_ba_session(sta, tid, WLAN_BACK_INITIATOR, 0, true); else __ieee80211_stop_tx_ba_session(sta, tid, AGG_STOP_PEER_REQUEST); } enum nl80211_smps_mode ieee80211_smps_mode_to_smps_mode(enum ieee80211_smps_mode smps) { switch (smps) { case IEEE80211_SMPS_OFF: return NL80211_SMPS_OFF; case IEEE80211_SMPS_STATIC: return NL80211_SMPS_STATIC; case IEEE80211_SMPS_DYNAMIC: return NL80211_SMPS_DYNAMIC; default: return NL80211_SMPS_OFF; } } int ieee80211_send_smps_action(struct ieee80211_sub_if_data *sdata, enum ieee80211_smps_mode smps, const u8 *da, const u8 *bssid, int link_id) { struct ieee80211_local *local = sdata->local; struct sk_buff *skb; struct ieee80211_mgmt *action_frame; struct ieee80211_tx_info *info; u8 status_link_id = link_id < 0 ? 
0 : link_id; /* 27 = header + category + action + smps mode */ skb = dev_alloc_skb(27 + local->hw.extra_tx_headroom); if (!skb) return -ENOMEM; skb_reserve(skb, local->hw.extra_tx_headroom); action_frame = skb_put(skb, 27); memcpy(action_frame->da, da, ETH_ALEN); memcpy(action_frame->sa, sdata->dev->dev_addr, ETH_ALEN); memcpy(action_frame->bssid, bssid, ETH_ALEN); action_frame->frame_control = cpu_to_le16(IEEE80211_FTYPE_MGMT | IEEE80211_STYPE_ACTION); action_frame->u.action.category = WLAN_CATEGORY_HT; action_frame->u.action.u.ht_smps.action = WLAN_HT_ACTION_SMPS; switch (smps) { case IEEE80211_SMPS_AUTOMATIC: case IEEE80211_SMPS_NUM_MODES: WARN_ON(1); smps = IEEE80211_SMPS_OFF; fallthrough; case IEEE80211_SMPS_OFF: action_frame->u.action.u.ht_smps.smps_control = WLAN_HT_SMPS_CONTROL_DISABLED; break; case IEEE80211_SMPS_STATIC: action_frame->u.action.u.ht_smps.smps_control = WLAN_HT_SMPS_CONTROL_STATIC; break; case IEEE80211_SMPS_DYNAMIC: action_frame->u.action.u.ht_smps.smps_control = WLAN_HT_SMPS_CONTROL_DYNAMIC; break; } /* we'll do more based on the TX status of this frame */ info = IEEE80211_SKB_CB(skb); info->flags |= IEEE80211_TX_CTL_REQ_TX_STATUS; /* we have 13 bits, and need 6: link_id 4, smps 2 */ info->status_data = IEEE80211_STATUS_TYPE_SMPS | u16_encode_bits(status_link_id << 2 | smps, IEEE80211_STATUS_SUBDATA_MASK); ieee80211_tx_skb_tid(sdata, skb, 7, link_id); return 0; } void ieee80211_request_smps(struct ieee80211_vif *vif, unsigned int link_id, enum ieee80211_smps_mode smps_mode) { struct ieee80211_sub_if_data *sdata = vif_to_sdata(vif); struct ieee80211_link_data *link; if (WARN_ON_ONCE(vif->type != NL80211_IFTYPE_STATION)) return; rcu_read_lock(); link = rcu_dereference(sdata->link[link_id]); if (WARN_ON(!link)) goto out; trace_api_request_smps(sdata->local, sdata, link, smps_mode); if (link->u.mgd.driver_smps_mode == smps_mode) goto out; link->u.mgd.driver_smps_mode = smps_mode; wiphy_work_queue(sdata->local->hw.wiphy, &link->u.mgd.request_smps_work); out: rcu_read_unlock(); } /* this might change ... don't want non-open drivers using it */ EXPORT_SYMBOL_GPL(ieee80211_request_smps);
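/*
 * Editor's note (illustrative sketch, not part of ht.c): per
 * ieee80211_send_delba() above, the DELBA parameters carry the
 * initiator bit at bit 11 and the TID in bits 15:12, and
 * ieee80211_process_delba() recovers them through
 * IEEE80211_DELBA_PARAM_INITIATOR_MASK and
 * IEEE80211_DELBA_PARAM_TID_MASK. A minimal round-trip check, assuming
 * the mask values 0x0800 and 0xF000 from linux/ieee80211.h:
 */
#include <assert.h>
#include <stdint.h>

#define IEEE80211_DELBA_PARAM_INITIATOR_MASK 0x0800
#define IEEE80211_DELBA_PARAM_TID_MASK       0xF000

int main(void)
{
	uint16_t tid = 5, initiator = 1;
	uint16_t params;

	params  = (uint16_t)(initiator << 11); /* bit 11: initiator */
	params |= (uint16_t)(tid << 12);       /* bits 15:12: TID */

	assert(((params & IEEE80211_DELBA_PARAM_TID_MASK) >> 12) == tid);
	assert(((params & IEEE80211_DELBA_PARAM_INITIATOR_MASK) >> 11) ==
	       initiator);
	return 0;
}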
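/*
 * Editor's note (illustrative sketch, not part of ht.c): the 802.11n
 * comment in ieee80211_ht_cap_ie_to_sta_ht_cap() above describes the HT
 * MCS layout that the rx_mask bitmap encodes: one bit per MCS index, so
 * byte i of rx_mask covers indices 8*i..8*i+7, and MCS 32 sits in bit 0
 * of byte 32/8, which is exactly what the "handle MCS rate 32 too"
 * check tests. A minimal sketch of that indexing:
 */
#include <assert.h>
#include <stdint.h>

#define IEEE80211_HT_MCS_MASK_LEN 10

/* Test whether a given MCS index is set in an HT MCS rx_mask bitmap. */
static int mcs_supported(const uint8_t rx_mask[IEEE80211_HT_MCS_MASK_LEN],
			 unsigned int mcs)
{
	return rx_mask[mcs / 8] & (1u << (mcs % 8));
}

int main(void)
{
	uint8_t rx_mask[IEEE80211_HT_MCS_MASK_LEN] = { 0xff, 0xff }; /* MCS 0-15 */

	rx_mask[32 / 8] |= 1; /* MCS 32, as in the rate-32 check above */

	assert(mcs_supported(rx_mask, 7));   /* single-stream rate */
	assert(mcs_supported(rx_mask, 15));  /* two-stream rate */
	assert(!mcs_supported(rx_mask, 23)); /* three-stream rate, not set */
	assert(mcs_supported(rx_mask, 32));
	return 0;
}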
// SPDX-License-Identifier: GPL-2.0-or-later /* * Video capture interface for Linux version 2 * * A generic framework to process V4L2 ioctl commands. 
* * Authors: Alan Cox, <alan@lxorguk.ukuu.org.uk> (version 1) * Mauro Carvalho Chehab <mchehab@kernel.org> (version 2) */ #include <linux/compat.h> #include <linux/mm.h> #include <linux/module.h> #include <linux/slab.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/version.h> #include <linux/v4l2-subdev.h> #include <linux/videodev2.h> #include <media/media-device.h> /* for media_set_bus_info() */ #include <media/v4l2-common.h> #include <media/v4l2-ioctl.h> #include <media/v4l2-ctrls.h> #include <media/v4l2-fh.h> #include <media/v4l2-event.h> #include <media/v4l2-device.h> #include <media/videobuf2-v4l2.h> #include <media/v4l2-mc.h> #include <media/v4l2-mem2mem.h> #include <trace/events/v4l2.h> #define is_valid_ioctl(vfd, cmd) test_bit(_IOC_NR(cmd), (vfd)->valid_ioctls) struct std_descr { v4l2_std_id std; const char *descr; }; static const struct std_descr standards[] = { { V4L2_STD_NTSC, "NTSC" }, { V4L2_STD_NTSC_M, "NTSC-M" }, { V4L2_STD_NTSC_M_JP, "NTSC-M-JP" }, { V4L2_STD_NTSC_M_KR, "NTSC-M-KR" }, { V4L2_STD_NTSC_443, "NTSC-443" }, { V4L2_STD_PAL, "PAL" }, { V4L2_STD_PAL_BG, "PAL-BG" }, { V4L2_STD_PAL_B, "PAL-B" }, { V4L2_STD_PAL_B1, "PAL-B1" }, { V4L2_STD_PAL_G, "PAL-G" }, { V4L2_STD_PAL_H, "PAL-H" }, { V4L2_STD_PAL_I, "PAL-I" }, { V4L2_STD_PAL_DK, "PAL-DK" }, { V4L2_STD_PAL_D, "PAL-D" }, { V4L2_STD_PAL_D1, "PAL-D1" }, { V4L2_STD_PAL_K, "PAL-K" }, { V4L2_STD_PAL_M, "PAL-M" }, { V4L2_STD_PAL_N, "PAL-N" }, { V4L2_STD_PAL_Nc, "PAL-Nc" }, { V4L2_STD_PAL_60, "PAL-60" }, { V4L2_STD_SECAM, "SECAM" }, { V4L2_STD_SECAM_B, "SECAM-B" }, { V4L2_STD_SECAM_G, "SECAM-G" }, { V4L2_STD_SECAM_H, "SECAM-H" }, { V4L2_STD_SECAM_DK, "SECAM-DK" }, { V4L2_STD_SECAM_D, "SECAM-D" }, { V4L2_STD_SECAM_K, "SECAM-K" }, { V4L2_STD_SECAM_K1, "SECAM-K1" }, { V4L2_STD_SECAM_L, "SECAM-L" }, { V4L2_STD_SECAM_LC, "SECAM-Lc" }, { 0, "Unknown" } }; /* video4linux standard ID conversion to standard name */ const char *v4l2_norm_to_name(v4l2_std_id id) { u32 myid = id; int i; /* HACK: the ppc32 architecture doesn't have a __ucmpdi2 function to handle 64-bit comparisons, so on that architecture, with some gcc variants, compilation fails. Currently, the max value is 30 bits wide. */ BUG_ON(myid != id); for (i = 0; standards[i].std; i++) if (myid == standards[i].std) break; return standards[i].descr; } EXPORT_SYMBOL(v4l2_norm_to_name); /* Returns the frame period for the given standard */ void v4l2_video_std_frame_period(int id, struct v4l2_fract *frameperiod) { if (id & V4L2_STD_525_60) { frameperiod->numerator = 1001; frameperiod->denominator = 30000; } else { frameperiod->numerator = 1; frameperiod->denominator = 25; } } EXPORT_SYMBOL(v4l2_video_std_frame_period); /* Fill in the fields of a v4l2_standard structure according to the 'id' and 'name' parameters. Returns negative on error. */ int v4l2_video_std_construct(struct v4l2_standard *vs, int id, const char *name) { vs->id = id; v4l2_video_std_frame_period(id, &vs->frameperiod); vs->framelines = (id & V4L2_STD_525_60) ? 525 : 625; strscpy(vs->name, name, sizeof(vs->name)); return 0; } EXPORT_SYMBOL(v4l2_video_std_construct); /* Fill in the fields of a v4l2_standard structure according to the * 'id' and 'vs->index' parameters. Returns negative on error. */ int v4l_video_std_enumstd(struct v4l2_standard *vs, v4l2_std_id id) { v4l2_std_id curr_id = 0; unsigned int index = vs->index, i, j = 0; const char *descr = ""; /* Return -ENODATA if the id for the current input or output is 0, meaning that it doesn't support this API. 
*/ if (id == 0) return -ENODATA; /* Return norm array in a canonical way */ for (i = 0; i <= index && id; i++) { /* last std value in the standards array is 0, so this while always ends there since (id & 0) == 0. */ while ((id & standards[j].std) != standards[j].std) j++; curr_id = standards[j].std; descr = standards[j].descr; j++; if (curr_id == 0) break; if (curr_id != V4L2_STD_PAL && curr_id != V4L2_STD_SECAM && curr_id != V4L2_STD_NTSC) id &= ~curr_id; } if (i <= index) return -EINVAL; v4l2_video_std_construct(vs, curr_id, descr); return 0; } /* ----------------------------------------------------------------- */ /* some arrays for pretty-printing debug messages of enum types */ const char *v4l2_field_names[] = { [V4L2_FIELD_ANY] = "any", [V4L2_FIELD_NONE] = "none", [V4L2_FIELD_TOP] = "top", [V4L2_FIELD_BOTTOM] = "bottom", [V4L2_FIELD_INTERLACED] = "interlaced", [V4L2_FIELD_SEQ_TB] = "seq-tb", [V4L2_FIELD_SEQ_BT] = "seq-bt", [V4L2_FIELD_ALTERNATE] = "alternate", [V4L2_FIELD_INTERLACED_TB] = "interlaced-tb", [V4L2_FIELD_INTERLACED_BT] = "interlaced-bt", }; EXPORT_SYMBOL(v4l2_field_names); const char *v4l2_type_names[] = { [0] = "0", [V4L2_BUF_TYPE_VIDEO_CAPTURE] = "vid-cap", [V4L2_BUF_TYPE_VIDEO_OVERLAY] = "vid-overlay", [V4L2_BUF_TYPE_VIDEO_OUTPUT] = "vid-out", [V4L2_BUF_TYPE_VBI_CAPTURE] = "vbi-cap", [V4L2_BUF_TYPE_VBI_OUTPUT] = "vbi-out", [V4L2_BUF_TYPE_SLICED_VBI_CAPTURE] = "sliced-vbi-cap", [V4L2_BUF_TYPE_SLICED_VBI_OUTPUT] = "sliced-vbi-out", [V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY] = "vid-out-overlay", [V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE] = "vid-cap-mplane", [V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE] = "vid-out-mplane", [V4L2_BUF_TYPE_SDR_CAPTURE] = "sdr-cap", [V4L2_BUF_TYPE_SDR_OUTPUT] = "sdr-out", [V4L2_BUF_TYPE_META_CAPTURE] = "meta-cap", [V4L2_BUF_TYPE_META_OUTPUT] = "meta-out", }; EXPORT_SYMBOL(v4l2_type_names); static const char *v4l2_memory_names[] = { [V4L2_MEMORY_MMAP] = "mmap", [V4L2_MEMORY_USERPTR] = "userptr", [V4L2_MEMORY_OVERLAY] = "overlay", [V4L2_MEMORY_DMABUF] = "dmabuf", }; #define prt_names(a, arr) (((unsigned)(a)) < ARRAY_SIZE(arr) ? 
arr[a] : "unknown") /* ------------------------------------------------------------------ */ /* debug help functions */ static void v4l_print_querycap(const void *arg, bool write_only) { const struct v4l2_capability *p = arg; pr_cont("driver=%.*s, card=%.*s, bus=%.*s, version=0x%08x, capabilities=0x%08x, device_caps=0x%08x\n", (int)sizeof(p->driver), p->driver, (int)sizeof(p->card), p->card, (int)sizeof(p->bus_info), p->bus_info, p->version, p->capabilities, p->device_caps); } static void v4l_print_enuminput(const void *arg, bool write_only) { const struct v4l2_input *p = arg; pr_cont("index=%u, name=%.*s, type=%u, audioset=0x%x, tuner=%u, std=0x%08Lx, status=0x%x, capabilities=0x%x\n", p->index, (int)sizeof(p->name), p->name, p->type, p->audioset, p->tuner, (unsigned long long)p->std, p->status, p->capabilities); } static void v4l_print_enumoutput(const void *arg, bool write_only) { const struct v4l2_output *p = arg; pr_cont("index=%u, name=%.*s, type=%u, audioset=0x%x, modulator=%u, std=0x%08Lx, capabilities=0x%x\n", p->index, (int)sizeof(p->name), p->name, p->type, p->audioset, p->modulator, (unsigned long long)p->std, p->capabilities); } static void v4l_print_audio(const void *arg, bool write_only) { const struct v4l2_audio *p = arg; if (write_only) pr_cont("index=%u, mode=0x%x\n", p->index, p->mode); else pr_cont("index=%u, name=%.*s, capability=0x%x, mode=0x%x\n", p->index, (int)sizeof(p->name), p->name, p->capability, p->mode); } static void v4l_print_audioout(const void *arg, bool write_only) { const struct v4l2_audioout *p = arg; if (write_only) pr_cont("index=%u\n", p->index); else pr_cont("index=%u, name=%.*s, capability=0x%x, mode=0x%x\n", p->index, (int)sizeof(p->name), p->name, p->capability, p->mode); } static void v4l_print_fmtdesc(const void *arg, bool write_only) { const struct v4l2_fmtdesc *p = arg; pr_cont("index=%u, type=%s, flags=0x%x, pixelformat=%p4cc, mbus_code=0x%04x, description='%.*s'\n", p->index, prt_names(p->type, v4l2_type_names), p->flags, &p->pixelformat, p->mbus_code, (int)sizeof(p->description), p->description); } static void v4l_print_format(const void *arg, bool write_only) { const struct v4l2_format *p = arg; const struct v4l2_pix_format *pix; const struct v4l2_pix_format_mplane *mp; const struct v4l2_vbi_format *vbi; const struct v4l2_sliced_vbi_format *sliced; const struct v4l2_window *win; const struct v4l2_meta_format *meta; u32 pixelformat; u32 planes; unsigned i; pr_cont("type=%s", prt_names(p->type, v4l2_type_names)); switch (p->type) { case V4L2_BUF_TYPE_VIDEO_CAPTURE: case V4L2_BUF_TYPE_VIDEO_OUTPUT: pix = &p->fmt.pix; pr_cont(", width=%u, height=%u, pixelformat=%p4cc, field=%s, bytesperline=%u, sizeimage=%u, colorspace=%d, flags=0x%x, ycbcr_enc=%u, quantization=%u, xfer_func=%u\n", pix->width, pix->height, &pix->pixelformat, prt_names(pix->field, v4l2_field_names), pix->bytesperline, pix->sizeimage, pix->colorspace, pix->flags, pix->ycbcr_enc, pix->quantization, pix->xfer_func); break; case V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE: case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: mp = &p->fmt.pix_mp; pixelformat = mp->pixelformat; pr_cont(", width=%u, height=%u, format=%p4cc, field=%s, colorspace=%d, num_planes=%u, flags=0x%x, ycbcr_enc=%u, quantization=%u, xfer_func=%u\n", mp->width, mp->height, &pixelformat, prt_names(mp->field, v4l2_field_names), mp->colorspace, mp->num_planes, mp->flags, mp->ycbcr_enc, mp->quantization, mp->xfer_func); planes = min_t(u32, mp->num_planes, VIDEO_MAX_PLANES); for (i = 0; i < planes; i++) printk(KERN_DEBUG "plane %u: 
bytesperline=%u sizeimage=%u\n", i, mp->plane_fmt[i].bytesperline, mp->plane_fmt[i].sizeimage); break; case V4L2_BUF_TYPE_VIDEO_OVERLAY: case V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY: win = &p->fmt.win; pr_cont(", (%d,%d)/%ux%u, field=%s, chromakey=0x%08x, global_alpha=0x%02x\n", win->w.left, win->w.top, win->w.width, win->w.height, prt_names(win->field, v4l2_field_names), win->chromakey, win->global_alpha); break; case V4L2_BUF_TYPE_VBI_CAPTURE: case V4L2_BUF_TYPE_VBI_OUTPUT: vbi = &p->fmt.vbi; pr_cont(", sampling_rate=%u, offset=%u, samples_per_line=%u, sample_format=%p4cc, start=%u,%u, count=%u,%u\n", vbi->sampling_rate, vbi->offset, vbi->samples_per_line, &vbi->sample_format, vbi->start[0], vbi->start[1], vbi->count[0], vbi->count[1]); break; case V4L2_BUF_TYPE_SLICED_VBI_CAPTURE: case V4L2_BUF_TYPE_SLICED_VBI_OUTPUT: sliced = &p->fmt.sliced; pr_cont(", service_set=0x%08x, io_size=%d\n", sliced->service_set, sliced->io_size); for (i = 0; i < 24; i++) printk(KERN_DEBUG "line[%02u]=0x%04x, 0x%04x\n", i, sliced->service_lines[0][i], sliced->service_lines[1][i]); break; case V4L2_BUF_TYPE_SDR_CAPTURE: case V4L2_BUF_TYPE_SDR_OUTPUT: pixelformat = p->fmt.sdr.pixelformat; pr_cont(", pixelformat=%p4cc\n", &pixelformat); break; case V4L2_BUF_TYPE_META_CAPTURE: case V4L2_BUF_TYPE_META_OUTPUT: meta = &p->fmt.meta; pixelformat = meta->dataformat; pr_cont(", dataformat=%p4cc, buffersize=%u, width=%u, height=%u, bytesperline=%u\n", &pixelformat, meta->buffersize, meta->width, meta->height, meta->bytesperline); break; } } static void v4l_print_framebuffer(const void *arg, bool write_only) { const struct v4l2_framebuffer *p = arg; pr_cont("capability=0x%x, flags=0x%x, base=0x%p, width=%u, height=%u, pixelformat=%p4cc, bytesperline=%u, sizeimage=%u, colorspace=%d\n", p->capability, p->flags, p->base, p->fmt.width, p->fmt.height, &p->fmt.pixelformat, p->fmt.bytesperline, p->fmt.sizeimage, p->fmt.colorspace); } static void v4l_print_buftype(const void *arg, bool write_only) { pr_cont("type=%s\n", prt_names(*(u32 *)arg, v4l2_type_names)); } static void v4l_print_modulator(const void *arg, bool write_only) { const struct v4l2_modulator *p = arg; if (write_only) pr_cont("index=%u, txsubchans=0x%x\n", p->index, p->txsubchans); else pr_cont("index=%u, name=%.*s, capability=0x%x, rangelow=%u, rangehigh=%u, txsubchans=0x%x\n", p->index, (int)sizeof(p->name), p->name, p->capability, p->rangelow, p->rangehigh, p->txsubchans); } static void v4l_print_tuner(const void *arg, bool write_only) { const struct v4l2_tuner *p = arg; if (write_only) pr_cont("index=%u, audmode=%u\n", p->index, p->audmode); else pr_cont("index=%u, name=%.*s, type=%u, capability=0x%x, rangelow=%u, rangehigh=%u, signal=%u, afc=%d, rxsubchans=0x%x, audmode=%u\n", p->index, (int)sizeof(p->name), p->name, p->type, p->capability, p->rangelow, p->rangehigh, p->signal, p->afc, p->rxsubchans, p->audmode); } static void v4l_print_frequency(const void *arg, bool write_only) { const struct v4l2_frequency *p = arg; pr_cont("tuner=%u, type=%u, frequency=%u\n", p->tuner, p->type, p->frequency); } static void v4l_print_standard(const void *arg, bool write_only) { const struct v4l2_standard *p = arg; pr_cont("index=%u, id=0x%Lx, name=%.*s, fps=%u/%u, framelines=%u\n", p->index, (unsigned long long)p->id, (int)sizeof(p->name), p->name, p->frameperiod.numerator, p->frameperiod.denominator, p->framelines); } static void v4l_print_std(const void *arg, bool write_only) { pr_cont("std=0x%08Lx\n", *(const long long unsigned *)arg); } static void 
v4l_print_hw_freq_seek(const void *arg, bool write_only) { const struct v4l2_hw_freq_seek *p = arg; pr_cont("tuner=%u, type=%u, seek_upward=%u, wrap_around=%u, spacing=%u, rangelow=%u, rangehigh=%u\n", p->tuner, p->type, p->seek_upward, p->wrap_around, p->spacing, p->rangelow, p->rangehigh); } static void v4l_print_requestbuffers(const void *arg, bool write_only) { const struct v4l2_requestbuffers *p = arg; pr_cont("count=%d, type=%s, memory=%s\n", p->count, prt_names(p->type, v4l2_type_names), prt_names(p->memory, v4l2_memory_names)); } static void v4l_print_buffer(const void *arg, bool write_only) { const struct v4l2_buffer *p = arg; const struct v4l2_timecode *tc = &p->timecode; const struct v4l2_plane *plane; int i; pr_cont("%02d:%02d:%02d.%06ld index=%d, type=%s, request_fd=%d, flags=0x%08x, field=%s, sequence=%d, memory=%s", (int)p->timestamp.tv_sec / 3600, ((int)p->timestamp.tv_sec / 60) % 60, ((int)p->timestamp.tv_sec % 60), (long)p->timestamp.tv_usec, p->index, prt_names(p->type, v4l2_type_names), p->request_fd, p->flags, prt_names(p->field, v4l2_field_names), p->sequence, prt_names(p->memory, v4l2_memory_names)); if (V4L2_TYPE_IS_MULTIPLANAR(p->type) && p->m.planes) { pr_cont("\n"); for (i = 0; i < p->length; ++i) { plane = &p->m.planes[i]; printk(KERN_DEBUG "plane %d: bytesused=%d, data_offset=0x%08x, offset/userptr=0x%lx, length=%d\n", i, plane->bytesused, plane->data_offset, plane->m.userptr, plane->length); } } else { pr_cont(", bytesused=%d, offset/userptr=0x%lx, length=%d\n", p->bytesused, p->m.userptr, p->length); } printk(KERN_DEBUG "timecode=%02d:%02d:%02d type=%d, flags=0x%08x, frames=%d, userbits=0x%08x\n", tc->hours, tc->minutes, tc->seconds, tc->type, tc->flags, tc->frames, *(__u32 *)tc->userbits); } static void v4l_print_exportbuffer(const void *arg, bool write_only) { const struct v4l2_exportbuffer *p = arg; pr_cont("fd=%d, type=%s, index=%u, plane=%u, flags=0x%08x\n", p->fd, prt_names(p->type, v4l2_type_names), p->index, p->plane, p->flags); } static void v4l_print_create_buffers(const void *arg, bool write_only) { const struct v4l2_create_buffers *p = arg; pr_cont("index=%d, count=%d, memory=%s, capabilities=0x%08x, max num buffers=%u, ", p->index, p->count, prt_names(p->memory, v4l2_memory_names), p->capabilities, p->max_num_buffers); v4l_print_format(&p->format, write_only); } static void v4l_print_remove_buffers(const void *arg, bool write_only) { const struct v4l2_remove_buffers *p = arg; pr_cont("type=%s, index=%u, count=%u\n", prt_names(p->type, v4l2_type_names), p->index, p->count); } static void v4l_print_streamparm(const void *arg, bool write_only) { const struct v4l2_streamparm *p = arg; pr_cont("type=%s", prt_names(p->type, v4l2_type_names)); if (p->type == V4L2_BUF_TYPE_VIDEO_CAPTURE || p->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) { const struct v4l2_captureparm *c = &p->parm.capture; pr_cont(", capability=0x%x, capturemode=0x%x, timeperframe=%d/%d, extendedmode=%d, readbuffers=%d\n", c->capability, c->capturemode, c->timeperframe.numerator, c->timeperframe.denominator, c->extendedmode, c->readbuffers); } else if (p->type == V4L2_BUF_TYPE_VIDEO_OUTPUT || p->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) { const struct v4l2_outputparm *c = &p->parm.output; pr_cont(", capability=0x%x, outputmode=0x%x, timeperframe=%d/%d, extendedmode=%d, writebuffers=%d\n", c->capability, c->outputmode, c->timeperframe.numerator, c->timeperframe.denominator, c->extendedmode, c->writebuffers); } else { pr_cont("\n"); } } static void v4l_print_queryctrl(const void *arg, 
bool write_only) { const struct v4l2_queryctrl *p = arg; pr_cont("id=0x%x, type=%d, name=%.*s, min/max=%d/%d, step=%d, default=%d, flags=0x%08x\n", p->id, p->type, (int)sizeof(p->name), p->name, p->minimum, p->maximum, p->step, p->default_value, p->flags); } static void v4l_print_query_ext_ctrl(const void *arg, bool write_only) { const struct v4l2_query_ext_ctrl *p = arg; pr_cont("id=0x%x, type=%d, name=%.*s, min/max=%lld/%lld, step=%lld, default=%lld, flags=0x%08x, elem_size=%u, elems=%u, nr_of_dims=%u, dims=%u,%u,%u,%u\n", p->id, p->type, (int)sizeof(p->name), p->name, p->minimum, p->maximum, p->step, p->default_value, p->flags, p->elem_size, p->elems, p->nr_of_dims, p->dims[0], p->dims[1], p->dims[2], p->dims[3]); } static void v4l_print_querymenu(const void *arg, bool write_only) { const struct v4l2_querymenu *p = arg; pr_cont("id=0x%x, index=%d\n", p->id, p->index); } static void v4l_print_control(const void *arg, bool write_only) { const struct v4l2_control *p = arg; const char *name = v4l2_ctrl_get_name(p->id); if (name) pr_cont("name=%s, ", name); pr_cont("id=0x%x, value=%d\n", p->id, p->value); } static void v4l_print_ext_controls(const void *arg, bool write_only) { const struct v4l2_ext_controls *p = arg; int i; pr_cont("which=0x%x, count=%d, error_idx=%d, request_fd=%d", p->which, p->count, p->error_idx, p->request_fd); for (i = 0; i < p->count; i++) { unsigned int id = p->controls[i].id; const char *name = v4l2_ctrl_get_name(id); if (name) pr_cont(", name=%s", name); if (!p->controls[i].size) pr_cont(", id/val=0x%x/0x%x", id, p->controls[i].value); else pr_cont(", id/size=0x%x/%u", id, p->controls[i].size); } pr_cont("\n"); } static void v4l_print_cropcap(const void *arg, bool write_only) { const struct v4l2_cropcap *p = arg; pr_cont("type=%s, bounds (%d,%d)/%ux%u, defrect (%d,%d)/%ux%u, pixelaspect %d/%d\n", prt_names(p->type, v4l2_type_names), p->bounds.left, p->bounds.top, p->bounds.width, p->bounds.height, p->defrect.left, p->defrect.top, p->defrect.width, p->defrect.height, p->pixelaspect.numerator, p->pixelaspect.denominator); } static void v4l_print_crop(const void *arg, bool write_only) { const struct v4l2_crop *p = arg; pr_cont("type=%s, crop=(%d,%d)/%ux%u\n", prt_names(p->type, v4l2_type_names), p->c.left, p->c.top, p->c.width, p->c.height); } static void v4l_print_selection(const void *arg, bool write_only) { const struct v4l2_selection *p = arg; pr_cont("type=%s, target=%d, flags=0x%x, rect=(%d,%d)/%ux%u\n", prt_names(p->type, v4l2_type_names), p->target, p->flags, p->r.left, p->r.top, p->r.width, p->r.height); } static void v4l_print_jpegcompression(const void *arg, bool write_only) { const struct v4l2_jpegcompression *p = arg; pr_cont("quality=%d, APPn=%d, APP_len=%d, COM_len=%d, jpeg_markers=0x%x\n", p->quality, p->APPn, p->APP_len, p->COM_len, p->jpeg_markers); } static void v4l_print_enc_idx(const void *arg, bool write_only) { const struct v4l2_enc_idx *p = arg; pr_cont("entries=%d, entries_cap=%d\n", p->entries, p->entries_cap); } static void v4l_print_encoder_cmd(const void *arg, bool write_only) { const struct v4l2_encoder_cmd *p = arg; pr_cont("cmd=%d, flags=0x%x\n", p->cmd, p->flags); } static void v4l_print_decoder_cmd(const void *arg, bool write_only) { const struct v4l2_decoder_cmd *p = arg; pr_cont("cmd=%d, flags=0x%x\n", p->cmd, p->flags); if (p->cmd == V4L2_DEC_CMD_START) pr_info("speed=%d, format=%u\n", p->start.speed, p->start.format); else if (p->cmd == V4L2_DEC_CMD_STOP) pr_info("pts=%llu\n", p->stop.pts); } static void 
v4l_print_dbg_chip_info(const void *arg, bool write_only) { const struct v4l2_dbg_chip_info *p = arg; pr_cont("type=%u, ", p->match.type); if (p->match.type == V4L2_CHIP_MATCH_I2C_DRIVER) pr_cont("name=%.*s, ", (int)sizeof(p->match.name), p->match.name); else pr_cont("addr=%u, ", p->match.addr); pr_cont("name=%.*s\n", (int)sizeof(p->name), p->name); } static void v4l_print_dbg_register(const void *arg, bool write_only) { const struct v4l2_dbg_register *p = arg; pr_cont("type=%u, ", p->match.type); if (p->match.type == V4L2_CHIP_MATCH_I2C_DRIVER) pr_cont("name=%.*s, ", (int)sizeof(p->match.name), p->match.name); else pr_cont("addr=%u, ", p->match.addr); pr_cont("reg=0x%llx, val=0x%llx\n", p->reg, p->val); } static void v4l_print_dv_timings(const void *arg, bool write_only) { const struct v4l2_dv_timings *p = arg; switch (p->type) { case V4L2_DV_BT_656_1120: pr_cont("type=bt-656/1120, interlaced=%u, pixelclock=%llu, width=%u, height=%u, polarities=0x%x, hfrontporch=%u, hsync=%u, hbackporch=%u, vfrontporch=%u, vsync=%u, vbackporch=%u, il_vfrontporch=%u, il_vsync=%u, il_vbackporch=%u, standards=0x%x, flags=0x%x\n", p->bt.interlaced, p->bt.pixelclock, p->bt.width, p->bt.height, p->bt.polarities, p->bt.hfrontporch, p->bt.hsync, p->bt.hbackporch, p->bt.vfrontporch, p->bt.vsync, p->bt.vbackporch, p->bt.il_vfrontporch, p->bt.il_vsync, p->bt.il_vbackporch, p->bt.standards, p->bt.flags); break; default: pr_cont("type=%d\n", p->type); break; } } static void v4l_print_enum_dv_timings(const void *arg, bool write_only) { const struct v4l2_enum_dv_timings *p = arg; pr_cont("index=%u, ", p->index); v4l_print_dv_timings(&p->timings, write_only); } static void v4l_print_dv_timings_cap(const void *arg, bool write_only) { const struct v4l2_dv_timings_cap *p = arg; switch (p->type) { case V4L2_DV_BT_656_1120: pr_cont("type=bt-656/1120, width=%u-%u, height=%u-%u, pixelclock=%llu-%llu, standards=0x%x, capabilities=0x%x\n", p->bt.min_width, p->bt.max_width, p->bt.min_height, p->bt.max_height, p->bt.min_pixelclock, p->bt.max_pixelclock, p->bt.standards, p->bt.capabilities); break; default: pr_cont("type=%u\n", p->type); break; } } static void v4l_print_frmsizeenum(const void *arg, bool write_only) { const struct v4l2_frmsizeenum *p = arg; pr_cont("index=%u, pixelformat=%p4cc, type=%u", p->index, &p->pixel_format, p->type); switch (p->type) { case V4L2_FRMSIZE_TYPE_DISCRETE: pr_cont(", wxh=%ux%u\n", p->discrete.width, p->discrete.height); break; case V4L2_FRMSIZE_TYPE_STEPWISE: pr_cont(", min=%ux%u, max=%ux%u, step=%ux%u\n", p->stepwise.min_width, p->stepwise.min_height, p->stepwise.max_width, p->stepwise.max_height, p->stepwise.step_width, p->stepwise.step_height); break; case V4L2_FRMSIZE_TYPE_CONTINUOUS: default: pr_cont("\n"); break; } } static void v4l_print_frmivalenum(const void *arg, bool write_only) { const struct v4l2_frmivalenum *p = arg; pr_cont("index=%u, pixelformat=%p4cc, wxh=%ux%u, type=%u", p->index, &p->pixel_format, p->width, p->height, p->type); switch (p->type) { case V4L2_FRMIVAL_TYPE_DISCRETE: pr_cont(", fps=%d/%d\n", p->discrete.numerator, p->discrete.denominator); break; case V4L2_FRMIVAL_TYPE_STEPWISE: pr_cont(", min=%d/%d, max=%d/%d, step=%d/%d\n", p->stepwise.min.numerator, p->stepwise.min.denominator, p->stepwise.max.numerator, p->stepwise.max.denominator, p->stepwise.step.numerator, p->stepwise.step.denominator); break; case V4L2_FRMIVAL_TYPE_CONTINUOUS: default: pr_cont("\n"); break; } } static void v4l_print_event(const void *arg, bool write_only) { const struct v4l2_event *p = arg; 
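/*
 * The common event header is printed first; the switch below appends the
 * payload for the event types that carry one (vsync field, control
 * change, frame sync counter).
 */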
const struct v4l2_event_ctrl *c; pr_cont("type=0x%x, pending=%u, sequence=%u, id=%u, timestamp=%llu.%9.9llu\n", p->type, p->pending, p->sequence, p->id, p->timestamp.tv_sec, p->timestamp.tv_nsec); switch (p->type) { case V4L2_EVENT_VSYNC: printk(KERN_DEBUG "field=%s\n", prt_names(p->u.vsync.field, v4l2_field_names)); break; case V4L2_EVENT_CTRL: c = &p->u.ctrl; printk(KERN_DEBUG "changes=0x%x, type=%u, ", c->changes, c->type); if (c->type == V4L2_CTRL_TYPE_INTEGER64) pr_cont("value64=%lld, ", c->value64); else pr_cont("value=%d, ", c->value); pr_cont("flags=0x%x, minimum=%d, maximum=%d, step=%d, default_value=%d\n", c->flags, c->minimum, c->maximum, c->step, c->default_value); break; case V4L2_EVENT_FRAME_SYNC: pr_cont("frame_sequence=%u\n", p->u.frame_sync.frame_sequence); break; } } static void v4l_print_event_subscription(const void *arg, bool write_only) { const struct v4l2_event_subscription *p = arg; pr_cont("type=0x%x, id=0x%x, flags=0x%x\n", p->type, p->id, p->flags); } static void v4l_print_sliced_vbi_cap(const void *arg, bool write_only) { const struct v4l2_sliced_vbi_cap *p = arg; int i; pr_cont("type=%s, service_set=0x%08x\n", prt_names(p->type, v4l2_type_names), p->service_set); for (i = 0; i < 24; i++) printk(KERN_DEBUG "line[%02u]=0x%04x, 0x%04x\n", i, p->service_lines[0][i], p->service_lines[1][i]); } static void v4l_print_freq_band(const void *arg, bool write_only) { const struct v4l2_frequency_band *p = arg; pr_cont("tuner=%u, type=%u, index=%u, capability=0x%x, rangelow=%u, rangehigh=%u, modulation=0x%x\n", p->tuner, p->type, p->index, p->capability, p->rangelow, p->rangehigh, p->modulation); } static void v4l_print_edid(const void *arg, bool write_only) { const struct v4l2_edid *p = arg; pr_cont("pad=%u, start_block=%u, blocks=%u\n", p->pad, p->start_block, p->blocks); } static void v4l_print_u32(const void *arg, bool write_only) { pr_cont("value=%u\n", *(const u32 *)arg); } static void v4l_print_newline(const void *arg, bool write_only) { pr_cont("\n"); } static void v4l_print_default(const void *arg, bool write_only) { pr_cont("driver-specific ioctl\n"); } static bool check_ext_ctrls(struct v4l2_ext_controls *c, unsigned long ioctl) { __u32 i; /* zero the reserved fields */ c->reserved[0] = 0; for (i = 0; i < c->count; i++) c->controls[i].reserved2[0] = 0; switch (c->which) { case V4L2_CID_PRIVATE_BASE: /* * V4L2_CID_PRIVATE_BASE cannot be used as control class * when using extended controls. * Only when passed in through VIDIOC_G_CTRL and VIDIOC_S_CTRL * is it allowed for backwards compatibility. */ if (ioctl == VIDIOC_G_CTRL || ioctl == VIDIOC_S_CTRL) return false; break; case V4L2_CTRL_WHICH_DEF_VAL: case V4L2_CTRL_WHICH_MIN_VAL: case V4L2_CTRL_WHICH_MAX_VAL: /* Default, minimum or maximum value cannot be changed */ if (ioctl == VIDIOC_S_EXT_CTRLS || ioctl == VIDIOC_TRY_EXT_CTRLS) { c->error_idx = c->count; return false; } return true; case V4L2_CTRL_WHICH_CUR_VAL: return true; case V4L2_CTRL_WHICH_REQUEST_VAL: c->error_idx = c->count; return false; } /* Check that all controls are from the same control class. */ for (i = 0; i < c->count; i++) { if (V4L2_CTRL_ID2WHICH(c->controls[i].id) != c->which) { c->error_idx = ioctl == VIDIOC_TRY_EXT_CTRLS ? 
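/* TRY pinpoints the offending control; G/S set error_idx to count to flag a validation failure. */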
i : c->count; return false; } } return true; } static int check_fmt(struct file *file, enum v4l2_buf_type type) { const u32 vid_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_VIDEO_CAPTURE_MPLANE | V4L2_CAP_VIDEO_OUTPUT | V4L2_CAP_VIDEO_OUTPUT_MPLANE | V4L2_CAP_VIDEO_M2M | V4L2_CAP_VIDEO_M2M_MPLANE; const u32 meta_caps = V4L2_CAP_META_CAPTURE | V4L2_CAP_META_OUTPUT; struct video_device *vfd = video_devdata(file); const struct v4l2_ioctl_ops *ops = vfd->ioctl_ops; bool is_vid = vfd->vfl_type == VFL_TYPE_VIDEO && (vfd->device_caps & vid_caps); bool is_vbi = vfd->vfl_type == VFL_TYPE_VBI; bool is_sdr = vfd->vfl_type == VFL_TYPE_SDR; bool is_tch = vfd->vfl_type == VFL_TYPE_TOUCH; bool is_meta = vfd->vfl_type == VFL_TYPE_VIDEO && (vfd->device_caps & meta_caps); bool is_rx = vfd->vfl_dir != VFL_DIR_TX; bool is_tx = vfd->vfl_dir != VFL_DIR_RX; if (ops == NULL) return -EINVAL; switch (type) { case V4L2_BUF_TYPE_VIDEO_CAPTURE: if ((is_vid || is_tch) && is_rx && (ops->vidioc_g_fmt_vid_cap || ops->vidioc_g_fmt_vid_cap_mplane)) return 0; break; case V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE: if ((is_vid || is_tch) && is_rx && ops->vidioc_g_fmt_vid_cap_mplane) return 0; break; case V4L2_BUF_TYPE_VIDEO_OVERLAY: if (is_vid && is_rx && ops->vidioc_g_fmt_vid_overlay) return 0; break; case V4L2_BUF_TYPE_VIDEO_OUTPUT: if (is_vid && is_tx && (ops->vidioc_g_fmt_vid_out || ops->vidioc_g_fmt_vid_out_mplane)) return 0; break; case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: if (is_vid && is_tx && ops->vidioc_g_fmt_vid_out_mplane) return 0; break; case V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY: if (is_vid && is_tx && ops->vidioc_g_fmt_vid_out_overlay) return 0; break; case V4L2_BUF_TYPE_VBI_CAPTURE: if (is_vbi && is_rx && ops->vidioc_g_fmt_vbi_cap) return 0; break; case V4L2_BUF_TYPE_VBI_OUTPUT: if (is_vbi && is_tx && ops->vidioc_g_fmt_vbi_out) return 0; break; case V4L2_BUF_TYPE_SLICED_VBI_CAPTURE: if (is_vbi && is_rx && ops->vidioc_g_fmt_sliced_vbi_cap) return 0; break; case V4L2_BUF_TYPE_SLICED_VBI_OUTPUT: if (is_vbi && is_tx && ops->vidioc_g_fmt_sliced_vbi_out) return 0; break; case V4L2_BUF_TYPE_SDR_CAPTURE: if (is_sdr && is_rx && ops->vidioc_g_fmt_sdr_cap) return 0; break; case V4L2_BUF_TYPE_SDR_OUTPUT: if (is_sdr && is_tx && ops->vidioc_g_fmt_sdr_out) return 0; break; case V4L2_BUF_TYPE_META_CAPTURE: if (is_meta && is_rx && ops->vidioc_g_fmt_meta_cap) return 0; break; case V4L2_BUF_TYPE_META_OUTPUT: if (is_meta && is_tx && ops->vidioc_g_fmt_meta_out) return 0; break; default: break; } return -EINVAL; } static void v4l_sanitize_colorspace(u32 pixelformat, u32 *colorspace, u32 *encoding, u32 *quantization, u32 *xfer_func) { bool is_hsv = pixelformat == V4L2_PIX_FMT_HSV24 || pixelformat == V4L2_PIX_FMT_HSV32; if (!v4l2_is_colorspace_valid(*colorspace)) { *colorspace = V4L2_COLORSPACE_DEFAULT; *encoding = V4L2_YCBCR_ENC_DEFAULT; *quantization = V4L2_QUANTIZATION_DEFAULT; *xfer_func = V4L2_XFER_FUNC_DEFAULT; } if ((!is_hsv && !v4l2_is_ycbcr_enc_valid(*encoding)) || (is_hsv && !v4l2_is_hsv_enc_valid(*encoding))) *encoding = V4L2_YCBCR_ENC_DEFAULT; if (!v4l2_is_quant_valid(*quantization)) *quantization = V4L2_QUANTIZATION_DEFAULT; if (!v4l2_is_xfer_func_valid(*xfer_func)) *xfer_func = V4L2_XFER_FUNC_DEFAULT; } static void v4l_sanitize_format(struct v4l2_format *fmt) { unsigned int offset; /* Make sure num_planes is not bogus */ if (fmt->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE || fmt->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) fmt->fmt.pix_mp.num_planes = min_t(u32, fmt->fmt.pix_mp.num_planes, VIDEO_MAX_PLANES); /* * The v4l2_pix_format 
structure has been extended with fields that were * not previously required to be set to zero by applications. The priv * field, when set to a magic value, indicates that the extended fields * are valid. Otherwise they will contain undefined values. To simplify * the API towards drivers zero the extended fields and set the priv * field to the magic value when the extended pixel format structure * isn't used by applications. */ if (fmt->type == V4L2_BUF_TYPE_VIDEO_CAPTURE || fmt->type == V4L2_BUF_TYPE_VIDEO_OUTPUT) { if (fmt->fmt.pix.priv != V4L2_PIX_FMT_PRIV_MAGIC) { fmt->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; offset = offsetof(struct v4l2_pix_format, priv) + sizeof(fmt->fmt.pix.priv); memset(((void *)&fmt->fmt.pix) + offset, 0, sizeof(fmt->fmt.pix) - offset); } } /* Replace invalid colorspace values with defaults. */ if (fmt->type == V4L2_BUF_TYPE_VIDEO_CAPTURE || fmt->type == V4L2_BUF_TYPE_VIDEO_OUTPUT) { v4l_sanitize_colorspace(fmt->fmt.pix.pixelformat, &fmt->fmt.pix.colorspace, &fmt->fmt.pix.ycbcr_enc, &fmt->fmt.pix.quantization, &fmt->fmt.pix.xfer_func); } else if (fmt->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE || fmt->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) { u32 ycbcr_enc = fmt->fmt.pix_mp.ycbcr_enc; u32 quantization = fmt->fmt.pix_mp.quantization; u32 xfer_func = fmt->fmt.pix_mp.xfer_func; v4l_sanitize_colorspace(fmt->fmt.pix_mp.pixelformat, &fmt->fmt.pix_mp.colorspace, &ycbcr_enc, &quantization, &xfer_func); fmt->fmt.pix_mp.ycbcr_enc = ycbcr_enc; fmt->fmt.pix_mp.quantization = quantization; fmt->fmt.pix_mp.xfer_func = xfer_func; } } static int v4l_querycap(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_capability *cap = (struct v4l2_capability *)arg; struct video_device *vfd = video_devdata(file); int ret; cap->version = LINUX_VERSION_CODE; cap->device_caps = vfd->device_caps; cap->capabilities = vfd->device_caps | V4L2_CAP_DEVICE_CAPS; media_set_bus_info(cap->bus_info, sizeof(cap->bus_info), vfd->dev_parent); ret = ops->vidioc_querycap(file, fh, cap); /* * Drivers must not change device_caps, so check for this and * warn if this happened. */ WARN_ON(cap->device_caps != vfd->device_caps); /* * Check that capabilities is a superset of * vfd->device_caps | V4L2_CAP_DEVICE_CAPS */ WARN_ON((cap->capabilities & (vfd->device_caps | V4L2_CAP_DEVICE_CAPS)) != (vfd->device_caps | V4L2_CAP_DEVICE_CAPS)); cap->capabilities |= V4L2_CAP_EXT_PIX_FORMAT; cap->device_caps |= V4L2_CAP_EXT_PIX_FORMAT; return ret; } static int v4l_g_input(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); if (vfd->device_caps & V4L2_CAP_IO_MC) { *(int *)arg = 0; return 0; } return ops->vidioc_g_input(file, fh, arg); } static int v4l_g_output(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); if (vfd->device_caps & V4L2_CAP_IO_MC) { *(int *)arg = 0; return 0; } return ops->vidioc_g_output(file, fh, arg); } static int v4l_s_input(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); int ret; ret = v4l_enable_media_source(vfd); if (ret) return ret; if (vfd->device_caps & V4L2_CAP_IO_MC) return *(int *)arg ? 
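/* MC-centric devices expose exactly one input, index 0. */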
-EINVAL : 0; return ops->vidioc_s_input(file, fh, *(unsigned int *)arg); } static int v4l_s_output(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); if (vfd->device_caps & V4L2_CAP_IO_MC) return *(int *)arg ? -EINVAL : 0; return ops->vidioc_s_output(file, fh, *(unsigned int *)arg); } static int v4l_g_priority(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd; u32 *p = arg; vfd = video_devdata(file); *p = v4l2_prio_max(vfd->prio); return 0; } static int v4l_s_priority(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd; struct v4l2_fh *vfh; u32 *p = arg; vfd = video_devdata(file); if (!test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags)) return -ENOTTY; vfh = file->private_data; return v4l2_prio_change(vfd->prio, &vfh->prio, *p); } static int v4l_enuminput(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_input *p = arg; /* * We set the flags for CAP_DV_TIMINGS & * CAP_STD here based on ioctl handler provided by the * driver. If the driver doesn't support these * for a specific input, it must override these flags. */ if (is_valid_ioctl(vfd, VIDIOC_S_STD)) p->capabilities |= V4L2_IN_CAP_STD; if (vfd->device_caps & V4L2_CAP_IO_MC) { if (p->index) return -EINVAL; strscpy(p->name, vfd->name, sizeof(p->name)); p->type = V4L2_INPUT_TYPE_CAMERA; return 0; } return ops->vidioc_enum_input(file, fh, p); } static int v4l_enumoutput(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_output *p = arg; /* * We set the flags for CAP_DV_TIMINGS & * CAP_STD here based on ioctl handler provided by the * driver. If the driver doesn't support these * for a specific output, it must override these flags. */ if (is_valid_ioctl(vfd, VIDIOC_S_STD)) p->capabilities |= V4L2_OUT_CAP_STD; if (vfd->device_caps & V4L2_CAP_IO_MC) { if (p->index) return -EINVAL; strscpy(p->name, vfd->name, sizeof(p->name)); p->type = V4L2_OUTPUT_TYPE_ANALOG; return 0; } return ops->vidioc_enum_output(file, fh, p); } static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt) { const unsigned sz = sizeof(fmt->description); const char *descr = NULL; u32 flags = 0; /* * We depart from the normal coding style here since the descriptions * should be aligned so it is easy to see which descriptions will be * longer than 31 characters (the max length for a description). * And frankly, this is easier to read anyway. * * Note that gcc will use O(log N) comparisons to find the right case. 
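 * The 31-character limit follows from the field itself: struct
 * v4l2_fmtdesc declares __u8 description[32], i.e. 31 characters plus
 * the terminating NUL that strscpy() writes.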
*/ switch (fmt->pixelformat) { /* Max description length mask: descr = "0123456789012345678901234567890" */ case V4L2_PIX_FMT_RGB332: descr = "8-bit RGB 3-3-2"; break; case V4L2_PIX_FMT_RGB444: descr = "16-bit A/XRGB 4-4-4-4"; break; case V4L2_PIX_FMT_ARGB444: descr = "16-bit ARGB 4-4-4-4"; break; case V4L2_PIX_FMT_XRGB444: descr = "16-bit XRGB 4-4-4-4"; break; case V4L2_PIX_FMT_RGBA444: descr = "16-bit RGBA 4-4-4-4"; break; case V4L2_PIX_FMT_RGBX444: descr = "16-bit RGBX 4-4-4-4"; break; case V4L2_PIX_FMT_ABGR444: descr = "16-bit ABGR 4-4-4-4"; break; case V4L2_PIX_FMT_XBGR444: descr = "16-bit XBGR 4-4-4-4"; break; case V4L2_PIX_FMT_BGRA444: descr = "16-bit BGRA 4-4-4-4"; break; case V4L2_PIX_FMT_BGRX444: descr = "16-bit BGRX 4-4-4-4"; break; case V4L2_PIX_FMT_RGB555: descr = "16-bit A/XRGB 1-5-5-5"; break; case V4L2_PIX_FMT_ARGB555: descr = "16-bit ARGB 1-5-5-5"; break; case V4L2_PIX_FMT_XRGB555: descr = "16-bit XRGB 1-5-5-5"; break; case V4L2_PIX_FMT_ABGR555: descr = "16-bit ABGR 1-5-5-5"; break; case V4L2_PIX_FMT_XBGR555: descr = "16-bit XBGR 1-5-5-5"; break; case V4L2_PIX_FMT_RGBA555: descr = "16-bit RGBA 5-5-5-1"; break; case V4L2_PIX_FMT_RGBX555: descr = "16-bit RGBX 5-5-5-1"; break; case V4L2_PIX_FMT_BGRA555: descr = "16-bit BGRA 5-5-5-1"; break; case V4L2_PIX_FMT_BGRX555: descr = "16-bit BGRX 5-5-5-1"; break; case V4L2_PIX_FMT_RGB565: descr = "16-bit RGB 5-6-5"; break; case V4L2_PIX_FMT_RGB555X: descr = "16-bit A/XRGB 1-5-5-5 BE"; break; case V4L2_PIX_FMT_ARGB555X: descr = "16-bit ARGB 1-5-5-5 BE"; break; case V4L2_PIX_FMT_XRGB555X: descr = "16-bit XRGB 1-5-5-5 BE"; break; case V4L2_PIX_FMT_RGB565X: descr = "16-bit RGB 5-6-5 BE"; break; case V4L2_PIX_FMT_BGR666: descr = "18-bit BGRX 6-6-6-14"; break; case V4L2_PIX_FMT_BGR24: descr = "24-bit BGR 8-8-8"; break; case V4L2_PIX_FMT_RGB24: descr = "24-bit RGB 8-8-8"; break; case V4L2_PIX_FMT_BGR32: descr = "32-bit BGRA/X 8-8-8-8"; break; case V4L2_PIX_FMT_ABGR32: descr = "32-bit BGRA 8-8-8-8"; break; case V4L2_PIX_FMT_XBGR32: descr = "32-bit BGRX 8-8-8-8"; break; case V4L2_PIX_FMT_RGB32: descr = "32-bit A/XRGB 8-8-8-8"; break; case V4L2_PIX_FMT_ARGB32: descr = "32-bit ARGB 8-8-8-8"; break; case V4L2_PIX_FMT_XRGB32: descr = "32-bit XRGB 8-8-8-8"; break; case V4L2_PIX_FMT_BGRA32: descr = "32-bit ABGR 8-8-8-8"; break; case V4L2_PIX_FMT_BGRX32: descr = "32-bit XBGR 8-8-8-8"; break; case V4L2_PIX_FMT_RGBA32: descr = "32-bit RGBA 8-8-8-8"; break; case V4L2_PIX_FMT_RGBX32: descr = "32-bit RGBX 8-8-8-8"; break; case V4L2_PIX_FMT_RGBX1010102: descr = "32-bit RGBX 10-10-10-2"; break; case V4L2_PIX_FMT_RGBA1010102: descr = "32-bit RGBA 10-10-10-2"; break; case V4L2_PIX_FMT_ARGB2101010: descr = "32-bit ARGB 2-10-10-10"; break; case V4L2_PIX_FMT_BGR48: descr = "48-bit BGR 16-16-16"; break; case V4L2_PIX_FMT_RGB48: descr = "48-bit RGB 16-16-16"; break; case V4L2_PIX_FMT_BGR48_12: descr = "12-bit Depth BGR"; break; case V4L2_PIX_FMT_ABGR64_12: descr = "12-bit Depth BGRA"; break; case V4L2_PIX_FMT_GREY: descr = "8-bit Greyscale"; break; case V4L2_PIX_FMT_Y4: descr = "4-bit Greyscale"; break; case V4L2_PIX_FMT_Y6: descr = "6-bit Greyscale"; break; case V4L2_PIX_FMT_Y10: descr = "10-bit Greyscale"; break; case V4L2_PIX_FMT_Y12: descr = "12-bit Greyscale"; break; case V4L2_PIX_FMT_Y012: descr = "12-bit Greyscale (bits 15-4)"; break; case V4L2_PIX_FMT_Y14: descr = "14-bit Greyscale"; break; case V4L2_PIX_FMT_Y16: descr = "16-bit Greyscale"; break; case V4L2_PIX_FMT_Y16_BE: descr = "16-bit Greyscale BE"; break; case V4L2_PIX_FMT_Y10BPACK: descr = "10-bit 
Greyscale (Packed)"; break; case V4L2_PIX_FMT_Y10P: descr = "10-bit Greyscale (MIPI Packed)"; break; case V4L2_PIX_FMT_IPU3_Y10: descr = "10-bit greyscale (IPU3 Packed)"; break; case V4L2_PIX_FMT_Y12P: descr = "12-bit Greyscale (MIPI Packed)"; break; case V4L2_PIX_FMT_Y14P: descr = "14-bit Greyscale (MIPI Packed)"; break; case V4L2_PIX_FMT_Y8I: descr = "Interleaved 8-bit Greyscale"; break; case V4L2_PIX_FMT_Y12I: descr = "Interleaved 12-bit Greyscale"; break; case V4L2_PIX_FMT_Y16I: descr = "Interleaved 16-bit Greyscale"; break; case V4L2_PIX_FMT_Z16: descr = "16-bit Depth"; break; case V4L2_PIX_FMT_INZI: descr = "Planar 10:16 Greyscale Depth"; break; case V4L2_PIX_FMT_CNF4: descr = "4-bit Depth Confidence (Packed)"; break; case V4L2_PIX_FMT_PAL8: descr = "8-bit Palette"; break; case V4L2_PIX_FMT_UV8: descr = "8-bit Chrominance UV 4-4"; break; case V4L2_PIX_FMT_YVU410: descr = "Planar YVU 4:1:0"; break; case V4L2_PIX_FMT_YVU420: descr = "Planar YVU 4:2:0"; break; case V4L2_PIX_FMT_YUYV: descr = "YUYV 4:2:2"; break; case V4L2_PIX_FMT_YYUV: descr = "YYUV 4:2:2"; break; case V4L2_PIX_FMT_YVYU: descr = "YVYU 4:2:2"; break; case V4L2_PIX_FMT_UYVY: descr = "UYVY 4:2:2"; break; case V4L2_PIX_FMT_VYUY: descr = "VYUY 4:2:2"; break; case V4L2_PIX_FMT_YUV422P: descr = "Planar YUV 4:2:2"; break; case V4L2_PIX_FMT_YUV411P: descr = "Planar YUV 4:1:1"; break; case V4L2_PIX_FMT_Y41P: descr = "YUV 4:1:1 (Packed)"; break; case V4L2_PIX_FMT_YUV444: descr = "16-bit A/XYUV 4-4-4-4"; break; case V4L2_PIX_FMT_YUV555: descr = "16-bit A/XYUV 1-5-5-5"; break; case V4L2_PIX_FMT_YUV565: descr = "16-bit YUV 5-6-5"; break; case V4L2_PIX_FMT_YUV24: descr = "24-bit YUV 4:4:4 8-8-8"; break; case V4L2_PIX_FMT_YUV32: descr = "32-bit A/XYUV 8-8-8-8"; break; case V4L2_PIX_FMT_AYUV32: descr = "32-bit AYUV 8-8-8-8"; break; case V4L2_PIX_FMT_XYUV32: descr = "32-bit XYUV 8-8-8-8"; break; case V4L2_PIX_FMT_VUYA32: descr = "32-bit VUYA 8-8-8-8"; break; case V4L2_PIX_FMT_VUYX32: descr = "32-bit VUYX 8-8-8-8"; break; case V4L2_PIX_FMT_YUVA32: descr = "32-bit YUVA 8-8-8-8"; break; case V4L2_PIX_FMT_YUVX32: descr = "32-bit YUVX 8-8-8-8"; break; case V4L2_PIX_FMT_YUV410: descr = "Planar YUV 4:1:0"; break; case V4L2_PIX_FMT_YUV420: descr = "Planar YUV 4:2:0"; break; case V4L2_PIX_FMT_HI240: descr = "8-bit Dithered RGB (BTTV)"; break; case V4L2_PIX_FMT_M420: descr = "YUV 4:2:0 (M420)"; break; case V4L2_PIX_FMT_YUV48_12: descr = "12-bit YUV 4:4:4 Packed"; break; case V4L2_PIX_FMT_NV12: descr = "Y/UV 4:2:0"; break; case V4L2_PIX_FMT_NV21: descr = "Y/VU 4:2:0"; break; case V4L2_PIX_FMT_NV16: descr = "Y/UV 4:2:2"; break; case V4L2_PIX_FMT_NV61: descr = "Y/VU 4:2:2"; break; case V4L2_PIX_FMT_NV24: descr = "Y/UV 4:4:4"; break; case V4L2_PIX_FMT_NV42: descr = "Y/VU 4:4:4"; break; case V4L2_PIX_FMT_P010: descr = "10-bit Y/UV 4:2:0"; break; case V4L2_PIX_FMT_P012: descr = "12-bit Y/UV 4:2:0"; break; case V4L2_PIX_FMT_NV12_4L4: descr = "Y/UV 4:2:0 (4x4 Linear)"; break; case V4L2_PIX_FMT_NV12_16L16: descr = "Y/UV 4:2:0 (16x16 Linear)"; break; case V4L2_PIX_FMT_NV12_32L32: descr = "Y/UV 4:2:0 (32x32 Linear)"; break; case V4L2_PIX_FMT_NV15_4L4: descr = "10-bit Y/UV 4:2:0 (4x4 Linear)"; break; case V4L2_PIX_FMT_P010_4L4: descr = "10-bit Y/UV 4:2:0 (4x4 Linear)"; break; case V4L2_PIX_FMT_NV12M: descr = "Y/UV 4:2:0 (N-C)"; break; case V4L2_PIX_FMT_NV21M: descr = "Y/VU 4:2:0 (N-C)"; break; case V4L2_PIX_FMT_NV16M: descr = "Y/UV 4:2:2 (N-C)"; break; case V4L2_PIX_FMT_NV61M: descr = "Y/VU 4:2:2 (N-C)"; break; case V4L2_PIX_FMT_NV12MT: descr = "Y/UV 4:2:0 
(64x32 MB, N-C)"; break; case V4L2_PIX_FMT_NV12MT_16X16: descr = "Y/UV 4:2:0 (16x16 MB, N-C)"; break; case V4L2_PIX_FMT_P012M: descr = "12-bit Y/UV 4:2:0 (N-C)"; break; case V4L2_PIX_FMT_YUV420M: descr = "Planar YUV 4:2:0 (N-C)"; break; case V4L2_PIX_FMT_YVU420M: descr = "Planar YVU 4:2:0 (N-C)"; break; case V4L2_PIX_FMT_YUV422M: descr = "Planar YUV 4:2:2 (N-C)"; break; case V4L2_PIX_FMT_YVU422M: descr = "Planar YVU 4:2:2 (N-C)"; break; case V4L2_PIX_FMT_YUV444M: descr = "Planar YUV 4:4:4 (N-C)"; break; case V4L2_PIX_FMT_YVU444M: descr = "Planar YVU 4:4:4 (N-C)"; break; case V4L2_PIX_FMT_SBGGR8: descr = "8-bit Bayer BGBG/GRGR"; break; case V4L2_PIX_FMT_SGBRG8: descr = "8-bit Bayer GBGB/RGRG"; break; case V4L2_PIX_FMT_SGRBG8: descr = "8-bit Bayer GRGR/BGBG"; break; case V4L2_PIX_FMT_SRGGB8: descr = "8-bit Bayer RGRG/GBGB"; break; case V4L2_PIX_FMT_SBGGR10: descr = "10-bit Bayer BGBG/GRGR"; break; case V4L2_PIX_FMT_SGBRG10: descr = "10-bit Bayer GBGB/RGRG"; break; case V4L2_PIX_FMT_SGRBG10: descr = "10-bit Bayer GRGR/BGBG"; break; case V4L2_PIX_FMT_SRGGB10: descr = "10-bit Bayer RGRG/GBGB"; break; case V4L2_PIX_FMT_SBGGR10P: descr = "10-bit Bayer BGBG/GRGR Packed"; break; case V4L2_PIX_FMT_SGBRG10P: descr = "10-bit Bayer GBGB/RGRG Packed"; break; case V4L2_PIX_FMT_SGRBG10P: descr = "10-bit Bayer GRGR/BGBG Packed"; break; case V4L2_PIX_FMT_SRGGB10P: descr = "10-bit Bayer RGRG/GBGB Packed"; break; case V4L2_PIX_FMT_IPU3_SBGGR10: descr = "10-bit bayer BGGR IPU3 Packed"; break; case V4L2_PIX_FMT_IPU3_SGBRG10: descr = "10-bit bayer GBRG IPU3 Packed"; break; case V4L2_PIX_FMT_IPU3_SGRBG10: descr = "10-bit bayer GRBG IPU3 Packed"; break; case V4L2_PIX_FMT_IPU3_SRGGB10: descr = "10-bit bayer RGGB IPU3 Packed"; break; case V4L2_PIX_FMT_SBGGR10ALAW8: descr = "8-bit Bayer BGBG/GRGR (A-law)"; break; case V4L2_PIX_FMT_SGBRG10ALAW8: descr = "8-bit Bayer GBGB/RGRG (A-law)"; break; case V4L2_PIX_FMT_SGRBG10ALAW8: descr = "8-bit Bayer GRGR/BGBG (A-law)"; break; case V4L2_PIX_FMT_SRGGB10ALAW8: descr = "8-bit Bayer RGRG/GBGB (A-law)"; break; case V4L2_PIX_FMT_SBGGR10DPCM8: descr = "8-bit Bayer BGBG/GRGR (DPCM)"; break; case V4L2_PIX_FMT_SGBRG10DPCM8: descr = "8-bit Bayer GBGB/RGRG (DPCM)"; break; case V4L2_PIX_FMT_SGRBG10DPCM8: descr = "8-bit Bayer GRGR/BGBG (DPCM)"; break; case V4L2_PIX_FMT_SRGGB10DPCM8: descr = "8-bit Bayer RGRG/GBGB (DPCM)"; break; case V4L2_PIX_FMT_SBGGR12: descr = "12-bit Bayer BGBG/GRGR"; break; case V4L2_PIX_FMT_SGBRG12: descr = "12-bit Bayer GBGB/RGRG"; break; case V4L2_PIX_FMT_SGRBG12: descr = "12-bit Bayer GRGR/BGBG"; break; case V4L2_PIX_FMT_SRGGB12: descr = "12-bit Bayer RGRG/GBGB"; break; case V4L2_PIX_FMT_SBGGR12P: descr = "12-bit Bayer BGBG/GRGR Packed"; break; case V4L2_PIX_FMT_SGBRG12P: descr = "12-bit Bayer GBGB/RGRG Packed"; break; case V4L2_PIX_FMT_SGRBG12P: descr = "12-bit Bayer GRGR/BGBG Packed"; break; case V4L2_PIX_FMT_SRGGB12P: descr = "12-bit Bayer RGRG/GBGB Packed"; break; case V4L2_PIX_FMT_SBGGR14: descr = "14-bit Bayer BGBG/GRGR"; break; case V4L2_PIX_FMT_SGBRG14: descr = "14-bit Bayer GBGB/RGRG"; break; case V4L2_PIX_FMT_SGRBG14: descr = "14-bit Bayer GRGR/BGBG"; break; case V4L2_PIX_FMT_SRGGB14: descr = "14-bit Bayer RGRG/GBGB"; break; case V4L2_PIX_FMT_SBGGR14P: descr = "14-bit Bayer BGBG/GRGR Packed"; break; case V4L2_PIX_FMT_SGBRG14P: descr = "14-bit Bayer GBGB/RGRG Packed"; break; case V4L2_PIX_FMT_SGRBG14P: descr = "14-bit Bayer GRGR/BGBG Packed"; break; case V4L2_PIX_FMT_SRGGB14P: descr = "14-bit Bayer RGRG/GBGB Packed"; break; case V4L2_PIX_FMT_SBGGR16: 
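/* Bayer descriptions spell out the first two lines of the mosaic, e.g. BGBG/GRGR. */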
descr = "16-bit Bayer BGBG/GRGR"; break; case V4L2_PIX_FMT_SGBRG16: descr = "16-bit Bayer GBGB/RGRG"; break; case V4L2_PIX_FMT_SGRBG16: descr = "16-bit Bayer GRGR/BGBG"; break; case V4L2_PIX_FMT_SRGGB16: descr = "16-bit Bayer RGRG/GBGB"; break; case V4L2_PIX_FMT_SN9C20X_I420: descr = "GSPCA SN9C20X I420"; break; case V4L2_PIX_FMT_SPCA501: descr = "GSPCA SPCA501"; break; case V4L2_PIX_FMT_SPCA505: descr = "GSPCA SPCA505"; break; case V4L2_PIX_FMT_SPCA508: descr = "GSPCA SPCA508"; break; case V4L2_PIX_FMT_STV0680: descr = "GSPCA STV0680"; break; case V4L2_PIX_FMT_TM6000: descr = "A/V + VBI Mux Packet"; break; case V4L2_PIX_FMT_CIT_YYVYUY: descr = "GSPCA CIT YYVYUY"; break; case V4L2_PIX_FMT_KONICA420: descr = "GSPCA KONICA420"; break; case V4L2_PIX_FMT_MM21: descr = "Mediatek 8-bit Block Format"; break; case V4L2_PIX_FMT_HSV24: descr = "24-bit HSV 8-8-8"; break; case V4L2_PIX_FMT_HSV32: descr = "32-bit XHSV 8-8-8-8"; break; case V4L2_SDR_FMT_CU8: descr = "Complex U8"; break; case V4L2_SDR_FMT_CU16LE: descr = "Complex U16LE"; break; case V4L2_SDR_FMT_CS8: descr = "Complex S8"; break; case V4L2_SDR_FMT_CS14LE: descr = "Complex S14LE"; break; case V4L2_SDR_FMT_RU12LE: descr = "Real U12LE"; break; case V4L2_SDR_FMT_PCU16BE: descr = "Planar Complex U16BE"; break; case V4L2_SDR_FMT_PCU18BE: descr = "Planar Complex U18BE"; break; case V4L2_SDR_FMT_PCU20BE: descr = "Planar Complex U20BE"; break; case V4L2_TCH_FMT_DELTA_TD16: descr = "16-bit Signed Deltas"; break; case V4L2_TCH_FMT_DELTA_TD08: descr = "8-bit Signed Deltas"; break; case V4L2_TCH_FMT_TU16: descr = "16-bit Unsigned Touch Data"; break; case V4L2_TCH_FMT_TU08: descr = "8-bit Unsigned Touch Data"; break; case V4L2_META_FMT_VSP1_HGO: descr = "R-Car VSP1 1-D Histogram"; break; case V4L2_META_FMT_VSP1_HGT: descr = "R-Car VSP1 2-D Histogram"; break; case V4L2_META_FMT_UVC: descr = "UVC Payload Header Metadata"; break; case V4L2_META_FMT_D4XX: descr = "Intel D4xx UVC Metadata"; break; case V4L2_META_FMT_VIVID: descr = "Vivid Metadata"; break; case V4L2_META_FMT_RK_ISP1_PARAMS: descr = "Rockchip ISP1 3A Parameters"; break; case V4L2_META_FMT_RK_ISP1_STAT_3A: descr = "Rockchip ISP1 3A Statistics"; break; case V4L2_META_FMT_RK_ISP1_EXT_PARAMS: descr = "Rockchip ISP1 Ext 3A Params"; break; case V4L2_PIX_FMT_NV12_8L128: descr = "NV12 (8x128 Linear)"; break; case V4L2_PIX_FMT_NV12M_8L128: descr = "NV12M (8x128 Linear)"; break; case V4L2_PIX_FMT_NV12_10BE_8L128: descr = "10-bit NV12 (8x128 Linear, BE)"; break; case V4L2_PIX_FMT_NV12M_10BE_8L128: descr = "10-bit NV12M (8x128 Linear, BE)"; break; case V4L2_PIX_FMT_Y210: descr = "10-bit YUYV Packed"; break; case V4L2_PIX_FMT_Y212: descr = "12-bit YUYV Packed"; break; case V4L2_PIX_FMT_Y216: descr = "16-bit YUYV Packed"; break; case V4L2_META_FMT_RPI_BE_CFG: descr = "RPi PiSP BE Config format"; break; case V4L2_META_FMT_RPI_FE_CFG: descr = "RPi PiSP FE Config format"; break; case V4L2_META_FMT_RPI_FE_STATS: descr = "RPi PiSP FE Statistics format"; break; case V4L2_META_FMT_GENERIC_8: descr = "8-bit Generic Metadata"; break; case V4L2_META_FMT_GENERIC_CSI2_10: descr = "8-bit Generic Meta, 10b CSI-2"; break; case V4L2_META_FMT_GENERIC_CSI2_12: descr = "8-bit Generic Meta, 12b CSI-2"; break; case V4L2_META_FMT_GENERIC_CSI2_14: descr = "8-bit Generic Meta, 14b CSI-2"; break; case V4L2_META_FMT_GENERIC_CSI2_16: descr = "8-bit Generic Meta, 16b CSI-2"; break; case V4L2_META_FMT_GENERIC_CSI2_20: descr = "8-bit Generic Meta, 20b CSI-2"; break; case V4L2_META_FMT_GENERIC_CSI2_24: descr = "8-bit Generic Meta, 24b 
CSI-2"; break; default: /* Compressed formats */ flags = V4L2_FMT_FLAG_COMPRESSED; switch (fmt->pixelformat) { /* Max description length mask: descr = "0123456789012345678901234567890" */ case V4L2_PIX_FMT_MJPEG: descr = "Motion-JPEG"; break; case V4L2_PIX_FMT_JPEG: descr = "JFIF JPEG"; break; case V4L2_PIX_FMT_DV: descr = "1394"; break; case V4L2_PIX_FMT_MPEG: descr = "MPEG-1/2/4"; break; case V4L2_PIX_FMT_H264: descr = "H.264"; break; case V4L2_PIX_FMT_H264_NO_SC: descr = "H.264 (No Start Codes)"; break; case V4L2_PIX_FMT_H264_MVC: descr = "H.264 MVC"; break; case V4L2_PIX_FMT_H264_SLICE: descr = "H.264 Parsed Slice Data"; break; case V4L2_PIX_FMT_H263: descr = "H.263"; break; case V4L2_PIX_FMT_MPEG1: descr = "MPEG-1 ES"; break; case V4L2_PIX_FMT_MPEG2: descr = "MPEG-2 ES"; break; case V4L2_PIX_FMT_MPEG2_SLICE: descr = "MPEG-2 Parsed Slice Data"; break; case V4L2_PIX_FMT_MPEG4: descr = "MPEG-4 Part 2 ES"; break; case V4L2_PIX_FMT_XVID: descr = "Xvid"; break; case V4L2_PIX_FMT_VC1_ANNEX_G: descr = "VC-1 (SMPTE 412M Annex G)"; break; case V4L2_PIX_FMT_VC1_ANNEX_L: descr = "VC-1 (SMPTE 412M Annex L)"; break; case V4L2_PIX_FMT_VP8: descr = "VP8"; break; case V4L2_PIX_FMT_VP8_FRAME: descr = "VP8 Frame"; break; case V4L2_PIX_FMT_VP9: descr = "VP9"; break; case V4L2_PIX_FMT_VP9_FRAME: descr = "VP9 Frame"; break; case V4L2_PIX_FMT_HEVC: descr = "HEVC"; break; /* aka H.265 */ case V4L2_PIX_FMT_HEVC_SLICE: descr = "HEVC Parsed Slice Data"; break; case V4L2_PIX_FMT_FWHT: descr = "FWHT"; break; /* used in vicodec */ case V4L2_PIX_FMT_FWHT_STATELESS: descr = "FWHT Stateless"; break; /* used in vicodec */ case V4L2_PIX_FMT_SPK: descr = "Sorenson Spark"; break; case V4L2_PIX_FMT_RV30: descr = "RealVideo 8"; break; case V4L2_PIX_FMT_RV40: descr = "RealVideo 9 & 10"; break; case V4L2_PIX_FMT_CPIA1: descr = "GSPCA CPiA YUV"; break; case V4L2_PIX_FMT_WNVA: descr = "WNVA"; break; case V4L2_PIX_FMT_SN9C10X: descr = "GSPCA SN9C10X"; break; case V4L2_PIX_FMT_PWC1: descr = "Raw Philips Webcam Type (Old)"; break; case V4L2_PIX_FMT_PWC2: descr = "Raw Philips Webcam Type (New)"; break; case V4L2_PIX_FMT_ET61X251: descr = "GSPCA ET61X251"; break; case V4L2_PIX_FMT_SPCA561: descr = "GSPCA SPCA561"; break; case V4L2_PIX_FMT_PAC207: descr = "GSPCA PAC207"; break; case V4L2_PIX_FMT_MR97310A: descr = "GSPCA MR97310A"; break; case V4L2_PIX_FMT_JL2005BCD: descr = "GSPCA JL2005BCD"; break; case V4L2_PIX_FMT_SN9C2028: descr = "GSPCA SN9C2028"; break; case V4L2_PIX_FMT_SQ905C: descr = "GSPCA SQ905C"; break; case V4L2_PIX_FMT_PJPG: descr = "GSPCA PJPG"; break; case V4L2_PIX_FMT_OV511: descr = "GSPCA OV511"; break; case V4L2_PIX_FMT_OV518: descr = "GSPCA OV518"; break; case V4L2_PIX_FMT_JPGL: descr = "JPEG Lite"; break; case V4L2_PIX_FMT_SE401: descr = "GSPCA SE401"; break; case V4L2_PIX_FMT_S5C_UYVY_JPG: descr = "S5C73MX interleaved UYVY/JPEG"; break; case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break; case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break; case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break; case V4L2_PIX_FMT_AJPG: descr = "Aspeed JPEG"; break; case V4L2_PIX_FMT_AV1_FRAME: descr = "AV1 Frame"; break; case V4L2_PIX_FMT_MT2110T: descr = "Mediatek 10bit Tile Mode"; break; case V4L2_PIX_FMT_MT2110R: descr = "Mediatek 10bit Raster Mode"; break; case V4L2_PIX_FMT_HEXTILE: descr = "Hextile Compressed Format"; break; case V4L2_PIX_FMT_PISP_COMP1_RGGB: descr = "PiSP 8b RGRG/GBGB mode1 compr"; break; case V4L2_PIX_FMT_PISP_COMP1_GRBG: descr = "PiSP 8b GRGR/BGBG 
mode1 compr"; break; case V4L2_PIX_FMT_PISP_COMP1_GBRG: descr = "PiSP 8b GBGB/RGRG mode1 compr"; break; case V4L2_PIX_FMT_PISP_COMP1_BGGR: descr = "PiSP 8b BGBG/GRGR mode1 compr"; break; case V4L2_PIX_FMT_PISP_COMP1_MONO: descr = "PiSP 8b monochrome mode1 compr"; break; case V4L2_PIX_FMT_PISP_COMP2_RGGB: descr = "PiSP 8b RGRG/GBGB mode2 compr"; break; case V4L2_PIX_FMT_PISP_COMP2_GRBG: descr = "PiSP 8b GRGR/BGBG mode2 compr"; break; case V4L2_PIX_FMT_PISP_COMP2_GBRG: descr = "PiSP 8b GBGB/RGRG mode2 compr"; break; case V4L2_PIX_FMT_PISP_COMP2_BGGR: descr = "PiSP 8b BGBG/GRGR mode2 compr"; break; case V4L2_PIX_FMT_PISP_COMP2_MONO: descr = "PiSP 8b monochrome mode2 compr"; break; default: if (fmt->description[0]) return; WARN(1, "Unknown pixelformat 0x%08x\n", fmt->pixelformat); flags = 0; snprintf(fmt->description, sz, "%p4cc", &fmt->pixelformat); break; } } if (fmt->type == V4L2_BUF_TYPE_META_CAPTURE) { switch (fmt->pixelformat) { case V4L2_META_FMT_GENERIC_8: case V4L2_META_FMT_GENERIC_CSI2_10: case V4L2_META_FMT_GENERIC_CSI2_12: case V4L2_META_FMT_GENERIC_CSI2_14: case V4L2_META_FMT_GENERIC_CSI2_16: case V4L2_META_FMT_GENERIC_CSI2_20: case V4L2_META_FMT_GENERIC_CSI2_24: fmt->flags |= V4L2_FMT_FLAG_META_LINE_BASED; break; default: fmt->flags &= ~V4L2_FMT_FLAG_META_LINE_BASED; } } if (descr) WARN_ON(strscpy(fmt->description, descr, sz) < 0); fmt->flags |= flags; } static int v4l_enum_fmt(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vdev = video_devdata(file); struct v4l2_fmtdesc *p = arg; int ret = check_fmt(file, p->type); u32 mbus_code; u32 cap_mask; if (ret) return ret; ret = -EINVAL; if (!(vdev->device_caps & V4L2_CAP_IO_MC)) p->mbus_code = 0; mbus_code = p->mbus_code; memset_after(p, 0, type); p->mbus_code = mbus_code; switch (p->type) { case V4L2_BUF_TYPE_VIDEO_CAPTURE: case V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE: cap_mask = V4L2_CAP_VIDEO_CAPTURE_MPLANE | V4L2_CAP_VIDEO_M2M_MPLANE; if (!!(vdev->device_caps & cap_mask) != (p->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)) break; if (unlikely(!ops->vidioc_enum_fmt_vid_cap)) break; ret = ops->vidioc_enum_fmt_vid_cap(file, fh, arg); break; case V4L2_BUF_TYPE_VIDEO_OVERLAY: if (unlikely(!ops->vidioc_enum_fmt_vid_overlay)) break; ret = ops->vidioc_enum_fmt_vid_overlay(file, fh, arg); break; case V4L2_BUF_TYPE_VIDEO_OUTPUT: case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: cap_mask = V4L2_CAP_VIDEO_OUTPUT_MPLANE | V4L2_CAP_VIDEO_M2M_MPLANE; if (!!(vdev->device_caps & cap_mask) != (p->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)) break; if (unlikely(!ops->vidioc_enum_fmt_vid_out)) break; ret = ops->vidioc_enum_fmt_vid_out(file, fh, arg); break; case V4L2_BUF_TYPE_SDR_CAPTURE: if (unlikely(!ops->vidioc_enum_fmt_sdr_cap)) break; ret = ops->vidioc_enum_fmt_sdr_cap(file, fh, arg); break; case V4L2_BUF_TYPE_SDR_OUTPUT: if (unlikely(!ops->vidioc_enum_fmt_sdr_out)) break; ret = ops->vidioc_enum_fmt_sdr_out(file, fh, arg); break; case V4L2_BUF_TYPE_META_CAPTURE: if (unlikely(!ops->vidioc_enum_fmt_meta_cap)) break; ret = ops->vidioc_enum_fmt_meta_cap(file, fh, arg); break; case V4L2_BUF_TYPE_META_OUTPUT: if (unlikely(!ops->vidioc_enum_fmt_meta_out)) break; ret = ops->vidioc_enum_fmt_meta_out(file, fh, arg); break; } if (ret == 0) v4l_fill_fmtdesc(p); return ret; } static void v4l_pix_format_touch(struct v4l2_pix_format *p) { /* * The v4l2_pix_format structure contains fields that make no sense for * touch. Set them to default values in this case. 
*/ p->field = V4L2_FIELD_NONE; p->colorspace = V4L2_COLORSPACE_RAW; p->flags = 0; p->ycbcr_enc = 0; p->quantization = 0; p->xfer_func = 0; } static int v4l_g_fmt(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_format *p = arg; struct video_device *vfd = video_devdata(file); int ret = check_fmt(file, p->type); if (ret) return ret; memset(&p->fmt, 0, sizeof(p->fmt)); switch (p->type) { case V4L2_BUF_TYPE_VIDEO_CAPTURE: if (unlikely(!ops->vidioc_g_fmt_vid_cap)) break; p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; ret = ops->vidioc_g_fmt_vid_cap(file, fh, arg); /* just in case the driver zeroed it again */ p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; if (vfd->vfl_type == VFL_TYPE_TOUCH) v4l_pix_format_touch(&p->fmt.pix); return ret; case V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE: return ops->vidioc_g_fmt_vid_cap_mplane(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OVERLAY: return ops->vidioc_g_fmt_vid_overlay(file, fh, arg); case V4L2_BUF_TYPE_VBI_CAPTURE: return ops->vidioc_g_fmt_vbi_cap(file, fh, arg); case V4L2_BUF_TYPE_SLICED_VBI_CAPTURE: return ops->vidioc_g_fmt_sliced_vbi_cap(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OUTPUT: if (unlikely(!ops->vidioc_g_fmt_vid_out)) break; p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; ret = ops->vidioc_g_fmt_vid_out(file, fh, arg); /* just in case the driver zeroed it again */ p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; return ret; case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: return ops->vidioc_g_fmt_vid_out_mplane(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY: return ops->vidioc_g_fmt_vid_out_overlay(file, fh, arg); case V4L2_BUF_TYPE_VBI_OUTPUT: return ops->vidioc_g_fmt_vbi_out(file, fh, arg); case V4L2_BUF_TYPE_SLICED_VBI_OUTPUT: return ops->vidioc_g_fmt_sliced_vbi_out(file, fh, arg); case V4L2_BUF_TYPE_SDR_CAPTURE: return ops->vidioc_g_fmt_sdr_cap(file, fh, arg); case V4L2_BUF_TYPE_SDR_OUTPUT: return ops->vidioc_g_fmt_sdr_out(file, fh, arg); case V4L2_BUF_TYPE_META_CAPTURE: return ops->vidioc_g_fmt_meta_cap(file, fh, arg); case V4L2_BUF_TYPE_META_OUTPUT: return ops->vidioc_g_fmt_meta_out(file, fh, arg); } return -EINVAL; } static int v4l_s_fmt(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_format *p = arg; struct video_device *vfd = video_devdata(file); int ret = check_fmt(file, p->type); unsigned int i; if (ret) return ret; ret = v4l_enable_media_source(vfd); if (ret) return ret; v4l_sanitize_format(p); switch (p->type) { case V4L2_BUF_TYPE_VIDEO_CAPTURE: if (unlikely(!ops->vidioc_s_fmt_vid_cap)) break; memset_after(p, 0, fmt.pix); ret = ops->vidioc_s_fmt_vid_cap(file, fh, arg); /* just in case the driver zeroed it again */ p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; if (vfd->vfl_type == VFL_TYPE_TOUCH) v4l_pix_format_touch(&p->fmt.pix); return ret; case V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE: if (unlikely(!ops->vidioc_s_fmt_vid_cap_mplane)) break; memset_after(p, 0, fmt.pix_mp.xfer_func); for (i = 0; i < p->fmt.pix_mp.num_planes; i++) memset_after(&p->fmt.pix_mp.plane_fmt[i], 0, bytesperline); return ops->vidioc_s_fmt_vid_cap_mplane(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OVERLAY: if (unlikely(!ops->vidioc_s_fmt_vid_overlay)) break; memset_after(p, 0, fmt.win); p->fmt.win.clips = NULL; p->fmt.win.clipcount = 0; p->fmt.win.bitmap = NULL; return ops->vidioc_s_fmt_vid_overlay(file, fh, arg); case V4L2_BUF_TYPE_VBI_CAPTURE: if (unlikely(!ops->vidioc_s_fmt_vbi_cap)) break; memset_after(p, 0, fmt.vbi.flags); return ops->vidioc_s_fmt_vbi_cap(file, fh, arg); case V4L2_BUF_TYPE_SLICED_VBI_CAPTURE: 
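/* As for the other buffer types: zero everything past the last field userspace must set before calling the driver. */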
if (unlikely(!ops->vidioc_s_fmt_sliced_vbi_cap)) break; memset_after(p, 0, fmt.sliced.io_size); return ops->vidioc_s_fmt_sliced_vbi_cap(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OUTPUT: if (unlikely(!ops->vidioc_s_fmt_vid_out)) break; memset_after(p, 0, fmt.pix); ret = ops->vidioc_s_fmt_vid_out(file, fh, arg); /* just in case the driver zeroed it again */ p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; return ret; case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: if (unlikely(!ops->vidioc_s_fmt_vid_out_mplane)) break; memset_after(p, 0, fmt.pix_mp.xfer_func); for (i = 0; i < p->fmt.pix_mp.num_planes; i++) memset_after(&p->fmt.pix_mp.plane_fmt[i], 0, bytesperline); return ops->vidioc_s_fmt_vid_out_mplane(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY: if (unlikely(!ops->vidioc_s_fmt_vid_out_overlay)) break; memset_after(p, 0, fmt.win); p->fmt.win.clips = NULL; p->fmt.win.clipcount = 0; p->fmt.win.bitmap = NULL; return ops->vidioc_s_fmt_vid_out_overlay(file, fh, arg); case V4L2_BUF_TYPE_VBI_OUTPUT: if (unlikely(!ops->vidioc_s_fmt_vbi_out)) break; memset_after(p, 0, fmt.vbi.flags); return ops->vidioc_s_fmt_vbi_out(file, fh, arg); case V4L2_BUF_TYPE_SLICED_VBI_OUTPUT: if (unlikely(!ops->vidioc_s_fmt_sliced_vbi_out)) break; memset_after(p, 0, fmt.sliced.io_size); return ops->vidioc_s_fmt_sliced_vbi_out(file, fh, arg); case V4L2_BUF_TYPE_SDR_CAPTURE: if (unlikely(!ops->vidioc_s_fmt_sdr_cap)) break; memset_after(p, 0, fmt.sdr.buffersize); return ops->vidioc_s_fmt_sdr_cap(file, fh, arg); case V4L2_BUF_TYPE_SDR_OUTPUT: if (unlikely(!ops->vidioc_s_fmt_sdr_out)) break; memset_after(p, 0, fmt.sdr.buffersize); return ops->vidioc_s_fmt_sdr_out(file, fh, arg); case V4L2_BUF_TYPE_META_CAPTURE: if (unlikely(!ops->vidioc_s_fmt_meta_cap)) break; memset_after(p, 0, fmt.meta); return ops->vidioc_s_fmt_meta_cap(file, fh, arg); case V4L2_BUF_TYPE_META_OUTPUT: if (unlikely(!ops->vidioc_s_fmt_meta_out)) break; memset_after(p, 0, fmt.meta); return ops->vidioc_s_fmt_meta_out(file, fh, arg); } return -EINVAL; } static int v4l_try_fmt(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_format *p = arg; struct video_device *vfd = video_devdata(file); int ret = check_fmt(file, p->type); unsigned int i; if (ret) return ret; v4l_sanitize_format(p); switch (p->type) { case V4L2_BUF_TYPE_VIDEO_CAPTURE: if (unlikely(!ops->vidioc_try_fmt_vid_cap)) break; memset_after(p, 0, fmt.pix); ret = ops->vidioc_try_fmt_vid_cap(file, fh, arg); /* just in case the driver zeroed it again */ p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; if (vfd->vfl_type == VFL_TYPE_TOUCH) v4l_pix_format_touch(&p->fmt.pix); return ret; case V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE: if (unlikely(!ops->vidioc_try_fmt_vid_cap_mplane)) break; memset_after(p, 0, fmt.pix_mp.xfer_func); for (i = 0; i < p->fmt.pix_mp.num_planes; i++) memset_after(&p->fmt.pix_mp.plane_fmt[i], 0, bytesperline); return ops->vidioc_try_fmt_vid_cap_mplane(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OVERLAY: if (unlikely(!ops->vidioc_try_fmt_vid_overlay)) break; memset_after(p, 0, fmt.win); p->fmt.win.clips = NULL; p->fmt.win.clipcount = 0; p->fmt.win.bitmap = NULL; return ops->vidioc_try_fmt_vid_overlay(file, fh, arg); case V4L2_BUF_TYPE_VBI_CAPTURE: if (unlikely(!ops->vidioc_try_fmt_vbi_cap)) break; memset_after(p, 0, fmt.vbi.flags); return ops->vidioc_try_fmt_vbi_cap(file, fh, arg); case V4L2_BUF_TYPE_SLICED_VBI_CAPTURE: if (unlikely(!ops->vidioc_try_fmt_sliced_vbi_cap)) break; memset_after(p, 0, fmt.sliced.io_size); return ops->vidioc_try_fmt_sliced_vbi_cap(file, fh, 
arg); case V4L2_BUF_TYPE_VIDEO_OUTPUT: if (unlikely(!ops->vidioc_try_fmt_vid_out)) break; memset_after(p, 0, fmt.pix); ret = ops->vidioc_try_fmt_vid_out(file, fh, arg); /* just in case the driver zeroed it again */ p->fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; return ret; case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: if (unlikely(!ops->vidioc_try_fmt_vid_out_mplane)) break; memset_after(p, 0, fmt.pix_mp.xfer_func); for (i = 0; i < p->fmt.pix_mp.num_planes; i++) memset_after(&p->fmt.pix_mp.plane_fmt[i], 0, bytesperline); return ops->vidioc_try_fmt_vid_out_mplane(file, fh, arg); case V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY: if (unlikely(!ops->vidioc_try_fmt_vid_out_overlay)) break; memset_after(p, 0, fmt.win); p->fmt.win.clips = NULL; p->fmt.win.clipcount = 0; p->fmt.win.bitmap = NULL; return ops->vidioc_try_fmt_vid_out_overlay(file, fh, arg); case V4L2_BUF_TYPE_VBI_OUTPUT: if (unlikely(!ops->vidioc_try_fmt_vbi_out)) break; memset_after(p, 0, fmt.vbi.flags); return ops->vidioc_try_fmt_vbi_out(file, fh, arg); case V4L2_BUF_TYPE_SLICED_VBI_OUTPUT: if (unlikely(!ops->vidioc_try_fmt_sliced_vbi_out)) break; memset_after(p, 0, fmt.sliced.io_size); return ops->vidioc_try_fmt_sliced_vbi_out(file, fh, arg); case V4L2_BUF_TYPE_SDR_CAPTURE: if (unlikely(!ops->vidioc_try_fmt_sdr_cap)) break; memset_after(p, 0, fmt.sdr.buffersize); return ops->vidioc_try_fmt_sdr_cap(file, fh, arg); case V4L2_BUF_TYPE_SDR_OUTPUT: if (unlikely(!ops->vidioc_try_fmt_sdr_out)) break; memset_after(p, 0, fmt.sdr.buffersize); return ops->vidioc_try_fmt_sdr_out(file, fh, arg); case V4L2_BUF_TYPE_META_CAPTURE: if (unlikely(!ops->vidioc_try_fmt_meta_cap)) break; memset_after(p, 0, fmt.meta); return ops->vidioc_try_fmt_meta_cap(file, fh, arg); case V4L2_BUF_TYPE_META_OUTPUT: if (unlikely(!ops->vidioc_try_fmt_meta_out)) break; memset_after(p, 0, fmt.meta); return ops->vidioc_try_fmt_meta_out(file, fh, arg); } return -EINVAL; } static int v4l_streamon(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { return ops->vidioc_streamon(file, fh, *(unsigned int *)arg); } static int v4l_streamoff(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { return ops->vidioc_streamoff(file, fh, *(unsigned int *)arg); } static int v4l_g_tuner(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_tuner *p = arg; int err; p->type = (vfd->vfl_type == VFL_TYPE_RADIO) ? V4L2_TUNER_RADIO : V4L2_TUNER_ANALOG_TV; err = ops->vidioc_g_tuner(file, fh, p); if (!err) p->capability |= V4L2_TUNER_CAP_FREQ_BANDS; return err; } static int v4l_s_tuner(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_tuner *p = arg; int ret; ret = v4l_enable_media_source(vfd); if (ret) return ret; p->type = (vfd->vfl_type == VFL_TYPE_RADIO) ? 
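/* The tuner type is dictated by the device node type; userspace cannot override it. */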
V4L2_TUNER_RADIO : V4L2_TUNER_ANALOG_TV; return ops->vidioc_s_tuner(file, fh, p); } static int v4l_g_modulator(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_modulator *p = arg; int err; if (vfd->vfl_type == VFL_TYPE_RADIO) p->type = V4L2_TUNER_RADIO; err = ops->vidioc_g_modulator(file, fh, p); if (!err) p->capability |= V4L2_TUNER_CAP_FREQ_BANDS; return err; } static int v4l_s_modulator(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_modulator *p = arg; if (vfd->vfl_type == VFL_TYPE_RADIO) p->type = V4L2_TUNER_RADIO; return ops->vidioc_s_modulator(file, fh, p); } static int v4l_g_frequency(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_frequency *p = arg; if (vfd->vfl_type == VFL_TYPE_SDR) p->type = V4L2_TUNER_SDR; else p->type = (vfd->vfl_type == VFL_TYPE_RADIO) ? V4L2_TUNER_RADIO : V4L2_TUNER_ANALOG_TV; return ops->vidioc_g_frequency(file, fh, p); } static int v4l_s_frequency(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); const struct v4l2_frequency *p = arg; enum v4l2_tuner_type type; int ret; ret = v4l_enable_media_source(vfd); if (ret) return ret; if (vfd->vfl_type == VFL_TYPE_SDR) { if (p->type != V4L2_TUNER_SDR && p->type != V4L2_TUNER_RF) return -EINVAL; } else { type = (vfd->vfl_type == VFL_TYPE_RADIO) ? V4L2_TUNER_RADIO : V4L2_TUNER_ANALOG_TV; if (type != p->type) return -EINVAL; } return ops->vidioc_s_frequency(file, fh, p); } static int v4l_enumstd(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_standard *p = arg; return v4l_video_std_enumstd(p, vfd->tvnorms); } static int v4l_s_std(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); v4l2_std_id id = *(v4l2_std_id *)arg, norm; int ret; ret = v4l_enable_media_source(vfd); if (ret) return ret; norm = id & vfd->tvnorms; if (vfd->tvnorms && !norm) /* Check if std is supported */ return -EINVAL; /* Calls the specific handler */ return ops->vidioc_s_std(file, fh, norm); } static int v4l_querystd(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); v4l2_std_id *p = arg; int ret; ret = v4l_enable_media_source(vfd); if (ret) return ret; /* * If no signal is detected, then the driver should return * V4L2_STD_UNKNOWN. Otherwise it should return tvnorms with * any standards that do not apply removed. * * This means that tuners, audio and video decoders can join * their efforts to improve the standards detection. */ *p = vfd->tvnorms; return ops->vidioc_querystd(file, fh, arg); } static int v4l_s_hw_freq_seek(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_hw_freq_seek *p = arg; enum v4l2_tuner_type type; int ret; ret = v4l_enable_media_source(vfd); if (ret) return ret; /* s_hw_freq_seek is not supported for SDR for now */ if (vfd->vfl_type == VFL_TYPE_SDR) return -EINVAL; type = (vfd->vfl_type == VFL_TYPE_RADIO) ? 
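/* A seek request must match the node's tuner type exactly. */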
V4L2_TUNER_RADIO : V4L2_TUNER_ANALOG_TV; if (p->type != type) return -EINVAL; return ops->vidioc_s_hw_freq_seek(file, fh, p); } static int v4l_s_fbuf(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_framebuffer *p = arg; p->base = NULL; return ops->vidioc_s_fbuf(file, fh, p); } static int v4l_overlay(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { return ops->vidioc_overlay(file, fh, *(unsigned int *)arg); } static int v4l_reqbufs(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_requestbuffers *p = arg; int ret = check_fmt(file, p->type); if (ret) return ret; memset_after(p, 0, flags); p->capabilities = 0; if (is_valid_ioctl(vfd, VIDIOC_REMOVE_BUFS)) p->capabilities = V4L2_BUF_CAP_SUPPORTS_REMOVE_BUFS; return ops->vidioc_reqbufs(file, fh, p); } static int v4l_querybuf(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_buffer *p = arg; int ret = check_fmt(file, p->type); return ret ? ret : ops->vidioc_querybuf(file, fh, p); } static int v4l_qbuf(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_buffer *p = arg; int ret = check_fmt(file, p->type); return ret ? ret : ops->vidioc_qbuf(file, fh, p); } static int v4l_dqbuf(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_buffer *p = arg; int ret = check_fmt(file, p->type); return ret ? ret : ops->vidioc_dqbuf(file, fh, p); } static int v4l_create_bufs(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_create_buffers *create = arg; int ret = check_fmt(file, create->format.type); if (ret) return ret; memset_after(create, 0, flags); v4l_sanitize_format(&create->format); create->capabilities = 0; if (is_valid_ioctl(vfd, VIDIOC_REMOVE_BUFS)) create->capabilities = V4L2_BUF_CAP_SUPPORTS_REMOVE_BUFS; ret = ops->vidioc_create_bufs(file, fh, create); if (create->format.type == V4L2_BUF_TYPE_VIDEO_CAPTURE || create->format.type == V4L2_BUF_TYPE_VIDEO_OUTPUT) create->format.fmt.pix.priv = V4L2_PIX_FMT_PRIV_MAGIC; return ret; } static int v4l_prepare_buf(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_buffer *b = arg; int ret = check_fmt(file, b->type); return ret ? 
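/* Propagate a check_fmt() failure, otherwise hand the buffer to the driver. */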
ret : ops->vidioc_prepare_buf(file, fh, b); } static int v4l_remove_bufs(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_remove_buffers *remove = arg; if (ops->vidioc_remove_bufs) return ops->vidioc_remove_bufs(file, fh, remove); return -ENOTTY; } static int v4l_g_parm(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_streamparm *p = arg; v4l2_std_id std; int ret = check_fmt(file, p->type); if (ret) return ret; if (ops->vidioc_g_parm) return ops->vidioc_g_parm(file, fh, p); if (p->type != V4L2_BUF_TYPE_VIDEO_CAPTURE && p->type != V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) return -EINVAL; if (vfd->device_caps & V4L2_CAP_READWRITE) p->parm.capture.readbuffers = 2; ret = ops->vidioc_g_std(file, fh, &std); if (ret == 0) v4l2_video_std_frame_period(std, &p->parm.capture.timeperframe); return ret; } static int v4l_s_parm(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_streamparm *p = arg; int ret = check_fmt(file, p->type); if (ret) return ret; /* Note: extendedmode is never used in drivers */ if (V4L2_TYPE_IS_OUTPUT(p->type)) { memset(p->parm.output.reserved, 0, sizeof(p->parm.output.reserved)); p->parm.output.extendedmode = 0; p->parm.output.outputmode &= V4L2_MODE_HIGHQUALITY; } else { memset(p->parm.capture.reserved, 0, sizeof(p->parm.capture.reserved)); p->parm.capture.extendedmode = 0; p->parm.capture.capturemode &= V4L2_MODE_HIGHQUALITY; } return ops->vidioc_s_parm(file, fh, p); } static int v4l_queryctrl(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_query_ext_ctrl qec = {}; struct v4l2_queryctrl *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ? fh : NULL; int ret; if (vfh && vfh->ctrl_handler) return v4l2_queryctrl(vfh->ctrl_handler, p); if (vfd->ctrl_handler) return v4l2_queryctrl(vfd->ctrl_handler, p); if (!ops->vidioc_query_ext_ctrl) return -ENOTTY; /* Simulate the legacy queryctrl using query_ext_ctrl. */ qec.id = p->id; ret = ops->vidioc_query_ext_ctrl(file, fh, &qec); if (ret) return ret; v4l2_query_ext_ctrl_to_v4l2_queryctrl(p, &qec); return ret; } static int v4l_query_ext_ctrl(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_query_ext_ctrl *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ? fh : NULL; if (vfh && vfh->ctrl_handler) return v4l2_query_ext_ctrl(vfh->ctrl_handler, p); if (vfd->ctrl_handler) return v4l2_query_ext_ctrl(vfd->ctrl_handler, p); if (ops->vidioc_query_ext_ctrl) return ops->vidioc_query_ext_ctrl(file, fh, p); return -ENOTTY; } static int v4l_querymenu(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_querymenu *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ? fh : NULL; if (vfh && vfh->ctrl_handler) return v4l2_querymenu(vfh->ctrl_handler, p); if (vfd->ctrl_handler) return v4l2_querymenu(vfd->ctrl_handler, p); if (ops->vidioc_querymenu) return ops->vidioc_querymenu(file, fh, p); return -ENOTTY; } static int v4l_g_ctrl(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_control *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ?
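/* Only drivers that use struct v4l2_fh can have per-filehandle control handlers. */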
fh : NULL; struct v4l2_ext_controls ctrls; struct v4l2_ext_control ctrl; if (vfh && vfh->ctrl_handler) return v4l2_g_ctrl(vfh->ctrl_handler, p); if (vfd->ctrl_handler) return v4l2_g_ctrl(vfd->ctrl_handler, p); if (ops->vidioc_g_ext_ctrls == NULL) return -ENOTTY; ctrls.which = V4L2_CTRL_ID2WHICH(p->id); ctrls.count = 1; ctrls.controls = &ctrl; ctrl.id = p->id; ctrl.value = p->value; if (check_ext_ctrls(&ctrls, VIDIOC_G_CTRL)) { int ret = ops->vidioc_g_ext_ctrls(file, fh, &ctrls); if (ret == 0) p->value = ctrl.value; return ret; } return -EINVAL; } static int v4l_s_ctrl(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_control *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ? fh : NULL; struct v4l2_ext_controls ctrls; struct v4l2_ext_control ctrl; int ret; if (vfh && vfh->ctrl_handler) return v4l2_s_ctrl(vfh, vfh->ctrl_handler, p); if (vfd->ctrl_handler) return v4l2_s_ctrl(NULL, vfd->ctrl_handler, p); if (ops->vidioc_s_ext_ctrls == NULL) return -ENOTTY; ctrls.which = V4L2_CTRL_ID2WHICH(p->id); ctrls.count = 1; ctrls.controls = &ctrl; ctrl.id = p->id; ctrl.value = p->value; if (!check_ext_ctrls(&ctrls, VIDIOC_S_CTRL)) return -EINVAL; ret = ops->vidioc_s_ext_ctrls(file, fh, &ctrls); p->value = ctrl.value; return ret; } static int v4l_g_ext_ctrls(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_ext_controls *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ? fh : NULL; p->error_idx = p->count; if (vfh && vfh->ctrl_handler) return v4l2_g_ext_ctrls(vfh->ctrl_handler, vfd, vfd->v4l2_dev->mdev, p); if (vfd->ctrl_handler) return v4l2_g_ext_ctrls(vfd->ctrl_handler, vfd, vfd->v4l2_dev->mdev, p); if (ops->vidioc_g_ext_ctrls == NULL) return -ENOTTY; return check_ext_ctrls(p, VIDIOC_G_EXT_CTRLS) ? ops->vidioc_g_ext_ctrls(file, fh, p) : -EINVAL; } static int v4l_s_ext_ctrls(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_ext_controls *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ? fh : NULL; p->error_idx = p->count; if (vfh && vfh->ctrl_handler) return v4l2_s_ext_ctrls(vfh, vfh->ctrl_handler, vfd, vfd->v4l2_dev->mdev, p); if (vfd->ctrl_handler) return v4l2_s_ext_ctrls(NULL, vfd->ctrl_handler, vfd, vfd->v4l2_dev->mdev, p); if (ops->vidioc_s_ext_ctrls == NULL) return -ENOTTY; return check_ext_ctrls(p, VIDIOC_S_EXT_CTRLS) ? ops->vidioc_s_ext_ctrls(file, fh, p) : -EINVAL; } static int v4l_try_ext_ctrls(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_ext_controls *p = arg; struct v4l2_fh *vfh = test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags) ? fh : NULL; p->error_idx = p->count; if (vfh && vfh->ctrl_handler) return v4l2_try_ext_ctrls(vfh->ctrl_handler, vfd, vfd->v4l2_dev->mdev, p); if (vfd->ctrl_handler) return v4l2_try_ext_ctrls(vfd->ctrl_handler, vfd, vfd->v4l2_dev->mdev, p); if (ops->vidioc_try_ext_ctrls == NULL) return -ENOTTY; return check_ext_ctrls(p, VIDIOC_TRY_EXT_CTRLS) ? ops->vidioc_try_ext_ctrls(file, fh, p) : -EINVAL; } /* * The selection API specified originally that the _MPLANE buffer types * shouldn't be used. The reasons for this are lost in the mists of time * (or just really crappy memories). Regardless, this is really annoying * for userspace. 
So to keep things simple we map _MPLANE buffer types * to their 'regular' counterparts before calling the driver. And we * restore it afterwards. This way applications can use either buffer * type and drivers don't need to check for both. */ static int v4l_g_selection(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_selection *p = arg; u32 old_type = p->type; int ret; if (p->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) p->type = V4L2_BUF_TYPE_VIDEO_CAPTURE; else if (p->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) p->type = V4L2_BUF_TYPE_VIDEO_OUTPUT; ret = ops->vidioc_g_selection(file, fh, p); p->type = old_type; return ret; } static int v4l_s_selection(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_selection *p = arg; u32 old_type = p->type; int ret; if (p->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) p->type = V4L2_BUF_TYPE_VIDEO_CAPTURE; else if (p->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) p->type = V4L2_BUF_TYPE_VIDEO_OUTPUT; ret = ops->vidioc_s_selection(file, fh, p); p->type = old_type; return ret; } static int v4l_g_crop(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_crop *p = arg; struct v4l2_selection s = { .type = p->type, }; int ret; /* simulate capture crop using selection api */ /* crop means compose for output devices */ if (V4L2_TYPE_IS_OUTPUT(p->type)) s.target = V4L2_SEL_TGT_COMPOSE; else s.target = V4L2_SEL_TGT_CROP; if (test_bit(V4L2_FL_QUIRK_INVERTED_CROP, &vfd->flags)) s.target = s.target == V4L2_SEL_TGT_COMPOSE ? V4L2_SEL_TGT_CROP : V4L2_SEL_TGT_COMPOSE; ret = v4l_g_selection(ops, file, fh, &s); /* copying results to old structure on success */ if (!ret) p->c = s.r; return ret; } static int v4l_s_crop(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_crop *p = arg; struct v4l2_selection s = { .type = p->type, .r = p->c, }; /* simulate capture crop using selection api */ /* crop means compose for output devices */ if (V4L2_TYPE_IS_OUTPUT(p->type)) s.target = V4L2_SEL_TGT_COMPOSE; else s.target = V4L2_SEL_TGT_CROP; if (test_bit(V4L2_FL_QUIRK_INVERTED_CROP, &vfd->flags)) s.target = s.target == V4L2_SEL_TGT_COMPOSE ? V4L2_SEL_TGT_CROP : V4L2_SEL_TGT_COMPOSE; return v4l_s_selection(ops, file, fh, &s); } static int v4l_cropcap(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_cropcap *p = arg; struct v4l2_selection s = { .type = p->type }; int ret = 0; /* setting trivial pixelaspect */ p->pixelaspect.numerator = 1; p->pixelaspect.denominator = 1; if (s.type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) s.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; else if (s.type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) s.type = V4L2_BUF_TYPE_VIDEO_OUTPUT; /* * The determine_valid_ioctls() call already should ensure * that this can never happen, but just in case... */ if (WARN_ON(!ops->vidioc_g_selection)) return -ENOTTY; if (ops->vidioc_g_pixelaspect) ret = ops->vidioc_g_pixelaspect(file, fh, s.type, &p->pixelaspect); /* * Ignore ENOTTY or ENOIOCTLCMD error returns, just use the * square pixel aspect ratio in that case. 
*/ if (ret && ret != -ENOTTY && ret != -ENOIOCTLCMD) return ret; /* Use g_selection() to fill in the bounds and defrect rectangles */ /* obtaining bounds */ if (V4L2_TYPE_IS_OUTPUT(p->type)) s.target = V4L2_SEL_TGT_COMPOSE_BOUNDS; else s.target = V4L2_SEL_TGT_CROP_BOUNDS; if (test_bit(V4L2_FL_QUIRK_INVERTED_CROP, &vfd->flags)) s.target = s.target == V4L2_SEL_TGT_COMPOSE_BOUNDS ? V4L2_SEL_TGT_CROP_BOUNDS : V4L2_SEL_TGT_COMPOSE_BOUNDS; ret = v4l_g_selection(ops, file, fh, &s); if (ret) return ret; p->bounds = s.r; /* obtaining defrect */ if (s.target == V4L2_SEL_TGT_COMPOSE_BOUNDS) s.target = V4L2_SEL_TGT_COMPOSE_DEFAULT; else s.target = V4L2_SEL_TGT_CROP_DEFAULT; ret = v4l_g_selection(ops, file, fh, &s); if (ret) return ret; p->defrect = s.r; return 0; } static int v4l_log_status(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); int ret; if (vfd->v4l2_dev) pr_info("%s: ================= START STATUS =================\n", vfd->v4l2_dev->name); ret = ops->vidioc_log_status(file, fh); if (vfd->v4l2_dev) pr_info("%s: ================== END STATUS ==================\n", vfd->v4l2_dev->name); return ret; } static int v4l_dbg_g_register(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { #ifdef CONFIG_VIDEO_ADV_DEBUG struct v4l2_dbg_register *p = arg; struct video_device *vfd = video_devdata(file); struct v4l2_subdev *sd; int idx = 0; if (!capable(CAP_SYS_ADMIN)) return -EPERM; if (p->match.type == V4L2_CHIP_MATCH_SUBDEV) { if (vfd->v4l2_dev == NULL) return -EINVAL; v4l2_device_for_each_subdev(sd, vfd->v4l2_dev) if (p->match.addr == idx++) return v4l2_subdev_call(sd, core, g_register, p); return -EINVAL; } if (ops->vidioc_g_register && p->match.type == V4L2_CHIP_MATCH_BRIDGE && (ops->vidioc_g_chip_info || p->match.addr == 0)) return ops->vidioc_g_register(file, fh, p); return -EINVAL; #else return -ENOTTY; #endif } static int v4l_dbg_s_register(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { #ifdef CONFIG_VIDEO_ADV_DEBUG const struct v4l2_dbg_register *p = arg; struct video_device *vfd = video_devdata(file); struct v4l2_subdev *sd; int idx = 0; if (!capable(CAP_SYS_ADMIN)) return -EPERM; if (p->match.type == V4L2_CHIP_MATCH_SUBDEV) { if (vfd->v4l2_dev == NULL) return -EINVAL; v4l2_device_for_each_subdev(sd, vfd->v4l2_dev) if (p->match.addr == idx++) return v4l2_subdev_call(sd, core, s_register, p); return -EINVAL; } if (ops->vidioc_s_register && p->match.type == V4L2_CHIP_MATCH_BRIDGE && (ops->vidioc_g_chip_info || p->match.addr == 0)) return ops->vidioc_s_register(file, fh, p); return -EINVAL; #else return -ENOTTY; #endif } static int v4l_dbg_g_chip_info(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { #ifdef CONFIG_VIDEO_ADV_DEBUG struct video_device *vfd = video_devdata(file); struct v4l2_dbg_chip_info *p = arg; struct v4l2_subdev *sd; int idx = 0; switch (p->match.type) { case V4L2_CHIP_MATCH_BRIDGE: if (ops->vidioc_s_register) p->flags |= V4L2_CHIP_FL_WRITABLE; if (ops->vidioc_g_register) p->flags |= V4L2_CHIP_FL_READABLE; strscpy(p->name, vfd->v4l2_dev->name, sizeof(p->name)); if (ops->vidioc_g_chip_info) return ops->vidioc_g_chip_info(file, fh, arg); if (p->match.addr) return -EINVAL; return 0; case V4L2_CHIP_MATCH_SUBDEV: if (vfd->v4l2_dev == NULL) break; v4l2_device_for_each_subdev(sd, vfd->v4l2_dev) { if (p->match.addr != idx++) continue; if (sd->ops->core && sd->ops->core->s_register) p->flags |= V4L2_CHIP_FL_WRITABLE; if (sd->ops->core 
&& sd->ops->core->g_register) p->flags |= V4L2_CHIP_FL_READABLE; strscpy(p->name, sd->name, sizeof(p->name)); return 0; } break; } return -EINVAL; #else return -ENOTTY; #endif } static int v4l_dqevent(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { return v4l2_event_dequeue(fh, arg, file->f_flags & O_NONBLOCK); } static int v4l_subscribe_event(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { return ops->vidioc_subscribe_event(fh, arg); } static int v4l_unsubscribe_event(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { return ops->vidioc_unsubscribe_event(fh, arg); } static int v4l_g_sliced_vbi_cap(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct v4l2_sliced_vbi_cap *p = arg; int ret = check_fmt(file, p->type); if (ret) return ret; /* Clear up to type, everything after type is zeroed already */ memset(p, 0, offsetof(struct v4l2_sliced_vbi_cap, type)); return ops->vidioc_g_sliced_vbi_cap(file, fh, p); } static int v4l_enum_freq_bands(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *arg) { struct video_device *vfd = video_devdata(file); struct v4l2_frequency_band *p = arg; enum v4l2_tuner_type type; int err; if (vfd->vfl_type == VFL_TYPE_SDR) { if (p->type != V4L2_TUNER_SDR && p->type != V4L2_TUNER_RF) return -EINVAL; type = p->type; } else { type = (vfd->vfl_type == VFL_TYPE_RADIO) ? V4L2_TUNER_RADIO : V4L2_TUNER_ANALOG_TV; if (type != p->type) return -EINVAL; } if (ops->vidioc_enum_freq_bands) { err = ops->vidioc_enum_freq_bands(file, fh, p); if (err != -ENOTTY) return err; } if (is_valid_ioctl(vfd, VIDIOC_G_TUNER)) { struct v4l2_tuner t = { .index = p->tuner, .type = type, }; if (p->index) return -EINVAL; err = ops->vidioc_g_tuner(file, fh, &t); if (err) return err; p->capability = t.capability | V4L2_TUNER_CAP_FREQ_BANDS; p->rangelow = t.rangelow; p->rangehigh = t.rangehigh; p->modulation = (type == V4L2_TUNER_RADIO) ? V4L2_BAND_MODULATION_FM : V4L2_BAND_MODULATION_VSB; return 0; } if (is_valid_ioctl(vfd, VIDIOC_G_MODULATOR)) { struct v4l2_modulator m = { .index = p->tuner, }; if (type != V4L2_TUNER_RADIO) return -EINVAL; if (p->index) return -EINVAL; err = ops->vidioc_g_modulator(file, fh, &m); if (err) return err; p->capability = m.capability | V4L2_TUNER_CAP_FREQ_BANDS; p->rangelow = m.rangelow; p->rangehigh = m.rangehigh; p->modulation = (type == V4L2_TUNER_RADIO) ? V4L2_BAND_MODULATION_FM : V4L2_BAND_MODULATION_VSB; return 0; } return -ENOTTY; } struct v4l2_ioctl_info { unsigned int ioctl; u32 flags; const char * const name; int (*func)(const struct v4l2_ioctl_ops *ops, struct file *file, void *fh, void *p); void (*debug)(const void *arg, bool write_only); }; /* This control needs a priority check */ #define INFO_FL_PRIO (1 << 0) /* This control can be valid if the filehandle passes a control handler. 
*/ #define INFO_FL_CTRL (1 << 1) /* Queuing ioctl */ #define INFO_FL_QUEUE (1 << 2) /* Always copy back result, even on error */ #define INFO_FL_ALWAYS_COPY (1 << 3) /* Zero struct from after the field to the end */ #define INFO_FL_CLEAR(v4l2_struct, field) \ ((offsetof(struct v4l2_struct, field) + \ sizeof_field(struct v4l2_struct, field)) << 16) #define INFO_FL_CLEAR_MASK (_IOC_SIZEMASK << 16) #define DEFINE_V4L_STUB_FUNC(_vidioc) \ static int v4l_stub_ ## _vidioc( \ const struct v4l2_ioctl_ops *ops, \ struct file *file, void *fh, void *p) \ { \ return ops->vidioc_ ## _vidioc(file, fh, p); \ } #define IOCTL_INFO(_ioctl, _func, _debug, _flags) \ [_IOC_NR(_ioctl)] = { \ .ioctl = _ioctl, \ .flags = _flags, \ .name = #_ioctl, \ .func = _func, \ .debug = _debug, \ } DEFINE_V4L_STUB_FUNC(g_fbuf) DEFINE_V4L_STUB_FUNC(expbuf) DEFINE_V4L_STUB_FUNC(g_std) DEFINE_V4L_STUB_FUNC(g_audio) DEFINE_V4L_STUB_FUNC(s_audio) DEFINE_V4L_STUB_FUNC(g_edid) DEFINE_V4L_STUB_FUNC(s_edid) DEFINE_V4L_STUB_FUNC(g_audout) DEFINE_V4L_STUB_FUNC(s_audout) DEFINE_V4L_STUB_FUNC(g_jpegcomp) DEFINE_V4L_STUB_FUNC(s_jpegcomp) DEFINE_V4L_STUB_FUNC(enumaudio) DEFINE_V4L_STUB_FUNC(enumaudout) DEFINE_V4L_STUB_FUNC(enum_framesizes) DEFINE_V4L_STUB_FUNC(enum_frameintervals) DEFINE_V4L_STUB_FUNC(g_enc_index) DEFINE_V4L_STUB_FUNC(encoder_cmd) DEFINE_V4L_STUB_FUNC(try_encoder_cmd) DEFINE_V4L_STUB_FUNC(decoder_cmd) DEFINE_V4L_STUB_FUNC(try_decoder_cmd) DEFINE_V4L_STUB_FUNC(s_dv_timings) DEFINE_V4L_STUB_FUNC(g_dv_timings) DEFINE_V4L_STUB_FUNC(enum_dv_timings) DEFINE_V4L_STUB_FUNC(query_dv_timings) DEFINE_V4L_STUB_FUNC(dv_timings_cap) static const struct v4l2_ioctl_info v4l2_ioctls[] = { IOCTL_INFO(VIDIOC_QUERYCAP, v4l_querycap, v4l_print_querycap, 0), IOCTL_INFO(VIDIOC_ENUM_FMT, v4l_enum_fmt, v4l_print_fmtdesc, 0), IOCTL_INFO(VIDIOC_G_FMT, v4l_g_fmt, v4l_print_format, 0), IOCTL_INFO(VIDIOC_S_FMT, v4l_s_fmt, v4l_print_format, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_REQBUFS, v4l_reqbufs, v4l_print_requestbuffers, INFO_FL_PRIO | INFO_FL_QUEUE), IOCTL_INFO(VIDIOC_QUERYBUF, v4l_querybuf, v4l_print_buffer, INFO_FL_QUEUE | INFO_FL_CLEAR(v4l2_buffer, length)), IOCTL_INFO(VIDIOC_G_FBUF, v4l_stub_g_fbuf, v4l_print_framebuffer, 0), IOCTL_INFO(VIDIOC_S_FBUF, v4l_s_fbuf, v4l_print_framebuffer, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_OVERLAY, v4l_overlay, v4l_print_u32, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_QBUF, v4l_qbuf, v4l_print_buffer, INFO_FL_QUEUE), IOCTL_INFO(VIDIOC_EXPBUF, v4l_stub_expbuf, v4l_print_exportbuffer, INFO_FL_QUEUE | INFO_FL_CLEAR(v4l2_exportbuffer, flags)), IOCTL_INFO(VIDIOC_DQBUF, v4l_dqbuf, v4l_print_buffer, INFO_FL_QUEUE), IOCTL_INFO(VIDIOC_STREAMON, v4l_streamon, v4l_print_buftype, INFO_FL_PRIO | INFO_FL_QUEUE), IOCTL_INFO(VIDIOC_STREAMOFF, v4l_streamoff, v4l_print_buftype, INFO_FL_PRIO | INFO_FL_QUEUE), IOCTL_INFO(VIDIOC_G_PARM, v4l_g_parm, v4l_print_streamparm, INFO_FL_CLEAR(v4l2_streamparm, type)), IOCTL_INFO(VIDIOC_S_PARM, v4l_s_parm, v4l_print_streamparm, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_G_STD, v4l_stub_g_std, v4l_print_std, 0), IOCTL_INFO(VIDIOC_S_STD, v4l_s_std, v4l_print_std, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_ENUMSTD, v4l_enumstd, v4l_print_standard, INFO_FL_CLEAR(v4l2_standard, index)), IOCTL_INFO(VIDIOC_ENUMINPUT, v4l_enuminput, v4l_print_enuminput, INFO_FL_CLEAR(v4l2_input, index)), IOCTL_INFO(VIDIOC_G_CTRL, v4l_g_ctrl, v4l_print_control, INFO_FL_CTRL | INFO_FL_CLEAR(v4l2_control, id)), IOCTL_INFO(VIDIOC_S_CTRL, v4l_s_ctrl, v4l_print_control, INFO_FL_PRIO | INFO_FL_CTRL), IOCTL_INFO(VIDIOC_G_TUNER, v4l_g_tuner, v4l_print_tuner, 
INFO_FL_CLEAR(v4l2_tuner, index)), IOCTL_INFO(VIDIOC_S_TUNER, v4l_s_tuner, v4l_print_tuner, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_G_AUDIO, v4l_stub_g_audio, v4l_print_audio, 0), IOCTL_INFO(VIDIOC_S_AUDIO, v4l_stub_s_audio, v4l_print_audio, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_QUERYCTRL, v4l_queryctrl, v4l_print_queryctrl, INFO_FL_CTRL | INFO_FL_CLEAR(v4l2_queryctrl, id)), IOCTL_INFO(VIDIOC_QUERYMENU, v4l_querymenu, v4l_print_querymenu, INFO_FL_CTRL | INFO_FL_CLEAR(v4l2_querymenu, index)), IOCTL_INFO(VIDIOC_G_INPUT, v4l_g_input, v4l_print_u32, 0), IOCTL_INFO(VIDIOC_S_INPUT, v4l_s_input, v4l_print_u32, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_G_EDID, v4l_stub_g_edid, v4l_print_edid, INFO_FL_ALWAYS_COPY), IOCTL_INFO(VIDIOC_S_EDID, v4l_stub_s_edid, v4l_print_edid, INFO_FL_PRIO | INFO_FL_ALWAYS_COPY), IOCTL_INFO(VIDIOC_G_OUTPUT, v4l_g_output, v4l_print_u32, 0), IOCTL_INFO(VIDIOC_S_OUTPUT, v4l_s_output, v4l_print_u32, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_ENUMOUTPUT, v4l_enumoutput, v4l_print_enumoutput, INFO_FL_CLEAR(v4l2_output, index)), IOCTL_INFO(VIDIOC_G_AUDOUT, v4l_stub_g_audout, v4l_print_audioout, 0), IOCTL_INFO(VIDIOC_S_AUDOUT, v4l_stub_s_audout, v4l_print_audioout, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_G_MODULATOR, v4l_g_modulator, v4l_print_modulator, INFO_FL_CLEAR(v4l2_modulator, index)), IOCTL_INFO(VIDIOC_S_MODULATOR, v4l_s_modulator, v4l_print_modulator, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_G_FREQUENCY, v4l_g_frequency, v4l_print_frequency, INFO_FL_CLEAR(v4l2_frequency, tuner)), IOCTL_INFO(VIDIOC_S_FREQUENCY, v4l_s_frequency, v4l_print_frequency, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_CROPCAP, v4l_cropcap, v4l_print_cropcap, INFO_FL_CLEAR(v4l2_cropcap, type)), IOCTL_INFO(VIDIOC_G_CROP, v4l_g_crop, v4l_print_crop, INFO_FL_CLEAR(v4l2_crop, type)), IOCTL_INFO(VIDIOC_S_CROP, v4l_s_crop, v4l_print_crop, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_G_SELECTION, v4l_g_selection, v4l_print_selection, INFO_FL_CLEAR(v4l2_selection, r)), IOCTL_INFO(VIDIOC_S_SELECTION, v4l_s_selection, v4l_print_selection, INFO_FL_PRIO | INFO_FL_CLEAR(v4l2_selection, r)), IOCTL_INFO(VIDIOC_G_JPEGCOMP, v4l_stub_g_jpegcomp, v4l_print_jpegcompression, 0), IOCTL_INFO(VIDIOC_S_JPEGCOMP, v4l_stub_s_jpegcomp, v4l_print_jpegcompression, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_QUERYSTD, v4l_querystd, v4l_print_std, 0), IOCTL_INFO(VIDIOC_TRY_FMT, v4l_try_fmt, v4l_print_format, 0), IOCTL_INFO(VIDIOC_ENUMAUDIO, v4l_stub_enumaudio, v4l_print_audio, INFO_FL_CLEAR(v4l2_audio, index)), IOCTL_INFO(VIDIOC_ENUMAUDOUT, v4l_stub_enumaudout, v4l_print_audioout, INFO_FL_CLEAR(v4l2_audioout, index)), IOCTL_INFO(VIDIOC_G_PRIORITY, v4l_g_priority, v4l_print_u32, 0), IOCTL_INFO(VIDIOC_S_PRIORITY, v4l_s_priority, v4l_print_u32, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_G_SLICED_VBI_CAP, v4l_g_sliced_vbi_cap, v4l_print_sliced_vbi_cap, INFO_FL_CLEAR(v4l2_sliced_vbi_cap, type)), IOCTL_INFO(VIDIOC_LOG_STATUS, v4l_log_status, v4l_print_newline, 0), IOCTL_INFO(VIDIOC_G_EXT_CTRLS, v4l_g_ext_ctrls, v4l_print_ext_controls, INFO_FL_CTRL | INFO_FL_ALWAYS_COPY), IOCTL_INFO(VIDIOC_S_EXT_CTRLS, v4l_s_ext_ctrls, v4l_print_ext_controls, INFO_FL_PRIO | INFO_FL_CTRL | INFO_FL_ALWAYS_COPY), IOCTL_INFO(VIDIOC_TRY_EXT_CTRLS, v4l_try_ext_ctrls, v4l_print_ext_controls, INFO_FL_CTRL | INFO_FL_ALWAYS_COPY), IOCTL_INFO(VIDIOC_ENUM_FRAMESIZES, v4l_stub_enum_framesizes, v4l_print_frmsizeenum, INFO_FL_CLEAR(v4l2_frmsizeenum, pixel_format)), IOCTL_INFO(VIDIOC_ENUM_FRAMEINTERVALS, v4l_stub_enum_frameintervals, v4l_print_frmivalenum, INFO_FL_CLEAR(v4l2_frmivalenum, height)), IOCTL_INFO(VIDIOC_G_ENC_INDEX, v4l_stub_g_enc_index, 
v4l_print_enc_idx, 0), IOCTL_INFO(VIDIOC_ENCODER_CMD, v4l_stub_encoder_cmd, v4l_print_encoder_cmd, INFO_FL_PRIO | INFO_FL_CLEAR(v4l2_encoder_cmd, flags)), IOCTL_INFO(VIDIOC_TRY_ENCODER_CMD, v4l_stub_try_encoder_cmd, v4l_print_encoder_cmd, INFO_FL_CLEAR(v4l2_encoder_cmd, flags)), IOCTL_INFO(VIDIOC_DECODER_CMD, v4l_stub_decoder_cmd, v4l_print_decoder_cmd, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_TRY_DECODER_CMD, v4l_stub_try_decoder_cmd, v4l_print_decoder_cmd, 0), IOCTL_INFO(VIDIOC_DBG_S_REGISTER, v4l_dbg_s_register, v4l_print_dbg_register, 0), IOCTL_INFO(VIDIOC_DBG_G_REGISTER, v4l_dbg_g_register, v4l_print_dbg_register, 0), IOCTL_INFO(VIDIOC_S_HW_FREQ_SEEK, v4l_s_hw_freq_seek, v4l_print_hw_freq_seek, INFO_FL_PRIO), IOCTL_INFO(VIDIOC_S_DV_TIMINGS, v4l_stub_s_dv_timings, v4l_print_dv_timings, INFO_FL_PRIO | INFO_FL_CLEAR(v4l2_dv_timings, bt.flags)), IOCTL_INFO(VIDIOC_G_DV_TIMINGS, v4l_stub_g_dv_timings, v4l_print_dv_timings, 0), IOCTL_INFO(VIDIOC_DQEVENT, v4l_dqevent, v4l_print_event, 0), IOCTL_INFO(VIDIOC_SUBSCRIBE_EVENT, v4l_subscribe_event, v4l_print_event_subscription, 0), IOCTL_INFO(VIDIOC_UNSUBSCRIBE_EVENT, v4l_unsubscribe_event, v4l_print_event_subscription, 0), IOCTL_INFO(VIDIOC_CREATE_BUFS, v4l_create_bufs, v4l_print_create_buffers, INFO_FL_PRIO | INFO_FL_QUEUE), IOCTL_INFO(VIDIOC_PREPARE_BUF, v4l_prepare_buf, v4l_print_buffer, INFO_FL_QUEUE), IOCTL_INFO(VIDIOC_ENUM_DV_TIMINGS, v4l_stub_enum_dv_timings, v4l_print_enum_dv_timings, INFO_FL_CLEAR(v4l2_enum_dv_timings, pad)), IOCTL_INFO(VIDIOC_QUERY_DV_TIMINGS, v4l_stub_query_dv_timings, v4l_print_dv_timings, INFO_FL_ALWAYS_COPY), IOCTL_INFO(VIDIOC_DV_TIMINGS_CAP, v4l_stub_dv_timings_cap, v4l_print_dv_timings_cap, INFO_FL_CLEAR(v4l2_dv_timings_cap, pad)), IOCTL_INFO(VIDIOC_ENUM_FREQ_BANDS, v4l_enum_freq_bands, v4l_print_freq_band, 0), IOCTL_INFO(VIDIOC_DBG_G_CHIP_INFO, v4l_dbg_g_chip_info, v4l_print_dbg_chip_info, INFO_FL_CLEAR(v4l2_dbg_chip_info, match)), IOCTL_INFO(VIDIOC_QUERY_EXT_CTRL, v4l_query_ext_ctrl, v4l_print_query_ext_ctrl, INFO_FL_CTRL | INFO_FL_CLEAR(v4l2_query_ext_ctrl, id)), IOCTL_INFO(VIDIOC_REMOVE_BUFS, v4l_remove_bufs, v4l_print_remove_buffers, INFO_FL_PRIO | INFO_FL_QUEUE | INFO_FL_CLEAR(v4l2_remove_buffers, type)), }; #define V4L2_IOCTLS ARRAY_SIZE(v4l2_ioctls) static bool v4l2_is_known_ioctl(unsigned int cmd) { if (_IOC_NR(cmd) >= V4L2_IOCTLS) return false; return v4l2_ioctls[_IOC_NR(cmd)].ioctl == cmd; } static struct mutex *v4l2_ioctl_get_lock(struct video_device *vdev, struct v4l2_fh *vfh, unsigned int cmd, void *arg) { if (_IOC_NR(cmd) >= V4L2_IOCTLS) return vdev->lock; if (vfh && vfh->m2m_ctx && (v4l2_ioctls[_IOC_NR(cmd)].flags & INFO_FL_QUEUE)) { if (vfh->m2m_ctx->q_lock) return vfh->m2m_ctx->q_lock; } if (vdev->queue && vdev->queue->lock && (v4l2_ioctls[_IOC_NR(cmd)].flags & INFO_FL_QUEUE)) return vdev->queue->lock; return vdev->lock; } /* Common ioctl debug function. 
This function can be used by external ioctl messages as well as internal V4L ioctl */ void v4l_printk_ioctl(const char *prefix, unsigned int cmd) { const char *dir, *type; if (prefix) printk(KERN_DEBUG "%s: ", prefix); switch (_IOC_TYPE(cmd)) { case 'd': type = "v4l2_int"; break; case 'V': if (!v4l2_is_known_ioctl(cmd)) { type = "v4l2"; break; } pr_cont("%s", v4l2_ioctls[_IOC_NR(cmd)].name); return; default: type = "unknown"; break; } switch (_IOC_DIR(cmd)) { case _IOC_NONE: dir = "--"; break; case _IOC_READ: dir = "r-"; break; case _IOC_WRITE: dir = "-w"; break; case _IOC_READ | _IOC_WRITE: dir = "rw"; break; default: dir = "*ERR*"; break; } pr_cont("%s ioctl '%c', dir=%s, #%d (0x%08x)", type, _IOC_TYPE(cmd), dir, _IOC_NR(cmd), cmd); } EXPORT_SYMBOL(v4l_printk_ioctl); static long __video_do_ioctl(struct file *file, unsigned int cmd, void *arg) { struct video_device *vfd = video_devdata(file); struct mutex *req_queue_lock = NULL; struct mutex *lock; /* ioctl serialization mutex */ const struct v4l2_ioctl_ops *ops = vfd->ioctl_ops; bool write_only = false; struct v4l2_ioctl_info default_info; const struct v4l2_ioctl_info *info; void *fh = file->private_data; struct v4l2_fh *vfh = NULL; int dev_debug = vfd->dev_debug; long ret = -ENOTTY; if (ops == NULL) { pr_warn("%s: has no ioctl_ops.\n", video_device_node_name(vfd)); return ret; } if (test_bit(V4L2_FL_USES_V4L2_FH, &vfd->flags)) vfh = file->private_data; /* * We need to serialize streamon/off with queueing new requests. * These ioctls may trigger the cancellation of a streaming * operation, and that should not be mixed with queueing a new * request at the same time. */ if (v4l2_device_supports_requests(vfd->v4l2_dev) && (cmd == VIDIOC_STREAMON || cmd == VIDIOC_STREAMOFF)) { req_queue_lock = &vfd->v4l2_dev->mdev->req_queue_mutex; if (mutex_lock_interruptible(req_queue_lock)) return -ERESTARTSYS; } lock = v4l2_ioctl_get_lock(vfd, vfh, cmd, arg); if (lock && mutex_lock_interruptible(lock)) { if (req_queue_lock) mutex_unlock(req_queue_lock); return -ERESTARTSYS; } if (!video_is_registered(vfd)) { ret = -ENODEV; goto unlock; } if (v4l2_is_known_ioctl(cmd)) { info = &v4l2_ioctls[_IOC_NR(cmd)]; if (!is_valid_ioctl(vfd, cmd) && !((info->flags & INFO_FL_CTRL) && vfh && vfh->ctrl_handler)) goto done; if (vfh && (info->flags & INFO_FL_PRIO)) { ret = v4l2_prio_check(vfd->prio, vfh->prio); if (ret) goto done; } } else { default_info.ioctl = cmd; default_info.flags = 0; default_info.debug = v4l_print_default; info = &default_info; } write_only = _IOC_DIR(cmd) == _IOC_WRITE; if (info != &default_info) { ret = info->func(ops, file, fh, arg); } else if (!ops->vidioc_default) { ret = -ENOTTY; } else { ret = ops->vidioc_default(file, fh, vfh ? 
v4l2_prio_check(vfd->prio, vfh->prio) >= 0 : 0, cmd, arg); } done: if (dev_debug & (V4L2_DEV_DEBUG_IOCTL | V4L2_DEV_DEBUG_IOCTL_ARG)) { if (!(dev_debug & V4L2_DEV_DEBUG_STREAMING) && (cmd == VIDIOC_QBUF || cmd == VIDIOC_DQBUF)) goto unlock; v4l_printk_ioctl(video_device_node_name(vfd), cmd); if (ret < 0) pr_cont(": error %ld", ret); if (!(dev_debug & V4L2_DEV_DEBUG_IOCTL_ARG)) pr_cont("\n"); else if (_IOC_DIR(cmd) == _IOC_NONE) info->debug(arg, write_only); else { pr_cont(": "); info->debug(arg, write_only); } } unlock: if (lock) mutex_unlock(lock); if (req_queue_lock) mutex_unlock(req_queue_lock); return ret; } static int check_array_args(unsigned int cmd, void *parg, size_t *array_size, void __user **user_ptr, void ***kernel_ptr) { int ret = 0; switch (cmd) { case VIDIOC_PREPARE_BUF: case VIDIOC_QUERYBUF: case VIDIOC_QBUF: case VIDIOC_DQBUF: { struct v4l2_buffer *buf = parg; if (V4L2_TYPE_IS_MULTIPLANAR(buf->type) && buf->length > 0) { if (buf->length > VIDEO_MAX_PLANES) { ret = -EINVAL; break; } *user_ptr = (void __user *)buf->m.planes; *kernel_ptr = (void **)&buf->m.planes; *array_size = sizeof(struct v4l2_plane) * buf->length; ret = 1; } break; } case VIDIOC_G_EDID: case VIDIOC_S_EDID: { struct v4l2_edid *edid = parg; if (edid->blocks) { if (edid->blocks > 256) { ret = -EINVAL; break; } *user_ptr = (void __user *)edid->edid; *kernel_ptr = (void **)&edid->edid; *array_size = edid->blocks * 128; ret = 1; } break; } case VIDIOC_S_EXT_CTRLS: case VIDIOC_G_EXT_CTRLS: case VIDIOC_TRY_EXT_CTRLS: { struct v4l2_ext_controls *ctrls = parg; if (ctrls->count != 0) { if (ctrls->count > V4L2_CID_MAX_CTRLS) { ret = -EINVAL; break; } *user_ptr = (void __user *)ctrls->controls; *kernel_ptr = (void **)&ctrls->controls; *array_size = sizeof(struct v4l2_ext_control) * ctrls->count; ret = 1; } break; } case VIDIOC_SUBDEV_G_ROUTING: case VIDIOC_SUBDEV_S_ROUTING: { struct v4l2_subdev_routing *routing = parg; if (routing->len_routes > 256) return -E2BIG; *user_ptr = u64_to_user_ptr(routing->routes); *kernel_ptr = (void **)&routing->routes; *array_size = sizeof(struct v4l2_subdev_route) * routing->len_routes; ret = 1; break; } } return ret; } static unsigned int video_translate_cmd(unsigned int cmd) { #if !defined(CONFIG_64BIT) && defined(CONFIG_COMPAT_32BIT_TIME) switch (cmd) { case VIDIOC_DQEVENT_TIME32: return VIDIOC_DQEVENT; case VIDIOC_QUERYBUF_TIME32: return VIDIOC_QUERYBUF; case VIDIOC_QBUF_TIME32: return VIDIOC_QBUF; case VIDIOC_DQBUF_TIME32: return VIDIOC_DQBUF; case VIDIOC_PREPARE_BUF_TIME32: return VIDIOC_PREPARE_BUF; } #endif if (in_compat_syscall()) return v4l2_compat_translate_cmd(cmd); return cmd; } static int video_get_user(void __user *arg, void *parg, unsigned int real_cmd, unsigned int cmd, bool *always_copy) { unsigned int n = _IOC_SIZE(real_cmd); int err = 0; if (!(_IOC_DIR(cmd) & _IOC_WRITE)) { /* read-only ioctl */ memset(parg, 0, n); return 0; } /* * In some cases, only a few fields are used as input, * i.e. when the app sets "index" and then the driver * fills in the rest of the structure for the thing * with that index. We only need to copy up the first * non-input field. 
*/ if (v4l2_is_known_ioctl(real_cmd)) { u32 flags = v4l2_ioctls[_IOC_NR(real_cmd)].flags; if (flags & INFO_FL_CLEAR_MASK) n = (flags & INFO_FL_CLEAR_MASK) >> 16; *always_copy = flags & INFO_FL_ALWAYS_COPY; } if (cmd == real_cmd) { if (copy_from_user(parg, (void __user *)arg, n)) err = -EFAULT; } else if (in_compat_syscall()) { memset(parg, 0, n); err = v4l2_compat_get_user(arg, parg, cmd); } else { memset(parg, 0, n); #if !defined(CONFIG_64BIT) && defined(CONFIG_COMPAT_32BIT_TIME) switch (cmd) { case VIDIOC_QUERYBUF_TIME32: case VIDIOC_QBUF_TIME32: case VIDIOC_DQBUF_TIME32: case VIDIOC_PREPARE_BUF_TIME32: { struct v4l2_buffer_time32 vb32; struct v4l2_buffer *vb = parg; if (copy_from_user(&vb32, arg, sizeof(vb32))) return -EFAULT; *vb = (struct v4l2_buffer) { .index = vb32.index, .type = vb32.type, .bytesused = vb32.bytesused, .flags = vb32.flags, .field = vb32.field, .timestamp.tv_sec = vb32.timestamp.tv_sec, .timestamp.tv_usec = vb32.timestamp.tv_usec, .timecode = vb32.timecode, .sequence = vb32.sequence, .memory = vb32.memory, .m.userptr = vb32.m.userptr, .length = vb32.length, .request_fd = vb32.request_fd, }; break; } } #endif } /* zero out anything we don't copy from userspace */ if (!err && n < _IOC_SIZE(real_cmd)) memset((u8 *)parg + n, 0, _IOC_SIZE(real_cmd) - n); return err; } static int video_put_user(void __user *arg, void *parg, unsigned int real_cmd, unsigned int cmd) { if (!(_IOC_DIR(cmd) & _IOC_READ)) return 0; if (cmd == real_cmd) { /* Copy results into user buffer */ if (copy_to_user(arg, parg, _IOC_SIZE(cmd))) return -EFAULT; return 0; } if (in_compat_syscall()) return v4l2_compat_put_user(arg, parg, cmd); #if !defined(CONFIG_64BIT) && defined(CONFIG_COMPAT_32BIT_TIME) switch (cmd) { case VIDIOC_DQEVENT_TIME32: { struct v4l2_event *ev = parg; struct v4l2_event_time32 ev32; memset(&ev32, 0, sizeof(ev32)); ev32.type = ev->type; ev32.pending = ev->pending; ev32.sequence = ev->sequence; ev32.timestamp.tv_sec = ev->timestamp.tv_sec; ev32.timestamp.tv_nsec = ev->timestamp.tv_nsec; ev32.id = ev->id; memcpy(&ev32.u, &ev->u, sizeof(ev->u)); memcpy(&ev32.reserved, &ev->reserved, sizeof(ev->reserved)); if (copy_to_user(arg, &ev32, sizeof(ev32))) return -EFAULT; break; } case VIDIOC_QUERYBUF_TIME32: case VIDIOC_QBUF_TIME32: case VIDIOC_DQBUF_TIME32: case VIDIOC_PREPARE_BUF_TIME32: { struct v4l2_buffer *vb = parg; struct v4l2_buffer_time32 vb32; memset(&vb32, 0, sizeof(vb32)); vb32.index = vb->index; vb32.type = vb->type; vb32.bytesused = vb->bytesused; vb32.flags = vb->flags; vb32.field = vb->field; vb32.timestamp.tv_sec = vb->timestamp.tv_sec; vb32.timestamp.tv_usec = vb->timestamp.tv_usec; vb32.timecode = vb->timecode; vb32.sequence = vb->sequence; vb32.memory = vb->memory; vb32.m.userptr = vb->m.userptr; vb32.length = vb->length; vb32.request_fd = vb->request_fd; if (copy_to_user(arg, &vb32, sizeof(vb32))) return -EFAULT; break; } } #endif return 0; } long video_usercopy(struct file *file, unsigned int orig_cmd, unsigned long arg, v4l2_kioctl func) { char sbuf[128]; void *mbuf = NULL, *array_buf = NULL; void *parg = (void *)arg; long err = -EINVAL; bool has_array_args; bool always_copy = false; size_t array_size = 0; void __user *user_ptr = NULL; void **kernel_ptr = NULL; unsigned int cmd = video_translate_cmd(orig_cmd); const size_t ioc_size = _IOC_SIZE(cmd); /* Copy arguments into temp kernel buffer */ if (_IOC_DIR(cmd) != _IOC_NONE) { if (ioc_size <= sizeof(sbuf)) { parg = sbuf; } else { /* too big to allocate from stack */ mbuf = kmalloc(ioc_size, GFP_KERNEL); if (NULL == 
mbuf) return -ENOMEM; parg = mbuf; } err = video_get_user((void __user *)arg, parg, cmd, orig_cmd, &always_copy); if (err) goto out; } err = check_array_args(cmd, parg, &array_size, &user_ptr, &kernel_ptr); if (err < 0) goto out; has_array_args = err; if (has_array_args) { array_buf = kvmalloc(array_size, GFP_KERNEL); err = -ENOMEM; if (array_buf == NULL) goto out; if (in_compat_syscall()) err = v4l2_compat_get_array_args(file, array_buf, user_ptr, array_size, orig_cmd, parg); else err = copy_from_user(array_buf, user_ptr, array_size) ? -EFAULT : 0; if (err) goto out; *kernel_ptr = array_buf; } /* Handles IOCTL */ err = func(file, cmd, parg); if (err == -ENOTTY || err == -ENOIOCTLCMD) { err = -ENOTTY; goto out; } if (err == 0) { if (cmd == VIDIOC_DQBUF) trace_v4l2_dqbuf(video_devdata(file)->minor, parg); else if (cmd == VIDIOC_QBUF) trace_v4l2_qbuf(video_devdata(file)->minor, parg); } /* * Some ioctls can return an error, but still have valid * results that must be returned. * * FIXME: subdev IOCTLS are partially handled here and partially in * v4l2-subdev.c and the 'always_copy' flag can only be set for IOCTLS * defined here as part of the 'v4l2_ioctls' array. As * VIDIOC_SUBDEV_[GS]_ROUTING needs to return results to applications * even in case of failure, but it is not defined here as part of the * 'v4l2_ioctls' array, insert an ad-hoc check to address that. */ if (cmd == VIDIOC_SUBDEV_G_ROUTING || cmd == VIDIOC_SUBDEV_S_ROUTING) always_copy = true; if (err < 0 && !always_copy) goto out; if (has_array_args) { *kernel_ptr = (void __force *)user_ptr; if (in_compat_syscall()) { int put_err; put_err = v4l2_compat_put_array_args(file, user_ptr, array_buf, array_size, orig_cmd, parg); if (put_err) err = put_err; } else if (copy_to_user(user_ptr, array_buf, array_size)) { err = -EFAULT; } } if (video_put_user((void __user *)arg, parg, cmd, orig_cmd)) err = -EFAULT; out: kvfree(array_buf); kfree(mbuf); return err; } long video_ioctl2(struct file *file, unsigned int cmd, unsigned long arg) { return video_usercopy(file, cmd, arg, __video_do_ioctl); } EXPORT_SYMBOL(video_ioctl2); |
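
Everything in the dispatch table above is only reachable once a driver points its file operations at video_ioctl2(). A minimal sketch of that wiring, assuming a hypothetical "skeleton" driver — skel_querycap and the ops/fops names are illustrative, not part of this file:

#include <linux/module.h>
#include <linux/string.h>
#include <media/v4l2-dev.h>
#include <media/v4l2-fh.h>
#include <media/v4l2-ioctl.h>

/* Hypothetical driver callback; dispatched via v4l_querycap() above. */
static int skel_querycap(struct file *file, void *fh,
			 struct v4l2_capability *cap)
{
	strscpy(cap->driver, "skeleton", sizeof(cap->driver));
	strscpy(cap->card, "V4L2 skeleton board", sizeof(cap->card));
	return 0;
}

static const struct v4l2_ioctl_ops skel_ioctl_ops = {
	.vidioc_querycap = skel_querycap,
	/* any vidioc_* pointer left NULL makes the core return -ENOTTY */
};

static const struct v4l2_file_operations skel_fops = {
	.owner		= THIS_MODULE,
	.open		= v4l2_fh_open,
	.release	= v4l2_fh_release,
	.unlocked_ioctl	= video_ioctl2,	/* every ioctl funnels through the code above */
};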
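
For the array arguments handled by check_array_args(), the matching userspace view is an ext-controls call whose controls pointer is copied in and back out around the handler. A hedged sketch — it assumes an already-open V4L2 file descriptor and trims error handling:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

int get_brightness(int fd, int *value)
{
	struct v4l2_ext_control ctrl = { .id = V4L2_CID_BRIGHTNESS };
	struct v4l2_ext_controls ctrls = {
		.which = V4L2_CTRL_WHICH_CUR_VAL,
		.count = 1,
		.controls = &ctrl,	/* copied in/out by check_array_args() */
	};

	if (ioctl(fd, VIDIOC_G_EXT_CTRLS, &ctrls) < 0)
		return -1;	/* INFO_FL_ALWAYS_COPY: ctrls.error_idx is valid even here */

	*value = ctrl.value;
	return 0;
}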
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * kernel/workqueue_internal.h
 *
 * Workqueue internal header file.  Only to be included by workqueue and
 * core kernel subsystems.
 */
#ifndef _KERNEL_WORKQUEUE_INTERNAL_H
#define _KERNEL_WORKQUEUE_INTERNAL_H

#include <linux/workqueue.h>
#include <linux/kthread.h>
#include <linux/preempt.h>

struct worker_pool;

/*
 * The poor guys doing the actual heavy lifting.  All on-duty workers are
 * either serving the manager role, on idle list or on busy hash.  For
 * details on the locking annotation (L, I, X...), refer to workqueue.c.
 *
 * Only to be used in workqueue and async.
 */
struct worker {
	/* on idle list while idle, on busy hash table while busy */
	union {
		struct list_head	entry;	/* L: while idle */
		struct hlist_node	hentry;	/* L: while busy */
	};

	struct work_struct	*current_work;	/* K: work being processed and its */
	work_func_t		current_func;	/* K: function */
	struct pool_workqueue	*current_pwq;	/* K: pwq */
	u64			current_at;	/* K: runtime at start or last wakeup */
	unsigned int		current_color;	/* K: color */

	int			sleeping;	/* S: is worker sleeping? */

	/* used by the scheduler to determine a worker's last known identity */
	work_func_t		last_func;	/* K: last work's fn */

	struct list_head	scheduled;	/* L: scheduled works */

	struct task_struct	*task;		/* I: worker task */
	struct worker_pool	*pool;		/* A: the associated pool */
						/* L: for rescuers */
	struct list_head	node;		/* A: anchored at pool->workers */
						/* A: runs through worker->node */

	unsigned long		last_active;	/* K: last active timestamp */
	unsigned int		flags;		/* L: flags */
	int			id;		/* I: worker id */

	/*
	 * Opaque string set with set_worker_desc().  Printed out with task
	 * dump for debugging - WARN, BUG, panic or sysrq.
	 */
	char			desc[WORKER_DESC_LEN];

	/* used only by rescuers to point to the target workqueue */
	struct workqueue_struct	*rescue_wq;	/* I: the workqueue to rescue */
};

/**
 * current_wq_worker - return struct worker if %current is a workqueue worker
 */
static inline struct worker *current_wq_worker(void)
{
	if (in_task() && (current->flags & PF_WQ_WORKER))
		return kthread_data(current);
	return NULL;
}

/*
 * Scheduler hooks for concurrency managed workqueue.  Only to be used from
 * sched/ and workqueue.c.
 */
void wq_worker_running(struct task_struct *task);
void wq_worker_sleeping(struct task_struct *task);
void wq_worker_tick(struct task_struct *task);
work_func_t wq_worker_last_func(struct task_struct *task);

#endif /* _KERNEL_WORKQUEUE_INTERNAL_H */
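
The desc[] field above is what set_worker_desc() fills in. A sketch of how a work handler might use it so the string shows up in task dumps — the work item and its field names are hypothetical:

#include <linux/workqueue.h>

struct frob_request {
	struct work_struct work;
	int id;				/* hypothetical request id */
};

static void frob_work_fn(struct work_struct *work)
{
	struct frob_request *req = container_of(work, struct frob_request, work);

	/*
	 * Stored in the executing worker's desc[] above; shows up in the
	 * task dump on WARN/BUG/panic/sysrq while this item runs.
	 */
	set_worker_desc("frob/%d", req->id);

	/* ... do the actual work ... */
}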
// SPDX-License-Identifier: GPL-2.0-only
/* (C) 1999-2001 Paul `Rusty' Russell
 * (C) 2002-2004 Netfilter Core Team <coreteam@netfilter.org>
 */

#include <linux/types.h>
#include <linux/ip.h>
#include <linux/netfilter.h>
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/skbuff.h>
#include <net/netns/generic.h>
#include <net/route.h>
#include <net/ip.h>

#include <linux/netfilter_bridge.h>
#include <linux/netfilter_ipv4.h>
#include <net/netfilter/ipv4/nf_defrag_ipv4.h>
#if IS_ENABLED(CONFIG_NF_CONNTRACK)
#include <net/netfilter/nf_conntrack.h>
#endif
#include <net/netfilter/nf_conntrack_zones.h>

static DEFINE_MUTEX(defrag4_mutex);

static int nf_ct_ipv4_gather_frags(struct net *net, struct sk_buff *skb,
				   u_int32_t user)
{
	int err;

	local_bh_disable();
	err = ip_defrag(net, skb, user);
	local_bh_enable();

	if (!err)
		skb->ignore_df = 1;

	return err;
}

static enum ip_defrag_users nf_ct_defrag_user(unsigned int hooknum,
					      struct sk_buff *skb)
{
	u16 zone_id = NF_CT_DEFAULT_ZONE_ID;
#if IS_ENABLED(CONFIG_NF_CONNTRACK)
	if (skb_nfct(skb)) {
		enum ip_conntrack_info ctinfo;
		const struct nf_conn *ct = nf_ct_get(skb, &ctinfo);

		zone_id = nf_ct_zone_id(nf_ct_zone(ct), CTINFO2DIR(ctinfo));
	}
#endif
	if (nf_bridge_in_prerouting(skb))
		return IP_DEFRAG_CONNTRACK_BRIDGE_IN + zone_id;

	if (hooknum == NF_INET_PRE_ROUTING)
		return IP_DEFRAG_CONNTRACK_IN + zone_id;
	else
		return IP_DEFRAG_CONNTRACK_OUT + zone_id;
}

static unsigned int ipv4_conntrack_defrag(void *priv,
					  struct sk_buff *skb,
					  const struct nf_hook_state *state)
{
	struct sock *sk = skb->sk;

	if (sk && sk_fullsock(sk) && (sk->sk_family == PF_INET) &&
	    inet_test_bit(NODEFRAG, sk))
		return NF_ACCEPT;

#if IS_ENABLED(CONFIG_NF_CONNTRACK)
#if !IS_ENABLED(CONFIG_NF_NAT)
	/* Previously seen (loopback)?  Ignore.  Do this before fragment check. */
	if (skb_nfct(skb) && !nf_ct_is_template((struct nf_conn *)skb_nfct(skb)))
		return NF_ACCEPT;
#endif
	if (skb->_nfct == IP_CT_UNTRACKED)
		return NF_ACCEPT;
#endif
	/* Gather fragments. */
	if (ip_is_fragment(ip_hdr(skb))) {
		enum ip_defrag_users user =
			nf_ct_defrag_user(state->hook, skb);

		if (nf_ct_ipv4_gather_frags(state->net, skb, user))
			return NF_STOLEN;
	}
	return NF_ACCEPT;
}

static const struct nf_hook_ops ipv4_defrag_ops[] = {
	{
		.hook		= ipv4_conntrack_defrag,
		.pf		= NFPROTO_IPV4,
		.hooknum	= NF_INET_PRE_ROUTING,
		.priority	= NF_IP_PRI_CONNTRACK_DEFRAG,
	},
	{
		.hook		= ipv4_conntrack_defrag,
		.pf		= NFPROTO_IPV4,
		.hooknum	= NF_INET_LOCAL_OUT,
		.priority	= NF_IP_PRI_CONNTRACK_DEFRAG,
	},
};

static void __net_exit defrag4_net_exit(struct net *net)
{
	if (net->nf.defrag_ipv4_users) {
		nf_unregister_net_hooks(net, ipv4_defrag_ops,
					ARRAY_SIZE(ipv4_defrag_ops));
		net->nf.defrag_ipv4_users = 0;
	}
}

static const struct nf_defrag_hook defrag_hook = {
	.owner = THIS_MODULE,
	.enable = nf_defrag_ipv4_enable,
	.disable = nf_defrag_ipv4_disable,
};

static struct pernet_operations defrag4_net_ops = {
	.exit = defrag4_net_exit,
};

static int __init nf_defrag_init(void)
{
	int err;

	err = register_pernet_subsys(&defrag4_net_ops);
	if (err)
		return err;

	rcu_assign_pointer(nf_defrag_v4_hook, &defrag_hook);
	return err;
}

static void __exit nf_defrag_fini(void)
{
	rcu_assign_pointer(nf_defrag_v4_hook, NULL);
	unregister_pernet_subsys(&defrag4_net_ops);
}

int nf_defrag_ipv4_enable(struct net *net)
{
	int err = 0;

	mutex_lock(&defrag4_mutex);
	if (net->nf.defrag_ipv4_users == UINT_MAX) {
		err = -EOVERFLOW;
		goto out_unlock;
	}

	if (net->nf.defrag_ipv4_users) {
		net->nf.defrag_ipv4_users++;
		goto out_unlock;
	}

	err = nf_register_net_hooks(net, ipv4_defrag_ops,
				    ARRAY_SIZE(ipv4_defrag_ops));
	if (err == 0)
		net->nf.defrag_ipv4_users = 1;

 out_unlock:
	mutex_unlock(&defrag4_mutex);
	return err;
}
EXPORT_SYMBOL_GPL(nf_defrag_ipv4_enable);

void nf_defrag_ipv4_disable(struct net *net)
{
	mutex_lock(&defrag4_mutex);
	if (net->nf.defrag_ipv4_users) {
		net->nf.defrag_ipv4_users--;
		if (net->nf.defrag_ipv4_users == 0)
			nf_unregister_net_hooks(net, ipv4_defrag_ops,
						ARRAY_SIZE(ipv4_defrag_ops));
	}
	mutex_unlock(&defrag4_mutex);
}
EXPORT_SYMBOL_GPL(nf_defrag_ipv4_disable);

module_init(nf_defrag_init);
module_exit(nf_defrag_fini);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("IPv4 defragmentation support");
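
The enable/disable pair above is reference-counted per network namespace so several users (conntrack, TPROXY, ebtables, ...) can share the same hooks. A sketch of how a hypothetical pernet user would take and drop its reference — frob_* names are invented:

#include <net/net_namespace.h>
#include <net/netfilter/ipv4/nf_defrag_ipv4.h>

static int __net_init frob_net_init(struct net *net)
{
	/* the first user registers the hooks; later ones only bump the count */
	return nf_defrag_ipv4_enable(net);
}

static void __net_exit frob_net_exit(struct net *net)
{
	/* hooks are torn down when the last user drops its reference */
	nf_defrag_ipv4_disable(net);
}

static struct pernet_operations frob_net_ops = {
	.init = frob_net_init,
	.exit = frob_net_exit,
};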
// SPDX-License-Identifier: GPL-2.0
// Copyright (c) 2010-2011 EIA Electronics,
//		Kurt Van Dijck <kurt.van.dijck@eia.be>
// Copyright (c) 2010-2011 EIA Electronics,
//		Pieter Beyens <pieter.beyens@eia.be>
// Copyright (c) 2017-2019 Pengutronix,
//		Marc Kleine-Budde <kernel@pengutronix.de>
// Copyright (c) 2017-2019 Pengutronix,
//		Oleksij Rempel <kernel@pengutronix.de>

/* J1939 Address Claiming.
 * Address Claiming in the kernel
 * - keeps track of the AC states of ECUs,
 * - resolves NAME<=>SA taking into account the AC states of ECUs.
 *
 * All Address Claim msgs (including host-originated msg) are processed
 * at the receive path (a sent msg is always received again via CAN echo).
 * As such, the processing of AC msgs is done in the order on which msgs
 * are sent on the bus.
 *
 * This module doesn't send msgs itself (e.g. replies on Address Claims),
 * this is the responsibility of a user space application or daemon.
*/ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/netdevice.h> #include <linux/skbuff.h> #include "j1939-priv.h" static inline name_t j1939_skb_to_name(const struct sk_buff *skb) { return le64_to_cpup((__le64 *)skb->data); } static inline bool j1939_ac_msg_is_request(struct sk_buff *skb) { struct j1939_sk_buff_cb *skcb = j1939_skb_to_cb(skb); int req_pgn; if (skb->len < 3 || skcb->addr.pgn != J1939_PGN_REQUEST) return false; req_pgn = skb->data[0] | (skb->data[1] << 8) | (skb->data[2] << 16); return req_pgn == J1939_PGN_ADDRESS_CLAIMED; } static int j1939_ac_verify_outgoing(struct j1939_priv *priv, struct sk_buff *skb) { struct j1939_sk_buff_cb *skcb = j1939_skb_to_cb(skb); if (skb->len != 8) { netdev_notice(priv->ndev, "tx address claim with dlc %i\n", skb->len); return -EPROTO; } if (skcb->addr.src_name != j1939_skb_to_name(skb)) { netdev_notice(priv->ndev, "tx address claim with different name\n"); return -EPROTO; } if (skcb->addr.sa == J1939_NO_ADDR) { netdev_notice(priv->ndev, "tx address claim with broadcast sa\n"); return -EPROTO; } /* ac must always be a broadcast */ if (skcb->addr.dst_name || skcb->addr.da != J1939_NO_ADDR) { netdev_notice(priv->ndev, "tx address claim with dest, not broadcast\n"); return -EPROTO; } return 0; } int j1939_ac_fixup(struct j1939_priv *priv, struct sk_buff *skb) { struct j1939_sk_buff_cb *skcb = j1939_skb_to_cb(skb); int ret; u8 addr; /* network mgmt: address claiming msgs */ if (skcb->addr.pgn == J1939_PGN_ADDRESS_CLAIMED) { struct j1939_ecu *ecu; ret = j1939_ac_verify_outgoing(priv, skb); /* return both when failure & when successful */ if (ret < 0) return ret; ecu = j1939_ecu_get_by_name(priv, skcb->addr.src_name); if (!ecu) return -ENODEV; if (ecu->addr != skcb->addr.sa) /* hold further traffic for ecu, remove from parent */ j1939_ecu_unmap(ecu); j1939_ecu_put(ecu); } else if (skcb->addr.src_name) { /* assign source address */ addr = j1939_name_to_addr(priv, skcb->addr.src_name); if (!j1939_address_is_unicast(addr) && !j1939_ac_msg_is_request(skb)) { netdev_notice(priv->ndev, "tx drop: invalid sa for name 0x%016llx\n", skcb->addr.src_name); return -EADDRNOTAVAIL; } skcb->addr.sa = addr; } /* assign destination address */ if (skcb->addr.dst_name) { addr = j1939_name_to_addr(priv, skcb->addr.dst_name); if (!j1939_address_is_unicast(addr)) { netdev_notice(priv->ndev, "tx drop: invalid da for name 0x%016llx\n", skcb->addr.dst_name); return -EADDRNOTAVAIL; } skcb->addr.da = addr; } return 0; } static void j1939_ac_process(struct j1939_priv *priv, struct sk_buff *skb) { struct j1939_sk_buff_cb *skcb = j1939_skb_to_cb(skb); struct j1939_ecu *ecu, *prev; name_t name; if (skb->len != 8) { netdev_notice(priv->ndev, "rx address claim with wrong dlc %i\n", skb->len); return; } name = j1939_skb_to_name(skb); skcb->addr.src_name = name; if (!name) { netdev_notice(priv->ndev, "rx address claim without name\n"); return; } if (!j1939_address_is_valid(skcb->addr.sa)) { netdev_notice(priv->ndev, "rx address claim with broadcast sa\n"); return; } write_lock_bh(&priv->lock); /* Few words on the ECU ref counting: * * First we get an ECU handle, either with * j1939_ecu_get_by_name_locked() (increments the ref counter) * or j1939_ecu_create_locked() (initializes an ECU object * with a ref counter of 1). * * j1939_ecu_unmap_locked() will decrement the ref counter, * but only if the ECU was mapped before. So "ecu" still * belongs to us. 
* * j1939_ecu_timer_start() will increment the ref counter * before it starts the timer, so we can put the ecu when * leaving this function. */ ecu = j1939_ecu_get_by_name_locked(priv, name); if (ecu && ecu->addr == skcb->addr.sa) { /* The ISO 11783-5 standard, in "4.5.2 - Address claim * requirements", states: * d) No CF shall begin, or resume, transmission on the * network until 250 ms after it has successfully claimed * an address except when responding to a request for * address-claimed. * * But "Figure 6" and "Figure 7" in "4.5.4.2 - Address-claim * prioritization" show that the CF begins the transmission * after 250 ms from the first AC (address-claimed) message * even if it sends another AC message during that time window * to resolve the address contention with another CF. * * As stated in "4.4.2.3 - Address-claimed message": * In order to successfully claim an address, the CF sending * an address claimed message shall not receive a contending * claim from another CF for at least 250 ms. * * As stated in "4.4.3.2 - NAME management (NM) message": * 1) A commanding CF can * d) request that a CF with a specified NAME transmit * the address-claimed message with its current NAME. * 2) A target CF shall * d) send an address-claimed message in response to a * request for a matching NAME * * Taking the above arguments into account, the 250 ms wait is * requested only during network initialization. * * Do not restart the timer on AC message if both the NAME and * the address match and so if the address has already been * claimed (timer has expired) or the AC message has been sent * to resolve the contention with another CF (timer is still * running). */ goto out_ecu_put; } if (!ecu && j1939_address_is_unicast(skcb->addr.sa)) ecu = j1939_ecu_create_locked(priv, name); if (IS_ERR_OR_NULL(ecu)) goto out_unlock_bh; /* cancel pending (previous) address claim */ j1939_ecu_timer_cancel(ecu); if (j1939_address_is_idle(skcb->addr.sa)) { j1939_ecu_unmap_locked(ecu); goto out_ecu_put; } /* save new addr */ if (ecu->addr != skcb->addr.sa) j1939_ecu_unmap_locked(ecu); ecu->addr = skcb->addr.sa; prev = j1939_ecu_get_by_addr_locked(priv, skcb->addr.sa); if (prev) { if (ecu->name > prev->name) { j1939_ecu_unmap_locked(ecu); j1939_ecu_put(prev); goto out_ecu_put; } else { /* kick prev if less or equal */ j1939_ecu_unmap_locked(prev); j1939_ecu_put(prev); } } j1939_ecu_timer_start(ecu); out_ecu_put: j1939_ecu_put(ecu); out_unlock_bh: write_unlock_bh(&priv->lock); } void j1939_ac_recv(struct j1939_priv *priv, struct sk_buff *skb) { struct j1939_sk_buff_cb *skcb = j1939_skb_to_cb(skb); struct j1939_ecu *ecu; /* network mgmt */ if (skcb->addr.pgn == J1939_PGN_ADDRESS_CLAIMED) { j1939_ac_process(priv, skb); } else if (j1939_address_is_unicast(skcb->addr.sa)) { /* assign source name */ ecu = j1939_ecu_get_by_addr(priv, skcb->addr.sa); if (ecu) { skcb->addr.src_name = ecu->name; j1939_ecu_put(ecu); } } /* assign destination name */ ecu = j1939_ecu_get_by_addr(priv, skcb->addr.da); if (ecu) { skcb->addr.dst_name = ecu->name; j1939_ecu_put(ecu); } } |
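
Since this module never transmits claims itself, the sending side lives in userspace. A rough sketch of a daemon issuing an address claim over the J1939 socket API — the interface name, NAME value, and source address are invented, and all error handling is omitted:

#include <endian.h>
#include <stdint.h>
#include <net/if.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/j1939.h>

/* Claim SA 0x80 on can0 for a made-up NAME; returns the bound socket. */
int claim_address(void)
{
	uint64_t name = 0x1234ULL;		/* hypothetical 64-bit NAME */
	struct sockaddr_can addr = {
		.can_family = AF_CAN,
		.can_ifindex = if_nametoindex("can0"),
		.can_addr.j1939 = {
			.name = name,
			.addr = 0x80,
			.pgn = J1939_NO_PGN,
		},
	};
	uint64_t name_le = htole64(name);	/* NAME goes on the wire little-endian */
	int sock = socket(PF_CAN, SOCK_DGRAM, CAN_J1939);

	bind(sock, (struct sockaddr *)&addr, sizeof(addr));

	/* The claim itself: the 8 NAME bytes, broadcast as PGN 0x0ee00;
	 * the kernel's j1939_ac_verify_outgoing() above checks it on tx. */
	addr.can_addr.j1939.pgn = J1939_PGN_ADDRESS_CLAIMED;
	addr.can_addr.j1939.addr = J1939_NO_ADDR;
	sendto(sock, &name_le, sizeof(name_le), 0,
	       (struct sockaddr *)&addr, sizeof(addr));
	return sock;
}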
// SPDX-License-Identifier: GPL-2.0 /* * SME code for cfg80211 * both driver SME event
handling and the SME implementation * (for nl80211's connect() and wext) * * Copyright 2009 Johannes Berg <johannes@sipsolutions.net> * Copyright (C) 2009, 2020, 2022-2024 Intel Corporation. All rights reserved. * Copyright 2017 Intel Deutschland GmbH */ #include <linux/etherdevice.h> #include <linux/if_arp.h> #include <linux/slab.h> #include <linux/workqueue.h> #include <linux/wireless.h> #include <linux/export.h> #include <net/iw_handler.h> #include <net/cfg80211.h> #include <net/rtnetlink.h> #include "nl80211.h" #include "reg.h" #include "rdev-ops.h" /* * Software SME in cfg80211, using auth/assoc/deauth calls to the * driver. This is for implementing nl80211's connect/disconnect * and wireless extensions (if configured.) */ struct cfg80211_conn { struct cfg80211_connect_params params; /* these are sub-states of the _CONNECTING sme_state */ enum { CFG80211_CONN_SCANNING, CFG80211_CONN_SCAN_AGAIN, CFG80211_CONN_AUTHENTICATE_NEXT, CFG80211_CONN_AUTHENTICATING, CFG80211_CONN_AUTH_FAILED_TIMEOUT, CFG80211_CONN_ASSOCIATE_NEXT, CFG80211_CONN_ASSOCIATING, CFG80211_CONN_ASSOC_FAILED, CFG80211_CONN_ASSOC_FAILED_TIMEOUT, CFG80211_CONN_DEAUTH, CFG80211_CONN_ABANDON, CFG80211_CONN_CONNECTED, } state; u8 bssid[ETH_ALEN], prev_bssid[ETH_ALEN]; const u8 *ie; size_t ie_len; bool auto_auth, prev_bssid_valid; }; static void cfg80211_sme_free(struct wireless_dev *wdev) { if (!wdev->conn) return; kfree(wdev->conn->ie); kfree(wdev->conn); wdev->conn = NULL; } static int cfg80211_conn_scan(struct wireless_dev *wdev) { struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy); struct cfg80211_scan_request *request; int n_channels, err; lockdep_assert_wiphy(wdev->wiphy); if (rdev->scan_req || rdev->scan_msg) return -EBUSY; if (wdev->conn->params.channel) n_channels = 1; else n_channels = ieee80211_get_num_supported_channels(wdev->wiphy); request = kzalloc(sizeof(*request) + sizeof(request->ssids[0]) + sizeof(request->channels[0]) * n_channels, GFP_KERNEL); if (!request) return -ENOMEM; request->n_channels = n_channels; if (wdev->conn->params.channel) { enum nl80211_band band = wdev->conn->params.channel->band; struct ieee80211_supported_band *sband = wdev->wiphy->bands[band]; if (!sband) { kfree(request); return -EINVAL; } request->channels[0] = wdev->conn->params.channel; request->rates[band] = (1 << sband->n_bitrates) - 1; } else { int i = 0, j; enum nl80211_band band; struct ieee80211_supported_band *bands; struct ieee80211_channel *channel; for (band = 0; band < NUM_NL80211_BANDS; band++) { bands = wdev->wiphy->bands[band]; if (!bands) continue; for (j = 0; j < bands->n_channels; j++) { channel = &bands->channels[j]; if (channel->flags & IEEE80211_CHAN_DISABLED) continue; request->channels[i++] = channel; } request->rates[band] = (1 << bands->n_bitrates) - 1; } n_channels = i; } request->n_channels = n_channels; request->ssids = (void *)request + struct_size(request, channels, n_channels); request->n_ssids = 1; memcpy(request->ssids[0].ssid, wdev->conn->params.ssid, wdev->conn->params.ssid_len); request->ssids[0].ssid_len = wdev->conn->params.ssid_len; eth_broadcast_addr(request->bssid); request->wdev = wdev; request->wiphy = &rdev->wiphy; request->scan_start = jiffies; rdev->scan_req = request; err = cfg80211_scan(rdev); if (!err) { wdev->conn->state = CFG80211_CONN_SCANNING; nl80211_send_scan_start(rdev, wdev); dev_hold(wdev->netdev); } else { rdev->scan_req = NULL; kfree(request); } return err; } static int cfg80211_conn_do_work(struct wireless_dev *wdev, enum nl80211_timeout_reason *treason) 
static int cfg80211_conn_do_work(struct wireless_dev *wdev,
				 enum nl80211_timeout_reason *treason)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_connect_params *params;
	struct cfg80211_auth_request auth_req = {};
	struct cfg80211_assoc_request req = {};
	int err;

	lockdep_assert_wiphy(wdev->wiphy);

	if (!wdev->conn)
		return 0;

	params = &wdev->conn->params;

	switch (wdev->conn->state) {
	case CFG80211_CONN_SCANNING:
		/* didn't find it during scan ... */
		return -ENOENT;
	case CFG80211_CONN_SCAN_AGAIN:
		return cfg80211_conn_scan(wdev);
	case CFG80211_CONN_AUTHENTICATE_NEXT:
		if (WARN_ON(!rdev->ops->auth))
			return -EOPNOTSUPP;
		wdev->conn->state = CFG80211_CONN_AUTHENTICATING;
		auth_req.key = params->key;
		auth_req.key_len = params->key_len;
		auth_req.key_idx = params->key_idx;
		auth_req.auth_type = params->auth_type;
		auth_req.bss = cfg80211_get_bss(&rdev->wiphy, params->channel,
						params->bssid,
						params->ssid, params->ssid_len,
						IEEE80211_BSS_TYPE_ESS,
						IEEE80211_PRIVACY_ANY);
		auth_req.link_id = -1;
		err = cfg80211_mlme_auth(rdev, wdev->netdev, &auth_req);
		cfg80211_put_bss(&rdev->wiphy, auth_req.bss);
		return err;
	case CFG80211_CONN_AUTH_FAILED_TIMEOUT:
		*treason = NL80211_TIMEOUT_AUTH;
		return -ENOTCONN;
	case CFG80211_CONN_ASSOCIATE_NEXT:
		if (WARN_ON(!rdev->ops->assoc))
			return -EOPNOTSUPP;
		wdev->conn->state = CFG80211_CONN_ASSOCIATING;
		if (wdev->conn->prev_bssid_valid)
			req.prev_bssid = wdev->conn->prev_bssid;
		req.ie = params->ie;
		req.ie_len = params->ie_len;
		req.use_mfp = params->mfp != NL80211_MFP_NO;
		req.crypto = params->crypto;
		req.flags = params->flags;
		req.ht_capa = params->ht_capa;
		req.ht_capa_mask = params->ht_capa_mask;
		req.vht_capa = params->vht_capa;
		req.vht_capa_mask = params->vht_capa_mask;
		req.link_id = -1;

		req.bss = cfg80211_get_bss(&rdev->wiphy, params->channel,
					   params->bssid,
					   params->ssid, params->ssid_len,
					   IEEE80211_BSS_TYPE_ESS,
					   IEEE80211_PRIVACY_ANY);
		if (!req.bss) {
			err = -ENOENT;
		} else {
			err = cfg80211_mlme_assoc(rdev, wdev->netdev,
						  &req, NULL);
			cfg80211_put_bss(&rdev->wiphy, req.bss);
		}

		if (err)
			cfg80211_mlme_deauth(rdev, wdev->netdev, params->bssid,
					     NULL, 0,
					     WLAN_REASON_DEAUTH_LEAVING,
					     false);
		return err;
	case CFG80211_CONN_ASSOC_FAILED_TIMEOUT:
		*treason = NL80211_TIMEOUT_ASSOC;
		fallthrough;
	case CFG80211_CONN_ASSOC_FAILED:
		cfg80211_mlme_deauth(rdev, wdev->netdev, params->bssid,
				     NULL, 0,
				     WLAN_REASON_DEAUTH_LEAVING, false);
		return -ENOTCONN;
	case CFG80211_CONN_DEAUTH:
		cfg80211_mlme_deauth(rdev, wdev->netdev, params->bssid,
				     NULL, 0,
				     WLAN_REASON_DEAUTH_LEAVING, false);
		fallthrough;
	case CFG80211_CONN_ABANDON:
		/* free directly, disconnected event already sent */
		cfg80211_sme_free(wdev);
		return 0;
	default:
		return 0;
	}
}

void cfg80211_conn_work(struct work_struct *work)
{
	struct cfg80211_registered_device *rdev =
		container_of(work, struct cfg80211_registered_device,
			     conn_work);
	struct wireless_dev *wdev;
	u8 bssid_buf[ETH_ALEN], *bssid = NULL;
	enum nl80211_timeout_reason treason;

	guard(wiphy)(&rdev->wiphy);

	list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
		if (!wdev->netdev)
			continue;

		if (!netif_running(wdev->netdev))
			continue;

		if (!wdev->conn ||
		    wdev->conn->state == CFG80211_CONN_CONNECTED)
			continue;

		if (wdev->conn->params.bssid) {
			memcpy(bssid_buf, wdev->conn->params.bssid, ETH_ALEN);
			bssid = bssid_buf;
		}
		treason = NL80211_TIMEOUT_UNSPECIFIED;
		if (cfg80211_conn_do_work(wdev, &treason)) {
			struct cfg80211_connect_resp_params cr;

			memset(&cr, 0, sizeof(cr));
			cr.status = -1;
			cr.links[0].bssid = bssid;
			cr.timeout_reason = treason;
			__cfg80211_connect_result(wdev->netdev, &cr, false);
		}
	}
}
static void cfg80211_step_auth_next(struct cfg80211_conn *conn,
				    struct cfg80211_bss *bss)
{
	memcpy(conn->bssid, bss->bssid, ETH_ALEN);
	conn->params.bssid = conn->bssid;
	conn->params.channel = bss->channel;
	conn->state = CFG80211_CONN_AUTHENTICATE_NEXT;
}

/* Returned bss is reference counted and must be cleaned up appropriately. */
static struct cfg80211_bss *cfg80211_get_conn_bss(struct wireless_dev *wdev)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_bss *bss;

	lockdep_assert_wiphy(wdev->wiphy);

	bss = cfg80211_get_bss(wdev->wiphy, wdev->conn->params.channel,
			       wdev->conn->params.bssid,
			       wdev->conn->params.ssid,
			       wdev->conn->params.ssid_len,
			       wdev->conn_bss_type,
			       IEEE80211_PRIVACY(wdev->conn->params.privacy));
	if (!bss)
		return NULL;

	cfg80211_step_auth_next(wdev->conn, bss);
	schedule_work(&rdev->conn_work);

	return bss;
}

void cfg80211_sme_scan_done(struct net_device *dev)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_bss *bss;

	lockdep_assert_wiphy(wdev->wiphy);

	if (!wdev->conn)
		return;

	if (wdev->conn->state != CFG80211_CONN_SCANNING &&
	    wdev->conn->state != CFG80211_CONN_SCAN_AGAIN)
		return;

	bss = cfg80211_get_conn_bss(wdev);
	if (bss)
		cfg80211_put_bss(&rdev->wiphy, bss);
	else
		schedule_work(&rdev->conn_work);
}

void cfg80211_sme_rx_auth(struct wireless_dev *wdev, const u8 *buf, size_t len)
{
	struct wiphy *wiphy = wdev->wiphy;
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
	struct ieee80211_mgmt *mgmt = (struct ieee80211_mgmt *)buf;
	u16 status_code = le16_to_cpu(mgmt->u.auth.status_code);

	lockdep_assert_wiphy(wdev->wiphy);

	if (!wdev->conn || wdev->conn->state == CFG80211_CONN_CONNECTED)
		return;

	if (status_code == WLAN_STATUS_NOT_SUPPORTED_AUTH_ALG &&
	    wdev->conn->auto_auth &&
	    wdev->conn->params.auth_type != NL80211_AUTHTYPE_NETWORK_EAP) {
		/* select automatically between only open, shared, leap */
		switch (wdev->conn->params.auth_type) {
		case NL80211_AUTHTYPE_OPEN_SYSTEM:
			if (wdev->connect_keys)
				wdev->conn->params.auth_type =
					NL80211_AUTHTYPE_SHARED_KEY;
			else
				wdev->conn->params.auth_type =
					NL80211_AUTHTYPE_NETWORK_EAP;
			break;
		case NL80211_AUTHTYPE_SHARED_KEY:
			wdev->conn->params.auth_type =
				NL80211_AUTHTYPE_NETWORK_EAP;
			break;
		default:
			/* huh? */
			wdev->conn->params.auth_type =
				NL80211_AUTHTYPE_OPEN_SYSTEM;
			break;
		}
		wdev->conn->state = CFG80211_CONN_AUTHENTICATE_NEXT;
		schedule_work(&rdev->conn_work);
	} else if (status_code != WLAN_STATUS_SUCCESS) {
		struct cfg80211_connect_resp_params cr;

		memset(&cr, 0, sizeof(cr));
		cr.status = status_code;
		cr.links[0].bssid = mgmt->bssid;
		cr.timeout_reason = NL80211_TIMEOUT_UNSPECIFIED;
		__cfg80211_connect_result(wdev->netdev, &cr, false);
	} else if (wdev->conn->state == CFG80211_CONN_AUTHENTICATING) {
		wdev->conn->state = CFG80211_CONN_ASSOCIATE_NEXT;
		schedule_work(&rdev->conn_work);
	}
}

bool cfg80211_sme_rx_assoc_resp(struct wireless_dev *wdev, u16 status)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);

	if (!wdev->conn)
		return false;

	if (status == WLAN_STATUS_SUCCESS) {
		wdev->conn->state = CFG80211_CONN_CONNECTED;
		return false;
	}

	if (wdev->conn->prev_bssid_valid) {
		/*
		 * Some stupid APs don't accept reassoc, so we
		 * need to fall back to trying regular assoc;
		 * return true so no event is sent to userspace.
		 */
		wdev->conn->prev_bssid_valid = false;
		wdev->conn->state = CFG80211_CONN_ASSOCIATE_NEXT;
		schedule_work(&rdev->conn_work);
		return true;
	}

	wdev->conn->state = CFG80211_CONN_ASSOC_FAILED;
	schedule_work(&rdev->conn_work);
	return false;
}

void cfg80211_sme_deauth(struct wireless_dev *wdev)
{
	cfg80211_sme_free(wdev);
}

void cfg80211_sme_auth_timeout(struct wireless_dev *wdev)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);

	if (!wdev->conn)
		return;

	wdev->conn->state = CFG80211_CONN_AUTH_FAILED_TIMEOUT;
	schedule_work(&rdev->conn_work);
}

void cfg80211_sme_disassoc(struct wireless_dev *wdev)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);

	if (!wdev->conn)
		return;

	wdev->conn->state = CFG80211_CONN_DEAUTH;
	schedule_work(&rdev->conn_work);
}

void cfg80211_sme_assoc_timeout(struct wireless_dev *wdev)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);

	if (!wdev->conn)
		return;

	wdev->conn->state = CFG80211_CONN_ASSOC_FAILED_TIMEOUT;
	schedule_work(&rdev->conn_work);
}

void cfg80211_sme_abandon_assoc(struct wireless_dev *wdev)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);

	if (!wdev->conn)
		return;

	wdev->conn->state = CFG80211_CONN_ABANDON;
	schedule_work(&rdev->conn_work);
}

static void cfg80211_wdev_release_bsses(struct wireless_dev *wdev)
{
	unsigned int link;

	for_each_valid_link(wdev, link) {
		if (!wdev->links[link].client.current_bss)
			continue;
		cfg80211_unhold_bss(wdev->links[link].client.current_bss);
		cfg80211_put_bss(wdev->wiphy,
				 &wdev->links[link].client.current_bss->pub);
		wdev->links[link].client.current_bss = NULL;
	}
}

void cfg80211_wdev_release_link_bsses(struct wireless_dev *wdev, u16 link_mask)
{
	unsigned int link;

	for_each_valid_link(wdev, link) {
		if (!wdev->links[link].client.current_bss ||
		    !(link_mask & BIT(link)))
			continue;
		cfg80211_unhold_bss(wdev->links[link].client.current_bss);
		cfg80211_put_bss(wdev->wiphy,
				 &wdev->links[link].client.current_bss->pub);
		wdev->links[link].client.current_bss = NULL;
	}
}

static int cfg80211_sme_get_conn_ies(struct wireless_dev *wdev,
				     const u8 *ies, size_t ies_len,
				     const u8 **out_ies, size_t *out_ies_len)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	u8 *buf;
	size_t offs;

	if (!rdev->wiphy.extended_capabilities_len ||
	    (ies && cfg80211_find_ie(WLAN_EID_EXT_CAPABILITY, ies, ies_len))) {
		*out_ies = kmemdup(ies, ies_len, GFP_KERNEL);
		if (!*out_ies)
			return -ENOMEM;
		*out_ies_len = ies_len;
		return 0;
	}

	buf = kmalloc(ies_len + rdev->wiphy.extended_capabilities_len + 2,
		      GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	if (ies_len) {
		static const u8 before_extcapa[] = {
			/* not listing IEs expected to be created by driver */
			WLAN_EID_RSN,
			WLAN_EID_QOS_CAPA,
			WLAN_EID_RRM_ENABLED_CAPABILITIES,
			WLAN_EID_MOBILITY_DOMAIN,
			WLAN_EID_SUPPORTED_REGULATORY_CLASSES,
			WLAN_EID_BSS_COEX_2040,
		};

		offs = ieee80211_ie_split(ies, ies_len, before_extcapa,
					  ARRAY_SIZE(before_extcapa), 0);
		memcpy(buf, ies, offs);
		/* leave a hole for the extended capabilities IE */
		memcpy(buf + offs + rdev->wiphy.extended_capabilities_len + 2,
		       ies + offs, ies_len - offs);
	} else {
		offs = 0;
	}

	/* place extended capabilities IE (with only driver capabilities) */
	buf[offs] = WLAN_EID_EXT_CAPABILITY;
	buf[offs + 1] = rdev->wiphy.extended_capabilities_len;
	memcpy(buf + offs + 2,
	       rdev->wiphy.extended_capabilities,
	       rdev->wiphy.extended_capabilities_len);

	*out_ies = buf;
	*out_ies_len = ies_len + rdev->wiphy.extended_capabilities_len + 2;

	return 0;
}
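/*
 * Illustrative sketch (editor addition, not part of the original file):
 * the cfg80211_find_ie() check above returns a pointer to the first
 * information element with the given ID inside an IE buffer, or NULL.
 * The helper name below is hypothetical; only the cfg80211_find_ie()
 * call itself is real API.
 */
#if 0
static bool example_ies_have_ext_capa(const u8 *ies, size_t ies_len)
{
	/* true if userspace already supplied an Extended Capabilities IE,
	 * in which case cfg80211_sme_get_conn_ies() just duplicates the
	 * buffer instead of splicing in the driver's capabilities
	 */
	return cfg80211_find_ie(WLAN_EID_EXT_CAPABILITY, ies, ies_len) != NULL;
}
#endif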
static int cfg80211_sme_connect(struct wireless_dev *wdev,
				struct cfg80211_connect_params *connect,
				const u8 *prev_bssid)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_bss *bss;
	int err;

	if (!rdev->ops->auth || !rdev->ops->assoc)
		return -EOPNOTSUPP;

	cfg80211_wdev_release_bsses(wdev);

	if (wdev->connected) {
		cfg80211_sme_free(wdev);
		wdev->connected = false;
	}

	if (wdev->conn)
		return -EINPROGRESS;

	wdev->conn = kzalloc(sizeof(*wdev->conn), GFP_KERNEL);
	if (!wdev->conn)
		return -ENOMEM;

	/*
	 * Copy all parameters; the IEs, BSSID and SSID need explicit
	 * handling below.
	 */
	memcpy(&wdev->conn->params, connect, sizeof(*connect));
	if (connect->bssid) {
		wdev->conn->params.bssid = wdev->conn->bssid;
		memcpy(wdev->conn->bssid, connect->bssid, ETH_ALEN);
	}

	if (cfg80211_sme_get_conn_ies(wdev, connect->ie, connect->ie_len,
				      &wdev->conn->ie,
				      &wdev->conn->params.ie_len)) {
		kfree(wdev->conn);
		wdev->conn = NULL;
		return -ENOMEM;
	}
	wdev->conn->params.ie = wdev->conn->ie;

	if (connect->auth_type == NL80211_AUTHTYPE_AUTOMATIC) {
		wdev->conn->auto_auth = true;
		/* start with open system ... should mostly work */
		wdev->conn->params.auth_type =
			NL80211_AUTHTYPE_OPEN_SYSTEM;
	} else {
		wdev->conn->auto_auth = false;
	}

	wdev->conn->params.ssid = wdev->u.client.ssid;
	wdev->conn->params.ssid_len = wdev->u.client.ssid_len;

	/* see if we have the bss already */
	bss = cfg80211_get_bss(wdev->wiphy, wdev->conn->params.channel,
			       wdev->conn->params.bssid,
			       wdev->conn->params.ssid,
			       wdev->conn->params.ssid_len,
			       wdev->conn_bss_type,
			       IEEE80211_PRIVACY(wdev->conn->params.privacy));

	if (prev_bssid) {
		memcpy(wdev->conn->prev_bssid, prev_bssid, ETH_ALEN);
		wdev->conn->prev_bssid_valid = true;
	}

	/* we're good if we have a matching bss struct */
	if (bss) {
		enum nl80211_timeout_reason treason;

		cfg80211_step_auth_next(wdev->conn, bss);
		err = cfg80211_conn_do_work(wdev, &treason);
		cfg80211_put_bss(wdev->wiphy, bss);
	} else {
		/* otherwise we'll need to scan for the AP first */
		err = cfg80211_conn_scan(wdev);

		/*
		 * If we can't scan right now, then we need to scan again
		 * after the current scan finished, since the parameters
		 * changed (unless we find a good AP anyway).
		 */
		if (err == -EBUSY) {
			err = 0;
			wdev->conn->state = CFG80211_CONN_SCAN_AGAIN;
		}
	}

	if (err)
		cfg80211_sme_free(wdev);

	return err;
}

static int cfg80211_sme_disconnect(struct wireless_dev *wdev, u16 reason)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	int err;

	if (!wdev->conn)
		return 0;

	if (!rdev->ops->deauth)
		return -EOPNOTSUPP;

	if (wdev->conn->state == CFG80211_CONN_SCANNING ||
	    wdev->conn->state == CFG80211_CONN_SCAN_AGAIN) {
		err = 0;
		goto out;
	}

	/* wdev->conn->params.bssid must be set if > SCANNING */
	err = cfg80211_mlme_deauth(rdev, wdev->netdev,
				   wdev->conn->params.bssid,
				   NULL, 0, reason, false);
out:
	cfg80211_sme_free(wdev);
	return err;
}

/*
 * code shared for in-device and software SME
 */

static bool cfg80211_is_all_idle(void)
{
	struct cfg80211_registered_device *rdev;
	struct wireless_dev *wdev;
	bool is_all_idle = true;

	/*
	 * All devices must be idle as otherwise if you are actively
	 * scanning some new beacon hints could be learned and would
	 * count as new regulatory hints.
	 * Also if there is any other active beaconing interface we
	 * need not issue a disconnect hint and reset any info such
	 * as chan dfs state, etc.
	 */
	for_each_rdev(rdev) {
		guard(wiphy)(&rdev->wiphy);

		list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
			if (wdev->conn || wdev->connected ||
			    cfg80211_beaconing_iface_active(wdev))
				is_all_idle = false;
		}
	}

	return is_all_idle;
}

static void disconnect_work(struct work_struct *work)
{
	rtnl_lock();
	if (cfg80211_is_all_idle())
		regulatory_hint_disconnect();
	rtnl_unlock();
}

DECLARE_WORK(cfg80211_disconnect_work, disconnect_work);

static void
cfg80211_connect_result_release_bsses(struct wireless_dev *wdev,
				      struct cfg80211_connect_resp_params *cr)
{
	unsigned int link;

	for_each_valid_link(cr, link) {
		if (!cr->links[link].bss)
			continue;
		cfg80211_unhold_bss(bss_from_pub(cr->links[link].bss));
		cfg80211_put_bss(wdev->wiphy, cr->links[link].bss);
	}
}
/*
 * API calls for drivers implementing connect/disconnect and
 * SME event handling
 */

/* This method must consume bss one way or another */
void __cfg80211_connect_result(struct net_device *dev,
			       struct cfg80211_connect_resp_params *cr,
			       bool wextev)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	const struct element *country_elem = NULL;
	const struct element *ssid;
	const u8 *country_data;
	u8 country_datalen;
#ifdef CONFIG_CFG80211_WEXT
	union iwreq_data wrqu;
#endif
	unsigned int link;
	const u8 *connected_addr;
	bool bss_not_found = false;

	lockdep_assert_wiphy(wdev->wiphy);

	if (WARN_ON(wdev->iftype != NL80211_IFTYPE_STATION &&
		    wdev->iftype != NL80211_IFTYPE_P2P_CLIENT))
		goto out;

	if (cr->valid_links) {
		if (WARN_ON(!cr->ap_mld_addr))
			goto out;

		for_each_valid_link(cr, link) {
			if (WARN_ON(!cr->links[link].addr))
				goto out;
		}

		if (WARN_ON(wdev->connect_keys))
			goto out;
	}

	wdev->unprot_beacon_reported = 0;
	nl80211_send_connect_result(wiphy_to_rdev(wdev->wiphy), dev, cr,
				    GFP_KERNEL);
	connected_addr = cr->valid_links ? cr->ap_mld_addr
					 : cr->links[0].bssid;

#ifdef CONFIG_CFG80211_WEXT
	if (wextev && !cr->valid_links) {
		if (cr->req_ie && cr->status == WLAN_STATUS_SUCCESS) {
			memset(&wrqu, 0, sizeof(wrqu));
			wrqu.data.length = cr->req_ie_len;
			wireless_send_event(dev, IWEVASSOCREQIE, &wrqu,
					    cr->req_ie);
		}

		if (cr->resp_ie && cr->status == WLAN_STATUS_SUCCESS) {
			memset(&wrqu, 0, sizeof(wrqu));
			wrqu.data.length = cr->resp_ie_len;
			wireless_send_event(dev, IWEVASSOCRESPIE, &wrqu,
					    cr->resp_ie);
		}

		memset(&wrqu, 0, sizeof(wrqu));
		wrqu.ap_addr.sa_family = ARPHRD_ETHER;
		if (connected_addr && cr->status == WLAN_STATUS_SUCCESS) {
			memcpy(wrqu.ap_addr.sa_data, connected_addr, ETH_ALEN);
			memcpy(wdev->wext.prev_bssid, connected_addr,
			       ETH_ALEN);
			wdev->wext.prev_bssid_valid = true;
		}
		wireless_send_event(dev, SIOCGIWAP, &wrqu, NULL);
	}
#endif

	if (cr->status == WLAN_STATUS_SUCCESS) {
		if (!wiphy_to_rdev(wdev->wiphy)->ops->connect) {
			for_each_valid_link(cr, link) {
				if (WARN_ON_ONCE(!cr->links[link].bss))
					break;
			}
		}

		for_each_valid_link(cr, link) {
			/* don't do extra lookups for failures */
			if (cr->links[link].status != WLAN_STATUS_SUCCESS)
				continue;

			if (cr->links[link].bss)
				continue;

			cr->links[link].bss =
				cfg80211_get_bss(wdev->wiphy, NULL,
						 cr->links[link].bssid,
						 wdev->u.client.ssid,
						 wdev->u.client.ssid_len,
						 wdev->conn_bss_type,
						 IEEE80211_PRIVACY_ANY);
			if (!cr->links[link].bss) {
				bss_not_found = true;
				break;
			}
			cfg80211_hold_bss(bss_from_pub(cr->links[link].bss));
		}
	}

	cfg80211_wdev_release_bsses(wdev);

	if (cr->status != WLAN_STATUS_SUCCESS) {
		kfree_sensitive(wdev->connect_keys);
		wdev->connect_keys = NULL;
		wdev->u.client.ssid_len = 0;
		wdev->conn_owner_nlportid = 0;
		cfg80211_connect_result_release_bsses(wdev, cr);
		cfg80211_sme_free(wdev);
		return;
	}

	if (WARN_ON(bss_not_found)) {
		cfg80211_connect_result_release_bsses(wdev, cr);
		return;
	}

	memset(wdev->links, 0, sizeof(wdev->links));
	for_each_valid_link(cr, link) {
		if (cr->links[link].status == WLAN_STATUS_SUCCESS)
			continue;
		cr->valid_links &= ~BIT(link);
		/* don't require bss pointer for failed links */
		if (!cr->links[link].bss)
			continue;
		cfg80211_unhold_bss(bss_from_pub(cr->links[link].bss));
		cfg80211_put_bss(wdev->wiphy, cr->links[link].bss);
	}

	wdev->valid_links = cr->valid_links;
	for_each_valid_link(cr, link)
		wdev->links[link].client.current_bss =
			bss_from_pub(cr->links[link].bss);
	wdev->connected = true;
	ether_addr_copy(wdev->u.client.connected_addr, connected_addr);
	if (cr->valid_links) {
		for_each_valid_link(cr, link)
			memcpy(wdev->links[link].addr, cr->links[link].addr,
			       ETH_ALEN);
	}

	cfg80211_upload_connect_keys(wdev);

	rcu_read_lock();
	for_each_valid_link(cr, link) {
		country_elem =
			ieee80211_bss_get_elem(cr->links[link].bss,
					       WLAN_EID_COUNTRY);
		if (country_elem)
			break;
	}
	if (!country_elem) {
		rcu_read_unlock();
		return;
	}

	country_datalen = country_elem->datalen;
	country_data = kmemdup(country_elem->data, country_datalen,
			       GFP_ATOMIC);
	rcu_read_unlock();

	if (!country_data)
		return;

	regulatory_hint_country_ie(wdev->wiphy,
				   cr->links[link].bss->channel->band,
				   country_data, country_datalen);
	kfree(country_data);

	if (!wdev->u.client.ssid_len) {
		rcu_read_lock();
		for_each_valid_link(cr, link) {
			ssid = ieee80211_bss_get_elem(cr->links[link].bss,
						      WLAN_EID_SSID);
			if (!ssid || !ssid->datalen)
				continue;

			memcpy(wdev->u.client.ssid, ssid->data, ssid->datalen);
			wdev->u.client.ssid_len = ssid->datalen;
			break;
		}
		rcu_read_unlock();
	}

	return;
out:
	for_each_valid_link(cr, link)
		cfg80211_put_bss(wdev->wiphy, cr->links[link].bss);
}

static void cfg80211_update_link_bss(struct wireless_dev *wdev,
				     struct cfg80211_bss **bss)
{
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_internal_bss *ibss;

	if (!*bss)
		return;

	ibss = bss_from_pub(*bss);
	if (list_empty(&ibss->list)) {
		struct cfg80211_bss *found = NULL, *tmp = *bss;

		found = cfg80211_get_bss(wdev->wiphy, NULL,
					 (*bss)->bssid,
					 wdev->u.client.ssid,
					 wdev->u.client.ssid_len,
					 wdev->conn_bss_type,
					 IEEE80211_PRIVACY_ANY);
		if (found) {
			/* The same BSS is already updated so use it
			 * instead, as it has latest info.
			 */
			*bss = found;
		} else {
			/* Update with BSS provided by driver, it will
			 * be freshly added and ref cnted, we can free
			 * the old one.
			 *
			 * signal_valid can be false, as we are not
			 * expecting the BSS to be found.
			 *
			 * keep the old timestamp to avoid confusion
			 */
			cfg80211_bss_update(rdev, ibss, false, ibss->ts);
		}

		cfg80211_put_bss(wdev->wiphy, tmp);
	}
}

/* Consumes bss object(s) one way or another */
void cfg80211_connect_done(struct net_device *dev,
			   struct cfg80211_connect_resp_params *params,
			   gfp_t gfp)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_event *ev;
	unsigned long flags;
	u8 *next;
	size_t link_info_size = 0;
	unsigned int link;

	for_each_valid_link(params, link) {
		cfg80211_update_link_bss(wdev, &params->links[link].bss);
		link_info_size += params->links[link].bssid ? ETH_ALEN : 0;
		link_info_size += params->links[link].addr ? ETH_ALEN : 0;
	}

	ev = kzalloc(sizeof(*ev) + (params->ap_mld_addr ? ETH_ALEN : 0) +
		     params->req_ie_len + params->resp_ie_len +
		     params->fils.kek_len + params->fils.pmk_len +
		     (params->fils.pmkid ? WLAN_PMKID_LEN : 0) +
		     link_info_size, gfp);
	if (!ev) {
		for_each_valid_link(params, link)
			cfg80211_put_bss(wdev->wiphy,
					 params->links[link].bss);
		return;
	}

	ev->type = EVENT_CONNECT_RESULT;
	next = ((u8 *)ev) + sizeof(*ev);
	if (params->ap_mld_addr) {
		ev->cr.ap_mld_addr = next;
		memcpy((void *)ev->cr.ap_mld_addr, params->ap_mld_addr,
		       ETH_ALEN);
		next += ETH_ALEN;
	}
	if (params->req_ie_len) {
		ev->cr.req_ie = next;
		ev->cr.req_ie_len = params->req_ie_len;
		memcpy((void *)ev->cr.req_ie, params->req_ie,
		       params->req_ie_len);
		next += params->req_ie_len;
	}
	if (params->resp_ie_len) {
		ev->cr.resp_ie = next;
		ev->cr.resp_ie_len = params->resp_ie_len;
		memcpy((void *)ev->cr.resp_ie, params->resp_ie,
		       params->resp_ie_len);
		next += params->resp_ie_len;
	}
	if (params->fils.kek_len) {
		ev->cr.fils.kek = next;
		ev->cr.fils.kek_len = params->fils.kek_len;
		memcpy((void *)ev->cr.fils.kek, params->fils.kek,
		       params->fils.kek_len);
		next += params->fils.kek_len;
	}
	if (params->fils.pmk_len) {
		ev->cr.fils.pmk = next;
		ev->cr.fils.pmk_len = params->fils.pmk_len;
		memcpy((void *)ev->cr.fils.pmk, params->fils.pmk,
		       params->fils.pmk_len);
		next += params->fils.pmk_len;
	}
	if (params->fils.pmkid) {
		ev->cr.fils.pmkid = next;
		memcpy((void *)ev->cr.fils.pmkid, params->fils.pmkid,
		       WLAN_PMKID_LEN);
		next += WLAN_PMKID_LEN;
	}
	ev->cr.fils.update_erp_next_seq_num =
		params->fils.update_erp_next_seq_num;
	if (params->fils.update_erp_next_seq_num)
		ev->cr.fils.erp_next_seq_num = params->fils.erp_next_seq_num;
	ev->cr.valid_links = params->valid_links;
	for_each_valid_link(params, link) {
		if (params->links[link].bss)
			cfg80211_hold_bss(
				bss_from_pub(params->links[link].bss));
		ev->cr.links[link].bss = params->links[link].bss;
		ev->cr.links[link].status = params->links[link].status;

		if (params->links[link].addr) {
			ev->cr.links[link].addr = next;
			memcpy((void *)ev->cr.links[link].addr,
			       params->links[link].addr, ETH_ALEN);
			next += ETH_ALEN;
		}
		if (params->links[link].bssid) {
			ev->cr.links[link].bssid = next;
			memcpy((void *)ev->cr.links[link].bssid,
			       params->links[link].bssid, ETH_ALEN);
			next += ETH_ALEN;
		}
	}
	ev->cr.status = params->status;
	ev->cr.timeout_reason = params->timeout_reason;

	spin_lock_irqsave(&wdev->event_lock, flags);
	list_add_tail(&ev->list, &wdev->event_list);
	spin_unlock_irqrestore(&wdev->event_lock, flags);
	queue_work(cfg80211_wq, &rdev->event_work);
}
EXPORT_SYMBOL(cfg80211_connect_done);
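/*
 * Illustrative sketch (editor addition, not part of the original file):
 * how a fullmac driver that implements the connect() op might report a
 * finished connection attempt back to cfg80211 via the exported
 * cfg80211_connect_done() above. The firmware event handler name and
 * its parameters are hypothetical.
 */
#if 0
static void example_fw_connect_event(struct net_device *dev,
				     const u8 *bssid, u16 fw_status)
{
	struct cfg80211_connect_resp_params cr = {};

	cr.status = fw_status;		/* WLAN_STATUS_SUCCESS on success */
	cr.links[0].bssid = bssid;
	cr.timeout_reason = NL80211_TIMEOUT_UNSPECIFIED;

	/* queues an EVENT_CONNECT_RESULT on the wdev event list */
	cfg80211_connect_done(dev, &cr, GFP_KERNEL);
}
#endif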
/* Consumes bss object one way or another */
void __cfg80211_roamed(struct wireless_dev *wdev,
		       struct cfg80211_roam_info *info)
{
#ifdef CONFIG_CFG80211_WEXT
	union iwreq_data wrqu;
#endif
	unsigned int link;
	const u8 *connected_addr;

	lockdep_assert_wiphy(wdev->wiphy);

	if (WARN_ON(wdev->iftype != NL80211_IFTYPE_STATION &&
		    wdev->iftype != NL80211_IFTYPE_P2P_CLIENT))
		goto out;

	if (WARN_ON(!wdev->connected))
		goto out;

	if (info->valid_links) {
		if (WARN_ON(!info->ap_mld_addr))
			goto out;

		for_each_valid_link(info, link) {
			if (WARN_ON(!info->links[link].addr))
				goto out;
		}
	}

	cfg80211_wdev_release_bsses(wdev);

	for_each_valid_link(info, link) {
		if (WARN_ON(!info->links[link].bss))
			goto out;
	}

	memset(wdev->links, 0, sizeof(wdev->links));
	wdev->valid_links = info->valid_links;
	for_each_valid_link(info, link) {
		cfg80211_hold_bss(bss_from_pub(info->links[link].bss));
		wdev->links[link].client.current_bss =
			bss_from_pub(info->links[link].bss);
	}

	connected_addr = info->valid_links ?
			 info->ap_mld_addr :
			 info->links[0].bss->bssid;
	ether_addr_copy(wdev->u.client.connected_addr, connected_addr);
	if (info->valid_links) {
		for_each_valid_link(info, link)
			memcpy(wdev->links[link].addr, info->links[link].addr,
			       ETH_ALEN);
	}

	wdev->unprot_beacon_reported = 0;
	nl80211_send_roamed(wiphy_to_rdev(wdev->wiphy),
			    wdev->netdev, info, GFP_KERNEL);

#ifdef CONFIG_CFG80211_WEXT
	if (!info->valid_links) {
		if (info->req_ie) {
			memset(&wrqu, 0, sizeof(wrqu));
			wrqu.data.length = info->req_ie_len;
			wireless_send_event(wdev->netdev, IWEVASSOCREQIE,
					    &wrqu, info->req_ie);
		}

		if (info->resp_ie) {
			memset(&wrqu, 0, sizeof(wrqu));
			wrqu.data.length = info->resp_ie_len;
			wireless_send_event(wdev->netdev, IWEVASSOCRESPIE,
					    &wrqu, info->resp_ie);
		}

		memset(&wrqu, 0, sizeof(wrqu));
		wrqu.ap_addr.sa_family = ARPHRD_ETHER;
		memcpy(wrqu.ap_addr.sa_data, connected_addr, ETH_ALEN);
		memcpy(wdev->wext.prev_bssid, connected_addr, ETH_ALEN);
		wdev->wext.prev_bssid_valid = true;
		wireless_send_event(wdev->netdev, SIOCGIWAP, &wrqu, NULL);
	}
#endif

	return;
out:
	for_each_valid_link(info, link)
		cfg80211_put_bss(wdev->wiphy, info->links[link].bss);
}

/* Consumes info->links.bss object(s) one way or another */
void cfg80211_roamed(struct net_device *dev, struct cfg80211_roam_info *info,
		     gfp_t gfp)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_event *ev;
	unsigned long flags;
	u8 *next;
	unsigned int link;
	size_t link_info_size = 0;
	bool bss_not_found = false;

	for_each_valid_link(info, link) {
		link_info_size += info->links[link].addr ? ETH_ALEN : 0;
		link_info_size += info->links[link].bssid ? ETH_ALEN : 0;

		if (info->links[link].bss)
			continue;

		info->links[link].bss =
			cfg80211_get_bss(wdev->wiphy,
					 info->links[link].channel,
					 info->links[link].bssid,
					 wdev->u.client.ssid,
					 wdev->u.client.ssid_len,
					 wdev->conn_bss_type,
					 IEEE80211_PRIVACY_ANY);

		if (!info->links[link].bss) {
			bss_not_found = true;
			break;
		}
	}

	if (WARN_ON(bss_not_found))
		goto out;

	ev = kzalloc(sizeof(*ev) + info->req_ie_len + info->resp_ie_len +
		     info->fils.kek_len + info->fils.pmk_len +
		     (info->fils.pmkid ? WLAN_PMKID_LEN : 0) +
		     (info->ap_mld_addr ? ETH_ALEN : 0) +
		     link_info_size, gfp);
	if (!ev)
		goto out;

	ev->type = EVENT_ROAMED;
	next = ((u8 *)ev) + sizeof(*ev);
	if (info->req_ie_len) {
		ev->rm.req_ie = next;
		ev->rm.req_ie_len = info->req_ie_len;
		memcpy((void *)ev->rm.req_ie, info->req_ie,
		       info->req_ie_len);
		next += info->req_ie_len;
	}
	if (info->resp_ie_len) {
		ev->rm.resp_ie = next;
		ev->rm.resp_ie_len = info->resp_ie_len;
		memcpy((void *)ev->rm.resp_ie, info->resp_ie,
		       info->resp_ie_len);
		next += info->resp_ie_len;
	}
	if (info->fils.kek_len) {
		ev->rm.fils.kek = next;
		ev->rm.fils.kek_len = info->fils.kek_len;
		memcpy((void *)ev->rm.fils.kek, info->fils.kek,
		       info->fils.kek_len);
		next += info->fils.kek_len;
	}
	if (info->fils.pmk_len) {
		ev->rm.fils.pmk = next;
		ev->rm.fils.pmk_len = info->fils.pmk_len;
		memcpy((void *)ev->rm.fils.pmk, info->fils.pmk,
		       info->fils.pmk_len);
		next += info->fils.pmk_len;
	}
	if (info->fils.pmkid) {
		ev->rm.fils.pmkid = next;
		memcpy((void *)ev->rm.fils.pmkid, info->fils.pmkid,
		       WLAN_PMKID_LEN);
		next += WLAN_PMKID_LEN;
	}
	ev->rm.fils.update_erp_next_seq_num =
		info->fils.update_erp_next_seq_num;
	if (info->fils.update_erp_next_seq_num)
		ev->rm.fils.erp_next_seq_num = info->fils.erp_next_seq_num;
	if (info->ap_mld_addr) {
		ev->rm.ap_mld_addr = next;
		memcpy((void *)ev->rm.ap_mld_addr, info->ap_mld_addr,
		       ETH_ALEN);
		next += ETH_ALEN;
	}
	ev->rm.valid_links = info->valid_links;
	for_each_valid_link(info, link) {
		ev->rm.links[link].bss = info->links[link].bss;

		if (info->links[link].addr) {
			ev->rm.links[link].addr = next;
			memcpy((void *)ev->rm.links[link].addr,
			       info->links[link].addr, ETH_ALEN);
			next += ETH_ALEN;
		}
		if (info->links[link].bssid) {
			ev->rm.links[link].bssid = next;
			memcpy((void *)ev->rm.links[link].bssid,
			       info->links[link].bssid, ETH_ALEN);
			next += ETH_ALEN;
		}
	}

	spin_lock_irqsave(&wdev->event_lock, flags);
	list_add_tail(&ev->list, &wdev->event_list);
	spin_unlock_irqrestore(&wdev->event_lock, flags);
	queue_work(cfg80211_wq, &rdev->event_work);

	return;
out:
	for_each_valid_link(info, link)
		cfg80211_put_bss(wdev->wiphy, info->links[link].bss);
}
EXPORT_SYMBOL(cfg80211_roamed);

void __cfg80211_port_authorized(struct wireless_dev *wdev, const u8 *peer_addr,
				const u8 *td_bitmap, u8 td_bitmap_len)
{
	lockdep_assert_wiphy(wdev->wiphy);

	if (WARN_ON(wdev->iftype != NL80211_IFTYPE_STATION &&
		    wdev->iftype != NL80211_IFTYPE_P2P_CLIENT &&
		    wdev->iftype != NL80211_IFTYPE_AP &&
		    wdev->iftype != NL80211_IFTYPE_P2P_GO))
		return;

	if (wdev->iftype == NL80211_IFTYPE_STATION ||
	    wdev->iftype == NL80211_IFTYPE_P2P_CLIENT) {
		if (WARN_ON(!wdev->connected) ||
		    WARN_ON(!ether_addr_equal(wdev->u.client.connected_addr,
					      peer_addr)))
			return;
	}

	nl80211_send_port_authorized(wiphy_to_rdev(wdev->wiphy), wdev->netdev,
				     peer_addr, td_bitmap, td_bitmap_len);
}

void cfg80211_port_authorized(struct net_device *dev, const u8 *peer_addr,
			      const u8 *td_bitmap, u8 td_bitmap_len, gfp_t gfp)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_event *ev;
	unsigned long flags;

	if (WARN_ON(!peer_addr))
		return;

	ev = kzalloc(sizeof(*ev) + td_bitmap_len, gfp);
	if (!ev)
		return;

	ev->type = EVENT_PORT_AUTHORIZED;
	memcpy(ev->pa.peer_addr, peer_addr, ETH_ALEN);
	ev->pa.td_bitmap = ((u8 *)ev) + sizeof(*ev);
	ev->pa.td_bitmap_len = td_bitmap_len;
	memcpy((void *)ev->pa.td_bitmap, td_bitmap, td_bitmap_len);

	/*
	 * Use the wdev event list so that if there are pending
	 * connected/roamed events, they will be reported first.
	 */
	spin_lock_irqsave(&wdev->event_lock, flags);
	list_add_tail(&ev->list, &wdev->event_list);
	spin_unlock_irqrestore(&wdev->event_lock, flags);
	queue_work(cfg80211_wq, &rdev->event_work);
}
EXPORT_SYMBOL(cfg80211_port_authorized);

void __cfg80211_disconnected(struct net_device *dev, const u8 *ie,
			     size_t ie_len, u16 reason, bool from_ap)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	int i;
#ifdef CONFIG_CFG80211_WEXT
	union iwreq_data wrqu;
#endif

	lockdep_assert_wiphy(wdev->wiphy);

	if (WARN_ON(wdev->iftype != NL80211_IFTYPE_STATION &&
		    wdev->iftype != NL80211_IFTYPE_P2P_CLIENT))
		return;

	cfg80211_wdev_release_bsses(wdev);

	wdev->valid_links = 0;
	wdev->connected = false;
	wdev->u.client.ssid_len = 0;
	wdev->conn_owner_nlportid = 0;
	kfree_sensitive(wdev->connect_keys);
	wdev->connect_keys = NULL;

	nl80211_send_disconnected(rdev, dev, reason, ie, ie_len, from_ap);

	/* stop critical protocol if supported */
	if (rdev->ops->crit_proto_stop && rdev->crit_proto_nlportid) {
		rdev->crit_proto_nlportid = 0;
		rdev_crit_proto_stop(rdev, wdev);
	}

	/*
	 * Delete all the keys ... pairwise keys can't really
	 * exist any more anyway, but default keys might.
	 */
	if (rdev->ops->del_key) {
		int max_key_idx = 5;

		if (wiphy_ext_feature_isset(
			    wdev->wiphy,
			    NL80211_EXT_FEATURE_BEACON_PROTECTION) ||
		    wiphy_ext_feature_isset(
			    wdev->wiphy,
			    NL80211_EXT_FEATURE_BEACON_PROTECTION_CLIENT))
			max_key_idx = 7;
		for (i = 0; i <= max_key_idx; i++)
			rdev_del_key(rdev, dev, -1, i, false, NULL);
	}

	rdev_set_qos_map(rdev, dev, NULL);

#ifdef CONFIG_CFG80211_WEXT
	memset(&wrqu, 0, sizeof(wrqu));
	wrqu.ap_addr.sa_family = ARPHRD_ETHER;
	wireless_send_event(dev, SIOCGIWAP, &wrqu, NULL);
	wdev->wext.connect.ssid_len = 0;
#endif

	schedule_work(&cfg80211_disconnect_work);

	cfg80211_schedule_channels_check(wdev);
}

void cfg80211_disconnected(struct net_device *dev, u16 reason,
			   const u8 *ie, size_t ie_len,
			   bool locally_generated, gfp_t gfp)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
	struct cfg80211_event *ev;
	unsigned long flags;

	ev = kzalloc(sizeof(*ev) + ie_len, gfp);
	if (!ev)
		return;

	ev->type = EVENT_DISCONNECTED;
	ev->dc.ie = ((u8 *)ev) + sizeof(*ev);
	ev->dc.ie_len = ie_len;
	memcpy((void *)ev->dc.ie, ie, ie_len);
	ev->dc.reason = reason;
	ev->dc.locally_generated = locally_generated;

	spin_lock_irqsave(&wdev->event_lock, flags);
	list_add_tail(&ev->list, &wdev->event_list);
	spin_unlock_irqrestore(&wdev->event_lock, flags);
	queue_work(cfg80211_wq, &rdev->event_work);
}
EXPORT_SYMBOL(cfg80211_disconnected);
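/*
 * Illustrative sketch (editor addition, not part of the original file):
 * how a driver might use the exported cfg80211_disconnected() above to
 * report an AP-initiated deauthentication. The event handler name is
 * hypothetical; the reason code and IEs would come from the received
 * frame.
 */
#if 0
static void example_fw_deauth_event(struct net_device *dev, u16 reason,
				    const u8 *ies, size_t ies_len)
{
	/* locally_generated = false: the AP ended the connection */
	cfg80211_disconnected(dev, reason, ies, ies_len, false, GFP_KERNEL);
}
#endif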
/*
 * API calls for nl80211/wext compatibility code
 */
int cfg80211_connect(struct cfg80211_registered_device *rdev,
		     struct net_device *dev,
		     struct cfg80211_connect_params *connect,
		     struct cfg80211_cached_keys *connkeys,
		     const u8 *prev_bssid)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	int err;

	lockdep_assert_wiphy(wdev->wiphy);

	/*
	 * If we have an ssid_len, we're trying to connect or are
	 * already connected, so reject a new SSID unless it's the
	 * same (which is the case for re-association.)
	 */
	if (wdev->u.client.ssid_len &&
	    (wdev->u.client.ssid_len != connect->ssid_len ||
	     memcmp(wdev->u.client.ssid, connect->ssid,
		    wdev->u.client.ssid_len)))
		return -EALREADY;

	/*
	 * If connected, reject (re-)association unless prev_bssid
	 * matches the current BSSID.
	 */
	if (wdev->connected) {
		if (!prev_bssid)
			return -EALREADY;
		if (!ether_addr_equal(prev_bssid,
				      wdev->u.client.connected_addr))
			return -ENOTCONN;
	}

	/*
	 * Reject if we're in the process of connecting with WEP,
	 * this case isn't very interesting and trying to handle
	 * it would make the code much more complex.
	 */
	if (wdev->connect_keys)
		return -EINPROGRESS;

	cfg80211_oper_and_ht_capa(&connect->ht_capa_mask,
				  rdev->wiphy.ht_capa_mod_mask);
	cfg80211_oper_and_vht_capa(&connect->vht_capa_mask,
				   rdev->wiphy.vht_capa_mod_mask);

	if (connkeys && connkeys->def >= 0) {
		int idx;
		u32 cipher;

		idx = connkeys->def;
		cipher = connkeys->params[idx].cipher;
		/* If given a WEP key we may need it for shared key auth */
		if (cipher == WLAN_CIPHER_SUITE_WEP40 ||
		    cipher == WLAN_CIPHER_SUITE_WEP104) {
			connect->key_idx = idx;
			connect->key = connkeys->params[idx].key;
			connect->key_len = connkeys->params[idx].key_len;

			/*
			 * If ciphers are not set (e.g. when going through
			 * iwconfig), we have to set them appropriately here.
			 */
			if (connect->crypto.cipher_group == 0)
				connect->crypto.cipher_group = cipher;

			if (connect->crypto.n_ciphers_pairwise == 0) {
				connect->crypto.n_ciphers_pairwise = 1;
				connect->crypto.ciphers_pairwise[0] = cipher;
			}
		}
	} else {
		if (WARN_ON(connkeys))
			return -EINVAL;

		/* connect can point to wdev->wext.connect which
		 * can hold key data from a previous connection
		 */
		connect->key = NULL;
		connect->key_len = 0;
		connect->key_idx = 0;
	}

	wdev->connect_keys = connkeys;
	memcpy(wdev->u.client.ssid, connect->ssid, connect->ssid_len);
	wdev->u.client.ssid_len = connect->ssid_len;

	wdev->conn_bss_type = connect->pbss ? IEEE80211_BSS_TYPE_PBSS :
					      IEEE80211_BSS_TYPE_ESS;

	if (!rdev->ops->connect)
		err = cfg80211_sme_connect(wdev, connect, prev_bssid);
	else
		err = rdev_connect(rdev, dev, connect);

	if (err) {
		wdev->connect_keys = NULL;
		/*
		 * This could be reassoc getting refused, don't clear
		 * ssid_len in that case.
		 */
		if (!wdev->connected)
			wdev->u.client.ssid_len = 0;
		return err;
	}

	return 0;
}

int cfg80211_disconnect(struct cfg80211_registered_device *rdev,
			struct net_device *dev, u16 reason, bool wextev)
{
	struct wireless_dev *wdev = dev->ieee80211_ptr;
	int err = 0;

	lockdep_assert_wiphy(wdev->wiphy);

	kfree_sensitive(wdev->connect_keys);
	wdev->connect_keys = NULL;

	wdev->conn_owner_nlportid = 0;

	if (wdev->conn)
		err = cfg80211_sme_disconnect(wdev, reason);
	else if (!rdev->ops->disconnect)
		cfg80211_mlme_down(rdev, dev);
	else if (wdev->u.client.ssid_len)
		err = rdev_disconnect(rdev, dev, reason);

	/*
	 * Clear ssid_len unless we actually were fully connected,
	 * in which case cfg80211_disconnected() will take care of
	 * this later.
	 */
	if (!wdev->connected)
		wdev->u.client.ssid_len = 0;

	return err;
}

/*
 * Used to clean up after the connection / connection attempt owner socket
 * disconnects
 */
void cfg80211_autodisconnect_wk(struct work_struct *work)
{
	struct wireless_dev *wdev =
		container_of(work, struct wireless_dev, disconnect_wk);
	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);

	guard(wiphy)(wdev->wiphy);

	if (wdev->conn_owner_nlportid) {
		switch (wdev->iftype) {
		case NL80211_IFTYPE_ADHOC:
			cfg80211_leave_ibss(rdev, wdev->netdev, false);
			break;
		case NL80211_IFTYPE_AP:
		case NL80211_IFTYPE_P2P_GO:
			cfg80211_stop_ap(rdev, wdev->netdev, -1, false);
			break;
		case NL80211_IFTYPE_MESH_POINT:
			cfg80211_leave_mesh(rdev, wdev->netdev);
			break;
		case NL80211_IFTYPE_STATION:
		case NL80211_IFTYPE_P2P_CLIENT:
			/*
			 * Use disconnect_bssid if still connecting and
			 * ops->disconnect not implemented. Otherwise we can
			 * use cfg80211_disconnect.
			 */
			if (rdev->ops->disconnect || wdev->connected)
				cfg80211_disconnect(rdev, wdev->netdev,
						    WLAN_REASON_DEAUTH_LEAVING,
						    true);
			else
				cfg80211_mlme_deauth(rdev, wdev->netdev,
						     wdev->disconnect_bssid,
						     NULL, 0,
						     WLAN_REASON_DEAUTH_LEAVING,
						     false);
			break;
		default:
			break;
		}
	}
}
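/*
 * Illustrative sketch (editor addition, not part of the original file):
 * the guard(wiphy)(...) statements in cfg80211_conn_work() and
 * cfg80211_autodisconnect_wk() above rely on the scope-based cleanup
 * helpers from <linux/cleanup.h>: the lock is released automatically
 * when the guard variable leaves scope, so early returns need no
 * explicit unlock. The same pattern with a plain mutex:
 */
#if 0
#include <linux/cleanup.h>
#include <linux/mutex.h>

static DEFINE_MUTEX(example_lock);

static int example_locked_op(bool bail_early)
{
	guard(mutex)(&example_lock);	/* unlocked on every return below */

	if (bail_early)
		return -EAGAIN;

	return 0;
}
#endif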
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef MPLS_INTERNAL_H
#define MPLS_INTERNAL_H
#include <net/mpls.h>

/* put a reasonable limit on the number of labels
 * we will accept from userspace
 */
#define MAX_NEW_LABELS 30

struct mpls_entry_decoded {
	u32 label;
	u8 ttl;
	u8 tc;
	u8 bos;
};

struct mpls_pcpu_stats {
	struct mpls_link_stats stats;
	struct u64_stats_sync syncp;
};

struct mpls_dev {
	int input_enabled;
	struct net_device *dev;
	struct mpls_pcpu_stats __percpu *stats;

	struct ctl_table_header *sysctl;
	struct rcu_head rcu;
};

#if BITS_PER_LONG == 32

#define MPLS_INC_STATS_LEN(mdev, len, pkts_field, bytes_field)	\
	do {							\
		TYPEOF_UNQUAL(*(mdev)->stats) *ptr =		\
			raw_cpu_ptr((mdev)->stats);		\
		local_bh_disable();				\
		u64_stats_update_begin(&ptr->syncp);		\
		ptr->stats.pkts_field++;			\
		ptr->stats.bytes_field += (len);		\
		u64_stats_update_end(&ptr->syncp);		\
		local_bh_enable();				\
	} while (0)

#define MPLS_INC_STATS(mdev, field)				\
	do {							\
		TYPEOF_UNQUAL(*(mdev)->stats) *ptr =		\
			raw_cpu_ptr((mdev)->stats);		\
		local_bh_disable();				\
		u64_stats_update_begin(&ptr->syncp);		\
		ptr->stats.field++;				\
		u64_stats_update_end(&ptr->syncp);		\
		local_bh_enable();				\
	} while (0)

#else

#define MPLS_INC_STATS_LEN(mdev, len, pkts_field, bytes_field)	\
	do {							\
		this_cpu_inc((mdev)->stats->stats.pkts_field);	\
		this_cpu_add((mdev)->stats->stats.bytes_field, (len)); \
	} while (0)

#define MPLS_INC_STATS(mdev, field)			\
	this_cpu_inc((mdev)->stats->stats.field)

#endif

struct sk_buff;

#define LABEL_NOT_SPECIFIED (1 << 20)

/* This maximum hardware address (ha) length is copied from the definition
 * of struct neighbour
 */
#define VIA_ALEN_ALIGN sizeof(unsigned long)
#define MAX_VIA_ALEN (ALIGN(MAX_ADDR_LEN, VIA_ALEN_ALIGN))

enum mpls_payload_type {
	MPT_UNSPEC, /* IPv4 or IPv6 */
	MPT_IPV4 = 4,
	MPT_IPV6 = 6,

	/* Other types not implemented:
	 *  - Pseudo-wire with or without control word (RFC4385)
	 *  - GAL (RFC5586)
	 */
};

struct mpls_nh { /* next hop label forwarding entry */
	struct net_device *nh_dev;

	/* nh_flags is accessed under RCU in the packet path; it is
	 * modified handling netdev events with rtnl lock held
	 */
	unsigned int nh_flags;
	u8 nh_labels;
	u8 nh_via_alen;
	u8 nh_via_table;
	u8 nh_reserved1;

	u32 nh_label[];
};

/* offset of via from beginning of mpls_nh */
#define MPLS_NH_VIA_OFF(num_labels) \
	ALIGN(sizeof(struct mpls_nh) + (num_labels) * sizeof(u32), \
	      VIA_ALEN_ALIGN)

/* all nexthops within a route have the same size based on the
 * max number of labels and max via length across all nexthops
 */
#define MPLS_NH_SIZE(num_labels, max_via_alen)	\
	(MPLS_NH_VIA_OFF((num_labels)) +	\
	 ALIGN((max_via_alen), VIA_ALEN_ALIGN))

enum mpls_ttl_propagation {
	MPLS_TTL_PROP_DEFAULT,
	MPLS_TTL_PROP_ENABLED,
	MPLS_TTL_PROP_DISABLED,
};

/* The route, nexthops and vias are stored together in the same memory
 * block:
 *
 * +----------------------+
 * | mpls_route           |
 * +----------------------+
 * | mpls_nh 0            |
 * +----------------------+
 * | alignment padding    |   4 bytes for odd number of labels
 * +----------------------+
 * | via[rt_max_alen] 0   |
 * +----------------------+
 * | alignment padding    |   via's aligned on sizeof(unsigned long)
 * +----------------------+
 * | ...                  |
 * +----------------------+
 * | mpls_nh n-1          |
 * +----------------------+
 * | via[rt_max_alen] n-1 |
 * +----------------------+
 */
struct mpls_route { /* next hop label forwarding entry */
	struct rcu_head		rt_rcu;
	u8			rt_protocol;
	u8			rt_payload_type;
	u8			rt_max_alen;
	u8			rt_ttl_propagate;
	u8			rt_nhn;
	/* rt_nhn_alive is accessed under RCU in the packet path; it
	 * is modified handling netdev events with rtnl lock held
	 */
	u8			rt_nhn_alive;
	u8			rt_nh_size;
	u8			rt_via_offset;
	u8			rt_reserved1;
	struct mpls_nh		rt_nh[];
};

#define for_nexthops(rt) {						\
	int nhsel; const struct mpls_nh *nh;				\
	for (nhsel = 0, nh = (rt)->rt_nh;				\
	     nhsel < (rt)->rt_nhn;					\
	     nh = (void *)nh + (rt)->rt_nh_size, nhsel++)

#define change_nexthops(rt) {						\
	int nhsel; struct mpls_nh *nh;					\
	for (nhsel = 0, nh = (rt)->rt_nh;				\
	     nhsel < (rt)->rt_nhn;					\
	     nh = (void *)nh + (rt)->rt_nh_size, nhsel++)

#define endfor_nexthops(rt) }

static inline struct mpls_entry_decoded mpls_entry_decode(struct mpls_shim_hdr *hdr)
{
	struct mpls_entry_decoded result;
	unsigned entry = be32_to_cpu(hdr->label_stack_entry);

	result.label = (entry & MPLS_LS_LABEL_MASK) >> MPLS_LS_LABEL_SHIFT;
	result.ttl = (entry & MPLS_LS_TTL_MASK) >> MPLS_LS_TTL_SHIFT;
	result.tc = (entry & MPLS_LS_TC_MASK) >> MPLS_LS_TC_SHIFT;
	result.bos = (entry & MPLS_LS_S_MASK) >> MPLS_LS_S_SHIFT;

	return result;
}

static inline struct mpls_dev *mpls_dev_get(const struct net_device *dev)
{
	return rcu_dereference_rtnl(dev->mpls_ptr);
}

int nla_put_labels(struct sk_buff *skb, int attrtype, u8 labels,
		   const u32 label[]);
int nla_get_labels(const struct nlattr *nla, u8 max_labels, u8 *labels,
		   u32 label[], struct netlink_ext_ack *extack);
bool mpls_output_possible(const struct net_device *dev);
unsigned int mpls_dev_mtu(const struct net_device *dev);
bool mpls_pkt_too_big(const struct sk_buff *skb, unsigned int mtu);
void mpls_stats_inc_outucastpkts(struct net_device *dev,
				 const struct sk_buff *skb);

#endif /* MPLS_INTERNAL_H */
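/*
 * Illustrative sketch (editor addition, not part of the original file):
 * decoding a label stack entry with mpls_entry_decode() above. The
 * 32-bit entry packs label (20 bits), TC (3), bottom-of-stack (1) and
 * TTL (8); the MPLS_LS_* masks and shifts come from the uapi header
 * <linux/mpls.h>. The entry value is example data only.
 */
#if 0
static void example_decode(void)
{
	struct mpls_shim_hdr hdr = {
		/* label 100, TC 0, bottom of stack, TTL 64 */
		.label_stack_entry =
			cpu_to_be32((100 << MPLS_LS_LABEL_SHIFT) |
				    (1 << MPLS_LS_S_SHIFT) |
				    (64 << MPLS_LS_TTL_SHIFT)),
	};
	struct mpls_entry_decoded dec = mpls_entry_decode(&hdr);

	/* here dec.label == 100, dec.bos == 1, dec.ttl == 64 */
}
#endif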
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
 * Queue read/write lock
 *
 * These use generic atomic and locking routines, but depend on a fair spinlock
 * implementation in order to be fair themselves. The implementation in
 * asm-generic/spinlock.h meets these requirements.
 *
 * (C) Copyright 2013-2014 Hewlett-Packard Development Company, L.P.
 *
 * Authors: Waiman Long <waiman.long@hp.com>
 */
#ifndef __ASM_GENERIC_QRWLOCK_H
#define __ASM_GENERIC_QRWLOCK_H

#include <linux/atomic.h>
#include <asm/barrier.h>
#include <asm/processor.h>

#include <asm-generic/qrwlock_types.h>

/* Must be included from asm/spinlock.h after defining arch_spin_is_locked. */

/*
 * Writer states & reader shift and bias.
 */
#define	_QW_WAITING	0x100		/* A writer is waiting	   */
#define	_QW_LOCKED	0x0ff		/* A writer holds the lock */
#define	_QW_WMASK	0x1ff		/* Writer mask		   */
#define	_QR_SHIFT	9		/* Reader count shift	   */
#define _QR_BIAS	(1U << _QR_SHIFT)

/*
 * External function declarations
 */
extern void queued_read_lock_slowpath(struct qrwlock *lock);
extern void queued_write_lock_slowpath(struct qrwlock *lock);

/**
 * queued_read_trylock - try to acquire read lock of a queued rwlock
 * @lock : Pointer to queued rwlock structure
 * Return: 1 if lock acquired, 0 if failed
 */
static inline int queued_read_trylock(struct qrwlock *lock)
{
	int cnts;

	cnts = atomic_read(&lock->cnts);
	if (likely(!(cnts & _QW_WMASK))) {
		cnts = (u32)atomic_add_return_acquire(_QR_BIAS, &lock->cnts);
		if (likely(!(cnts & _QW_WMASK)))
			return 1;
		atomic_sub(_QR_BIAS, &lock->cnts);
	}
	return 0;
}

/**
 * queued_write_trylock - try to acquire write lock of a queued rwlock
 * @lock : Pointer to queued rwlock structure
 * Return: 1 if lock acquired, 0 if failed
 */
static inline int queued_write_trylock(struct qrwlock *lock)
{
	int cnts;

	cnts = atomic_read(&lock->cnts);
	if (unlikely(cnts))
		return 0;

	return likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts,
				_QW_LOCKED));
}
/**
 * queued_read_lock - acquire read lock of a queued rwlock
 * @lock: Pointer to queued rwlock structure
 */
static inline void queued_read_lock(struct qrwlock *lock)
{
	int cnts;

	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts);
	if (likely(!(cnts & _QW_WMASK)))
		return;

	/* The slowpath will decrement the reader count, if necessary. */
	queued_read_lock_slowpath(lock);
}

/**
 * queued_write_lock - acquire write lock of a queued rwlock
 * @lock : Pointer to queued rwlock structure
 */
static inline void queued_write_lock(struct qrwlock *lock)
{
	int cnts = 0;
	/* Optimize for the unlocked case. */
	if (likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED)))
		return;

	queued_write_lock_slowpath(lock);
}

/**
 * queued_read_unlock - release read lock of a queued rwlock
 * @lock : Pointer to queued rwlock structure
 */
static inline void queued_read_unlock(struct qrwlock *lock)
{
	/*
	 * Atomically decrement the reader count
	 */
	(void)atomic_sub_return_release(_QR_BIAS, &lock->cnts);
}

/**
 * queued_write_unlock - release write lock of a queued rwlock
 * @lock : Pointer to queued rwlock structure
 */
static inline void queued_write_unlock(struct qrwlock *lock)
{
	smp_store_release(&lock->wlocked, 0);
}

/**
 * queued_rwlock_is_contended - check if the lock is contended
 * @lock : Pointer to queued rwlock structure
 * Return: 1 if lock contended, 0 otherwise
 */
static inline int queued_rwlock_is_contended(struct qrwlock *lock)
{
	return arch_spin_is_locked(&lock->wait_lock);
}

/*
 * Remapping rwlock architecture specific functions to the corresponding
 * queued rwlock functions.
 */
#define arch_read_lock(l)		queued_read_lock(l)
#define arch_write_lock(l)		queued_write_lock(l)
#define arch_read_trylock(l)		queued_read_trylock(l)
#define arch_write_trylock(l)		queued_write_trylock(l)
#define arch_read_unlock(l)		queued_read_unlock(l)
#define arch_write_unlock(l)		queued_write_unlock(l)
#define arch_rwlock_is_contended(l)	queued_rwlock_is_contended(l)

#endif /* __ASM_GENERIC_QRWLOCK_H */
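/*
 * Illustrative sketch (editor addition, not part of the original file):
 * the queued_*() functions above normally sit behind the arch_*()
 * macros and are reached through the generic read_lock()/write_lock()
 * API rather than called directly. Direct usage would look like this;
 * the lock initializer comes from asm-generic/qrwlock_types.h.
 */
#if 0
static struct qrwlock example_lock = __ARCH_RW_LOCK_UNLOCKED;

static int example_reader(const int *shared)
{
	int val;

	queued_read_lock(&example_lock);	/* shared with other readers */
	val = *shared;
	queued_read_unlock(&example_lock);
	return val;
}

static void example_writer(int *shared)
{
	queued_write_lock(&example_lock);	/* exclusive */
	*shared += 1;
	queued_write_unlock(&example_lock);
}
#endif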
| 20 29 11 6 6 11 6 1 3 3 3 3 3 277 277 277 277 1 1 1 1 1 3 3 3 3 3 15 14 15 14 15 15 15 15 15 15 13 12 11 12 12 11 11 17 17 12 12 5 5 17 17 17 17 17 17 17 17 17 15 16 16 16 16 15 15 15 15 15 15 15 1 2 2 2 4 4 17 17 10 10 10 10 10 10 10 10 10 10 10 10 10 6 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 28 6 1 1 1 5 5 6 5 5 5 5 5 5 2 5 5 5 5 5 5 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 3 1 1 3 3 2 1 1 3 17 16 16 14 14 4 4 1 14 3 3 14 44 45 44 2 2 2 2 2 2 2 28 28 28 28 28 28 28 28 2 2 2 15 16 16 16 16 14 10 10 3 10 8 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 
// SPDX-License-Identifier: GPL-2.0
/*
 *  gendisk handling
 *
 * Portions Copyright (C) 2020 Christoph Hellwig
 */

#include <linux/module.h>
#include <linux/ctype.h>
#include <linux/fs.h>
#include <linux/kdev_t.h>
#include <linux/kernel.h>
#include <linux/blkdev.h>
#include <linux/backing-dev.h>
#include <linux/init.h>
#include <linux/spinlock.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/slab.h>
#include <linux/kmod.h>
#include <linux/major.h>
#include <linux/mutex.h>
#include <linux/idr.h>
#include <linux/log2.h>
#include <linux/pm_runtime.h>
#include <linux/badblocks.h>
#include <linux/part_stat.h>
#include <linux/blktrace_api.h>

#include "blk-throttle.h"
#include "blk.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
#include "blk-cgroup.h"

static struct kobject *block_depr;

/*
 * Unique, monotonically increasing sequential number associated with block
 * device instances (i.e. incremented each time a device is attached).
 * Associating uevents with block devices in userspace is difficult and racy:
 * the uevent netlink socket is lossy, and on slow and overloaded systems has
 * a very high latency.
 * Block devices do not have exclusive owners in userspace, any process can set
 * one up (e.g. loop devices). Moreover, device names can be reused (e.g. loop0
 * can be reused again and again).
 * A userspace process setting up a block device and watching for its events
 * cannot thus reliably tell whether an event relates to the device it just set
 * up or another earlier instance with the same name.
 * This sequential number allows userspace processes to solve this problem, and
 * uniquely associate an uevent with the lifetime of a device.
 */
static atomic64_t diskseq;

/* for extended dynamic devt allocation, currently only one major is used */
#define NR_EXT_DEVT		(1 << MINORBITS)
static DEFINE_IDA(ext_devt_ida);

void set_capacity(struct gendisk *disk, sector_t sectors)
{
	if (sectors > BLK_DEV_MAX_SECTORS) {
		pr_warn_once("%s: truncate capacity from %lld to %lld\n",
			     disk->disk_name, sectors, BLK_DEV_MAX_SECTORS);
		sectors = BLK_DEV_MAX_SECTORS;
	}

	bdev_set_nr_sectors(disk->part0, sectors);
}
EXPORT_SYMBOL(set_capacity);

/*
 * Set disk capacity and notify if the size is not currently zero and will not
 * be set to zero.  Returns true if a uevent was sent, otherwise false.
 */
bool set_capacity_and_notify(struct gendisk *disk, sector_t size)
{
	sector_t capacity = get_capacity(disk);
	char *envp[] = { "RESIZE=1", NULL };

	set_capacity(disk, size);

	/*
	 * Only print a message and send a uevent if the gendisk is user
	 * visible and alive.  This avoids spamming the log and udev when
	 * setting the initial capacity during probing.
	 */
	if (size == capacity || !disk_live(disk) ||
	    (disk->flags & GENHD_FL_HIDDEN))
		return false;

	pr_info("%s: detected capacity change from %lld to %lld\n",
		disk->disk_name, capacity, size);

	/*
	 * Historically we did not send a uevent for changes to/from an empty
	 * device.
	 */
	if (!capacity || !size)
		return false;
	kobject_uevent_env(&disk_to_dev(disk)->kobj, KOBJ_CHANGE, envp);
	return true;
}
EXPORT_SYMBOL_GPL(set_capacity_and_notify);
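/*
 * [Editor's illustration, not part of genhd.c] A minimal userspace sketch of
 * the uevent-matching scheme the diskseq comment above describes: a monitor
 * that received "DISKSEQ=<n>" in a block uevent compares it with the current
 * /sys/block/<name>/diskseq attribute (exported later in this file) and
 * drops the event if the name has since been reused by a newer instance.
 * Device naming and error handling are simplified assumptions.
 */
#include <stdio.h>

static int uevent_is_for_current_instance(const char *name,
					  unsigned long long event_seq)
{
	char path[128];
	unsigned long long cur = 0;
	FILE *f;

	snprintf(path, sizeof(path), "/sys/block/%s/diskseq", name);
	f = fopen(path, "r");
	if (!f)
		return 0;		/* device is already gone */
	if (fscanf(f, "%llu", &cur) != 1)
		cur = 0;
	fclose(f);
	return cur == event_seq;	/* stale if the sequence moved on */
}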
static void part_stat_read_all(struct block_device *part,
			       struct disk_stats *stat)
{
	int cpu;

	memset(stat, 0, sizeof(struct disk_stats));
	for_each_possible_cpu(cpu) {
		struct disk_stats *ptr = per_cpu_ptr(part->bd_stats, cpu);
		int group;

		for (group = 0; group < NR_STAT_GROUPS; group++) {
			stat->nsecs[group] += ptr->nsecs[group];
			stat->sectors[group] += ptr->sectors[group];
			stat->ios[group] += ptr->ios[group];
			stat->merges[group] += ptr->merges[group];
		}

		stat->io_ticks += ptr->io_ticks;
	}
}

unsigned int part_in_flight(struct block_device *part)
{
	unsigned int inflight = 0;
	int cpu;

	for_each_possible_cpu(cpu) {
		inflight += part_stat_local_read_cpu(part, in_flight[0], cpu) +
			    part_stat_local_read_cpu(part, in_flight[1], cpu);
	}
	if ((int)inflight < 0)
		inflight = 0;

	return inflight;
}

static void part_in_flight_rw(struct block_device *part,
			      unsigned int inflight[2])
{
	int cpu;

	inflight[0] = 0;
	inflight[1] = 0;
	for_each_possible_cpu(cpu) {
		inflight[0] += part_stat_local_read_cpu(part, in_flight[0], cpu);
		inflight[1] += part_stat_local_read_cpu(part, in_flight[1], cpu);
	}
	if ((int)inflight[0] < 0)
		inflight[0] = 0;
	if ((int)inflight[1] < 0)
		inflight[1] = 0;
}

/*
 * Can be deleted altogether. Later.
 *
 */
#define BLKDEV_MAJOR_HASH_SIZE 255
static struct blk_major_name {
	struct blk_major_name *next;
	int major;
	char name[16];
#ifdef CONFIG_BLOCK_LEGACY_AUTOLOAD
	void (*probe)(dev_t devt);
#endif
} *major_names[BLKDEV_MAJOR_HASH_SIZE];
static DEFINE_MUTEX(major_names_lock);
static DEFINE_SPINLOCK(major_names_spinlock);

/* index in the above - for now: assume no multimajor ranges */
static inline int major_to_index(unsigned major)
{
	return major % BLKDEV_MAJOR_HASH_SIZE;
}

#ifdef CONFIG_PROC_FS
void blkdev_show(struct seq_file *seqf, off_t offset)
{
	struct blk_major_name *dp;

	spin_lock(&major_names_spinlock);
	for (dp = major_names[major_to_index(offset)]; dp; dp = dp->next)
		if (dp->major == offset)
			seq_printf(seqf, "%3d %s\n", dp->major, dp->name);
	spin_unlock(&major_names_spinlock);
}
#endif /* CONFIG_PROC_FS */
/**
 * __register_blkdev - register a new block device
 *
 * @major: the requested major device number [1..BLKDEV_MAJOR_MAX-1]. If
 *         @major = 0, try to allocate any unused major number.
 * @name: the name of the new block device as a zero terminated string
 * @probe: pre-devtmpfs / pre-udev callback used to create disks when their
 *	   pre-created device node is accessed. When a probe call uses
 *	   add_disk() and it fails the driver must cleanup resources. This
 *	   interface may soon be removed.
 *
 * The @name must be unique within the system.
 *
 * The return value depends on the @major input parameter:
 *
 *  - if a major device number was requested in range [1..BLKDEV_MAJOR_MAX-1]
 *    then the function returns zero on success, or a negative error code
 *  - if any unused major number was requested with @major = 0 parameter
 *    then the return value is the allocated major number in range
 *    [1..BLKDEV_MAJOR_MAX-1] or a negative error code otherwise
 *
 * See Documentation/admin-guide/devices.txt for the list of allocated
 * major numbers.
 *
 * Use register_blkdev instead for any new code.
 */
int __register_blkdev(unsigned int major, const char *name,
		void (*probe)(dev_t devt))
{
	struct blk_major_name **n, *p;
	int index, ret = 0;

	mutex_lock(&major_names_lock);

	/* temporary */
	if (major == 0) {
		for (index = ARRAY_SIZE(major_names)-1; index > 0; index--) {
			if (major_names[index] == NULL)
				break;
		}

		if (index == 0) {
			printk("%s: failed to get major for %s\n",
			       __func__, name);
			ret = -EBUSY;
			goto out;
		}
		major = index;
		ret = major;
	}

	if (major >= BLKDEV_MAJOR_MAX) {
		pr_err("%s: major requested (%u) is greater than the maximum (%u) for %s\n",
		       __func__, major, BLKDEV_MAJOR_MAX-1, name);
		ret = -EINVAL;
		goto out;
	}

	p = kmalloc(sizeof(struct blk_major_name), GFP_KERNEL);
	if (p == NULL) {
		ret = -ENOMEM;
		goto out;
	}

	p->major = major;
#ifdef CONFIG_BLOCK_LEGACY_AUTOLOAD
	p->probe = probe;
#endif
	strscpy(p->name, name, sizeof(p->name));
	p->next = NULL;
	index = major_to_index(major);

	spin_lock(&major_names_spinlock);
	for (n = &major_names[index]; *n; n = &(*n)->next) {
		if ((*n)->major == major)
			break;
	}
	if (!*n)
		*n = p;
	else
		ret = -EBUSY;
	spin_unlock(&major_names_spinlock);

	if (ret < 0) {
		printk("register_blkdev: cannot get major %u for %s\n",
		       major, name);
		kfree(p);
	}
out:
	mutex_unlock(&major_names_lock);
	return ret;
}
EXPORT_SYMBOL(__register_blkdev);

void unregister_blkdev(unsigned int major, const char *name)
{
	struct blk_major_name **n;
	struct blk_major_name *p = NULL;
	int index = major_to_index(major);

	mutex_lock(&major_names_lock);
	spin_lock(&major_names_spinlock);
	for (n = &major_names[index]; *n; n = &(*n)->next)
		if ((*n)->major == major)
			break;
	if (!*n || strcmp((*n)->name, name)) {
		WARN_ON(1);
	} else {
		p = *n;
		*n = p->next;
	}
	spin_unlock(&major_names_spinlock);
	mutex_unlock(&major_names_lock);
	kfree(p);
}
EXPORT_SYMBOL(unregister_blkdev);

int blk_alloc_ext_minor(void)
{
	int idx;

	idx = ida_alloc_range(&ext_devt_ida, 0, NR_EXT_DEVT - 1, GFP_KERNEL);
	if (idx == -ENOSPC)
		return -EBUSY;
	return idx;
}

void blk_free_ext_minor(unsigned int minor)
{
	ida_free(&ext_devt_ida, minor);
}

void disk_uevent(struct gendisk *disk, enum kobject_action action)
{
	struct block_device *part;
	unsigned long idx;

	rcu_read_lock();
	xa_for_each(&disk->part_tbl, idx, part) {
		if (bdev_is_partition(part) && !bdev_nr_sectors(part))
			continue;
		if (!kobject_get_unless_zero(&part->bd_device.kobj))
			continue;

		rcu_read_unlock();
		kobject_uevent(bdev_kobj(part), action);
		put_device(&part->bd_device);
		rcu_read_lock();
	}
	rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(disk_uevent);

int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
{
	struct file *file;
	int ret = 0;

	if (!disk_has_partscan(disk))
		return -EINVAL;
	if (disk->open_partitions)
		return -EBUSY;

	/*
	 * If the device is opened exclusively by current thread already, it's
	 * safe to scan partitions, otherwise, use bd_prepare_to_claim() to
	 * synchronize with other exclusive openers and other partition
	 * scanners.
	 */
	if (!(mode & BLK_OPEN_EXCL)) {
		ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions,
					  NULL);
		if (ret)
			return ret;
	}

	set_bit(GD_NEED_PART_SCAN, &disk->state);
	file = bdev_file_open_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL,
				     NULL, NULL);
	if (IS_ERR(file))
		ret = PTR_ERR(file);
	else
		fput(file);

	/*
	 * If bdev_file_open_by_dev() failed early, GD_NEED_PART_SCAN is still
	 * set, and a later re-assembly of a partitioned raid device would
	 * create partitions for the underlying disk.
	 */
	clear_bit(GD_NEED_PART_SCAN, &disk->state);
	if (!(mode & BLK_OPEN_EXCL))
		bd_abort_claiming(disk->part0, disk_scan_partitions);
	return ret;
}
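/*
 * [Editor's illustration, not part of genhd.c] Typical driver-side use of
 * the registration API above.  register_blkdev() is assumed to be the usual
 * wrapper that passes a NULL @probe to __register_blkdev(), per the "use
 * register_blkdev instead" note; "mydrv" and the module hooks are
 * hypothetical names.
 */
static int mydrv_major;

static int __init mydrv_init(void)
{
	/* major == 0: allocate any unused major, returned on success */
	mydrv_major = register_blkdev(0, "mydrv");
	if (mydrv_major < 0)
		return mydrv_major;
	return 0;
}

static void __exit mydrv_exit(void)
{
	unregister_blkdev(mydrv_major, "mydrv");
}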
/**
 * add_disk_fwnode - add disk information to kernel list with fwnode
 * @parent: parent device for the disk
 * @disk: per-device partitioning information
 * @groups: Additional per-device sysfs groups
 * @fwnode: attached disk fwnode
 *
 * This function registers the partitioning information in @disk
 * with the kernel. Also attach a fwnode to the disk device.
 */
int __must_check add_disk_fwnode(struct device *parent, struct gendisk *disk,
				 const struct attribute_group **groups,
				 struct fwnode_handle *fwnode)
{
	struct device *ddev = disk_to_dev(disk);
	int ret;

	if (WARN_ON_ONCE(bdev_nr_sectors(disk->part0) > BLK_DEV_MAX_SECTORS))
		return -EINVAL;

	if (queue_is_mq(disk->queue)) {
		/*
		 * ->submit_bio and ->poll_bio are bypassed for blk-mq drivers.
		 */
		if (disk->fops->submit_bio || disk->fops->poll_bio)
			return -EINVAL;

		/*
		 * Initialize the I/O scheduler code and pick a default one if
		 * needed.
		 */
		elevator_init_mq(disk->queue);
	} else {
		if (!disk->fops->submit_bio)
			return -EINVAL;
		bdev_set_flag(disk->part0, BD_HAS_SUBMIT_BIO);
	}

	/*
	 * If the driver provides an explicit major number it also must provide
	 * the number of minor numbers supported, and those will be used to
	 * setup the gendisk.
	 * Otherwise just allocate the device numbers for both the whole device
	 * and all partitions from the extended dev_t space.
	 */
	ret = -EINVAL;
	if (disk->major) {
		if (WARN_ON(!disk->minors))
			goto out_exit_elevator;

		if (disk->minors > DISK_MAX_PARTS) {
			pr_err("block: can't allocate more than %d partitions\n",
			       DISK_MAX_PARTS);
			disk->minors = DISK_MAX_PARTS;
		}
		if (disk->first_minor > MINORMASK ||
		    disk->minors > MINORMASK + 1 ||
		    disk->first_minor + disk->minors > MINORMASK + 1)
			goto out_exit_elevator;
	} else {
		if (WARN_ON(disk->minors))
			goto out_exit_elevator;

		ret = blk_alloc_ext_minor();
		if (ret < 0)
			goto out_exit_elevator;
		disk->major = BLOCK_EXT_MAJOR;
		disk->first_minor = ret;
	}

	/* delay uevents, until we scanned partition table */
	dev_set_uevent_suppress(ddev, 1);

	ddev->parent = parent;
	ddev->groups = groups;
	dev_set_name(ddev, "%s", disk->disk_name);
	if (fwnode)
		device_set_node(ddev, fwnode);
	if (!(disk->flags & GENHD_FL_HIDDEN))
		ddev->devt = MKDEV(disk->major, disk->first_minor);
	ret = device_add(ddev);
	if (ret)
		goto out_free_ext_minor;

	ret = disk_alloc_events(disk);
	if (ret)
		goto out_device_del;

	ret = sysfs_create_link(block_depr, &ddev->kobj,
				kobject_name(&ddev->kobj));
	if (ret)
		goto out_device_del;

	/*
	 * avoid probable deadlock caused by allocating memory with
	 * GFP_KERNEL in runtime_resume callback of its all ancestor
	 * devices
	 */
	pm_runtime_set_memalloc_noio(ddev, true);

	disk->part0->bd_holder_dir =
		kobject_create_and_add("holders", &ddev->kobj);
	if (!disk->part0->bd_holder_dir) {
		ret = -ENOMEM;
		goto out_del_block_link;
	}
	disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
	if (!disk->slave_dir) {
		ret = -ENOMEM;
		goto out_put_holder_dir;
	}

	ret = blk_register_queue(disk);
	if (ret)
		goto out_put_slave_dir;

	if (!(disk->flags & GENHD_FL_HIDDEN)) {
		ret = bdi_register(disk->bdi, "%u:%u",
				   disk->major, disk->first_minor);
		if (ret)
			goto out_unregister_queue;
		bdi_set_owner(disk->bdi, ddev);
		ret = sysfs_create_link(&ddev->kobj,
					&disk->bdi->dev->kobj, "bdi");
		if (ret)
			goto out_unregister_bdi;

		/* Make sure the first partition scan will proceed */
		if (get_capacity(disk) && disk_has_partscan(disk))
			set_bit(GD_NEED_PART_SCAN, &disk->state);

		bdev_add(disk->part0, ddev->devt);
		if (get_capacity(disk))
			disk_scan_partitions(disk, BLK_OPEN_READ);

		/*
		 * Announce the disk and partitions after all partitions are
		 * created. (for hidden disks uevents remain suppressed forever)
		 */
		dev_set_uevent_suppress(ddev, 0);
		disk_uevent(disk, KOBJ_ADD);
	} else {
		/*
		 * Even if the block_device for a hidden gendisk is not
		 * registered, it needs to have a valid bd_dev so that the
		 * freeing of the dynamic major works.
		 */
		disk->part0->bd_dev = MKDEV(disk->major, disk->first_minor);
	}

	blk_apply_bdi_limits(disk->bdi, &disk->queue->limits);
	disk_add_events(disk);
	set_bit(GD_ADDED, &disk->state);
	return 0;

out_unregister_bdi:
	if (!(disk->flags & GENHD_FL_HIDDEN))
		bdi_unregister(disk->bdi);
out_unregister_queue:
	blk_unregister_queue(disk);
	rq_qos_exit(disk->queue);
out_put_slave_dir:
	kobject_put(disk->slave_dir);
	disk->slave_dir = NULL;
out_put_holder_dir:
	kobject_put(disk->part0->bd_holder_dir);
out_del_block_link:
	sysfs_remove_link(block_depr, dev_name(ddev));
	pm_runtime_set_memalloc_noio(ddev, false);
out_device_del:
	device_del(ddev);
out_free_ext_minor:
	if (disk->major == BLOCK_EXT_MAJOR)
		blk_free_ext_minor(disk->first_minor);
out_exit_elevator:
	if (disk->queue->elevator) {
		mutex_lock(&disk->queue->elevator_lock);
		elevator_exit(disk->queue);
		mutex_unlock(&disk->queue->elevator_lock);
	}
	return ret;
}
EXPORT_SYMBOL_GPL(add_disk_fwnode);

/**
 * device_add_disk - add disk information to kernel list
 * @parent: parent device for the disk
 * @disk: per-device partitioning information
 * @groups: Additional per-device sysfs groups
 *
 * This function registers the partitioning information in @disk
 * with the kernel.
 */
int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
				 const struct attribute_group **groups)
{
	return add_disk_fwnode(parent, disk, groups, NULL);
}
EXPORT_SYMBOL(device_add_disk);
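/*
 * [Editor's illustration, not part of genhd.c] A minimal bring-up sketch for
 * a bio-based driver around device_add_disk().  blk_alloc_disk() is assumed
 * to be the public wrapper around __blk_alloc_disk() (defined later in this
 * file); all my_* names are hypothetical, and my_block_ops would have to
 * provide ->submit_bio.
 */
static const struct block_device_operations my_block_ops; /* sketch only */

static int my_probe(struct device *parent, sector_t nr_sectors)
{
	struct gendisk *disk;
	int ret;

	disk = blk_alloc_disk(NULL, NUMA_NO_NODE);	/* default limits */
	if (IS_ERR(disk))
		return PTR_ERR(disk);

	disk->fops = &my_block_ops;
	/* major/minors left at 0: device numbers come from the ext space */
	snprintf(disk->disk_name, sizeof(disk->disk_name), "mydisk0");
	set_capacity(disk, nr_sectors);

	ret = device_add_disk(parent, disk, NULL);
	if (ret)
		put_disk(disk);		/* undo the allocation on failure */
	return ret;
}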
static void blk_report_disk_dead(struct gendisk *disk, bool surprise)
{
	struct block_device *bdev;
	unsigned long idx;

	/*
	 * On surprise disk removal, bdev_mark_dead() may call into file
	 * systems below. Make it clear that we're expecting to not hold
	 * disk->open_mutex.
	 */
	lockdep_assert_not_held(&disk->open_mutex);

	rcu_read_lock();
	xa_for_each(&disk->part_tbl, idx, bdev) {
		if (!kobject_get_unless_zero(&bdev->bd_device.kobj))
			continue;
		rcu_read_unlock();

		bdev_mark_dead(bdev, surprise);

		put_device(&bdev->bd_device);
		rcu_read_lock();
	}
	rcu_read_unlock();
}

static bool __blk_mark_disk_dead(struct gendisk *disk)
{
	/*
	 * Fail any new I/O.
	 */
	if (test_and_set_bit(GD_DEAD, &disk->state))
		return false;

	if (test_bit(GD_OWNS_QUEUE, &disk->state))
		blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);

	/*
	 * Stop buffered writers from dirtying pages that can't be written out.
	 */
	set_capacity(disk, 0);

	/*
	 * Prevent new I/O from crossing bio_queue_enter().
	 */
	return blk_queue_start_drain(disk->queue);
}

/**
 * blk_mark_disk_dead - mark a disk as dead
 * @disk: disk to mark as dead
 *
 * Mark the disk as dead (e.g. surprise removed) and don't accept any new I/O
 * to this disk.
 */
void blk_mark_disk_dead(struct gendisk *disk)
{
	__blk_mark_disk_dead(disk);
	blk_report_disk_dead(disk, true);
}
EXPORT_SYMBOL_GPL(blk_mark_disk_dead);
/**
 * del_gendisk - remove the gendisk
 * @disk: the struct gendisk to remove
 *
 * Removes the gendisk and all its associated resources. This deletes the
 * partitions associated with the gendisk, and unregisters the associated
 * request_queue.
 *
 * This is the counter to the respective __device_add_disk() call.
 *
 * The final removal of the struct gendisk happens when its refcount reaches 0
 * with put_disk(), which should be called after del_gendisk(), if
 * __device_add_disk() was used.
 *
 * Drivers exist which depend on the release of the gendisk to be synchronous,
 * it should not be deferred.
 *
 * Context: can sleep
 */
void del_gendisk(struct gendisk *disk)
{
	struct request_queue *q = disk->queue;
	struct block_device *part;
	unsigned long idx;
	bool start_drain;

	might_sleep();

	if (WARN_ON_ONCE(!disk_live(disk) && !(disk->flags & GENHD_FL_HIDDEN)))
		return;

	disk_del_events(disk);

	/*
	 * Prevent new openers by unhashing the bdev inodes.
	 */
	mutex_lock(&disk->open_mutex);
	xa_for_each(&disk->part_tbl, idx, part)
		bdev_unhash(part);
	mutex_unlock(&disk->open_mutex);

	/*
	 * Tell the file system to write back all dirty data and shut down if
	 * it hasn't been notified earlier.
	 */
	if (!test_bit(GD_DEAD, &disk->state))
		blk_report_disk_dead(disk, false);

	/*
	 * Drop all partitions now that the disk is marked dead.
	 */
	mutex_lock(&disk->open_mutex);
	start_drain = __blk_mark_disk_dead(disk);
	if (start_drain)
		blk_freeze_acquire_lock(q);
	xa_for_each_start(&disk->part_tbl, idx, part, 1)
		drop_partition(part);
	mutex_unlock(&disk->open_mutex);

	if (!(disk->flags & GENHD_FL_HIDDEN)) {
		sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");

		/*
		 * Unregister bdi before releasing device numbers (as they can
		 * get reused and we'd get clashes in sysfs).
		 */
		bdi_unregister(disk->bdi);
	}

	blk_unregister_queue(disk);

	kobject_put(disk->part0->bd_holder_dir);
	kobject_put(disk->slave_dir);
	disk->slave_dir = NULL;

	part_stat_set_all(disk->part0, 0);
	disk->part0->bd_stamp = 0;
	sysfs_remove_link(block_depr, dev_name(disk_to_dev(disk)));
	pm_runtime_set_memalloc_noio(disk_to_dev(disk), false);
	device_del(disk_to_dev(disk));

	blk_mq_freeze_queue_wait(q);

	blk_throtl_cancel_bios(disk);

	blk_sync_queue(q);
	blk_flush_integrity();

	if (queue_is_mq(q))
		blk_mq_cancel_work_sync(q);

	blk_mq_quiesce_queue(q);
	if (q->elevator) {
		mutex_lock(&q->elevator_lock);
		elevator_exit(q);
		mutex_unlock(&q->elevator_lock);
	}
	rq_qos_exit(q);
	blk_mq_unquiesce_queue(q);

	/*
	 * If the disk does not own the queue, allow using passthrough requests
	 * again.  Else leave the queue frozen to fail all I/O.
	 */
	if (!test_bit(GD_OWNS_QUEUE, &disk->state))
		__blk_mq_unfreeze_queue(q, true);
	else if (queue_is_mq(q))
		blk_mq_exit_queue(q);

	if (start_drain)
		blk_unfreeze_release_lock(q);
}
EXPORT_SYMBOL(del_gendisk);

/**
 * invalidate_disk - invalidate the disk
 * @disk: the struct gendisk to invalidate
 *
 * A helper to invalidate the disk. It will clean the disk's associated
 * buffer/page caches and reset its internal states so that the disk
 * can be reused by the drivers.
 *
 * Context: can sleep
 */
void invalidate_disk(struct gendisk *disk)
{
	struct block_device *bdev = disk->part0;

	invalidate_bdev(bdev);
	bdev->bd_mapping->wb_err = 0;
	set_capacity(disk, 0);
}
EXPORT_SYMBOL(invalidate_disk);
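/*
 * [Editor's illustration, not part of genhd.c] The removal ordering implied
 * by the comments above, sketched for a hypothetical driver: a surprise
 * removal fails all new I/O up front via blk_mark_disk_dead(), then the
 * normal del_gendisk()/put_disk() pair runs.
 */
struct my_dev {
	struct gendisk *disk;
};

static void my_remove(struct my_dev *md, bool surprise)
{
	if (surprise)
		blk_mark_disk_dead(md->disk);	/* fail new I/O immediately */
	del_gendisk(md->disk);	/* unregister, drop partitions, drain */
	put_disk(md->disk);	/* drop the final gendisk reference */
}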
/*
 * sysfs access to bad-blocks list.
 */
static ssize_t disk_badblocks_show(struct device *dev,
				   struct device_attribute *attr, char *page)
{
	struct gendisk *disk = dev_to_disk(dev);

	if (!disk->bb)
		return sysfs_emit(page, "\n");

	return badblocks_show(disk->bb, page, 0);
}

static ssize_t disk_badblocks_store(struct device *dev,
				    struct device_attribute *attr,
				    const char *page, size_t len)
{
	struct gendisk *disk = dev_to_disk(dev);

	if (!disk->bb)
		return -ENXIO;

	return badblocks_store(disk->bb, page, len, 0);
}

#ifdef CONFIG_BLOCK_LEGACY_AUTOLOAD
static bool blk_probe_dev(dev_t devt)
{
	unsigned int major = MAJOR(devt);
	struct blk_major_name **n;

	mutex_lock(&major_names_lock);
	for (n = &major_names[major_to_index(major)]; *n; n = &(*n)->next) {
		if ((*n)->major == major && (*n)->probe) {
			(*n)->probe(devt);
			mutex_unlock(&major_names_lock);
			return true;
		}
	}
	mutex_unlock(&major_names_lock);
	return false;
}

void blk_request_module(dev_t devt)
{
	int error;

	if (blk_probe_dev(devt))
		return;

	error = request_module("block-major-%d-%d", MAJOR(devt), MINOR(devt));
	/* Make old-style 2.4 aliases work */
	if (error > 0)
		error = request_module("block-major-%d", MAJOR(devt));
	if (!error)
		blk_probe_dev(devt);
}
#endif /* CONFIG_BLOCK_LEGACY_AUTOLOAD */

#ifdef CONFIG_PROC_FS
/* iterator */
static void *disk_seqf_start(struct seq_file *seqf, loff_t *pos)
{
	loff_t skip = *pos;
	struct class_dev_iter *iter;
	struct device *dev;

	iter = kmalloc(sizeof(*iter), GFP_KERNEL);
	if (!iter)
		return ERR_PTR(-ENOMEM);

	seqf->private = iter;
	class_dev_iter_init(iter, &block_class, NULL, &disk_type);
	do {
		dev = class_dev_iter_next(iter);
		if (!dev)
			return NULL;
	} while (skip--);

	return dev_to_disk(dev);
}

static void *disk_seqf_next(struct seq_file *seqf, void *v, loff_t *pos)
{
	struct device *dev;

	(*pos)++;
	dev = class_dev_iter_next(seqf->private);
	if (dev)
		return dev_to_disk(dev);

	return NULL;
}

static void disk_seqf_stop(struct seq_file *seqf, void *v)
{
	struct class_dev_iter *iter = seqf->private;

	/* stop is called even after start failed :-( */
	if (iter) {
		class_dev_iter_exit(iter);
		kfree(iter);
		seqf->private = NULL;
	}
}

static void *show_partition_start(struct seq_file *seqf, loff_t *pos)
{
	void *p;

	p = disk_seqf_start(seqf, pos);
	if (!IS_ERR_OR_NULL(p) && !*pos)
		seq_puts(seqf, "major minor  #blocks  name\n\n");
	return p;
}

static int show_partition(struct seq_file *seqf, void *v)
{
	struct gendisk *sgp = v;
	struct block_device *part;
	unsigned long idx;

	if (!get_capacity(sgp) || (sgp->flags & GENHD_FL_HIDDEN))
		return 0;

	rcu_read_lock();
	xa_for_each(&sgp->part_tbl, idx, part) {
		if (!bdev_nr_sectors(part))
			continue;
		seq_printf(seqf, "%4d  %7d %10llu %pg\n",
			   MAJOR(part->bd_dev), MINOR(part->bd_dev),
			   bdev_nr_sectors(part) >> 1, part);
	}
	rcu_read_unlock();
	return 0;
}

static const struct seq_operations partitions_op = {
	.start	= show_partition_start,
	.next	= disk_seqf_next,
	.stop	= disk_seqf_stop,
	.show	= show_partition
};
#endif

static int __init genhd_device_init(void)
{
	int error;

	error = class_register(&block_class);
	if (unlikely(error))
		return error;
	blk_dev_init();

	register_blkdev(BLOCK_EXT_MAJOR, "blkext");

	/* create top-level block dir */
	block_depr = kobject_create_and_add("block", NULL);
	return 0;
}

subsys_initcall(genhd_device_init);

static ssize_t disk_range_show(struct device *dev,
			       struct device_attribute *attr, char *buf)
{
	struct gendisk *disk = dev_to_disk(dev);

	return sysfs_emit(buf, "%d\n", disk->minors);
}

static ssize_t disk_ext_range_show(struct device *dev,
				   struct device_attribute *attr, char *buf)
{
	struct gendisk *disk = dev_to_disk(dev);

	return sysfs_emit(buf, "%d\n",
		(disk->flags & GENHD_FL_NO_PART) ? 1 : DISK_MAX_PARTS);
}
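/*
 * [Editor's note, not part of genhd.c] blk_request_module() above autoloads
 * a driver by modprobe alias.  A driver that owns a fixed major opts in by
 * exporting a matching alias; MODULE_ALIAS_BLOCKDEV_MAJOR() (assumed from
 * <linux/kdev_t.h>) expands to MODULE_ALIAS("block-major-<nr>-*"), which
 * matches the "block-major-%d-%d" pattern requested above.  The major
 * number 42 is a made-up example.
 */
MODULE_ALIAS_BLOCKDEV_MAJOR(42);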
sysfs_emit(buf, "%d\n", (disk->flags & GENHD_FL_NO_PART) ? 1 : DISK_MAX_PARTS); } static ssize_t disk_removable_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sysfs_emit(buf, "%d\n", (disk->flags & GENHD_FL_REMOVABLE ? 1 : 0)); } static ssize_t disk_hidden_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sysfs_emit(buf, "%d\n", (disk->flags & GENHD_FL_HIDDEN ? 1 : 0)); } static ssize_t disk_ro_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sysfs_emit(buf, "%d\n", get_disk_ro(disk) ? 1 : 0); } ssize_t part_size_show(struct device *dev, struct device_attribute *attr, char *buf) { return sysfs_emit(buf, "%llu\n", bdev_nr_sectors(dev_to_bdev(dev))); } ssize_t part_stat_show(struct device *dev, struct device_attribute *attr, char *buf) { struct block_device *bdev = dev_to_bdev(dev); struct disk_stats stat; unsigned int inflight; inflight = part_in_flight(bdev); if (inflight) { part_stat_lock(); update_io_ticks(bdev, jiffies, true); part_stat_unlock(); } part_stat_read_all(bdev, &stat); return sysfs_emit(buf, "%8lu %8lu %8llu %8u " "%8lu %8lu %8llu %8u " "%8u %8u %8u " "%8lu %8lu %8llu %8u " "%8lu %8u" "\n", stat.ios[STAT_READ], stat.merges[STAT_READ], (unsigned long long)stat.sectors[STAT_READ], (unsigned int)div_u64(stat.nsecs[STAT_READ], NSEC_PER_MSEC), stat.ios[STAT_WRITE], stat.merges[STAT_WRITE], (unsigned long long)stat.sectors[STAT_WRITE], (unsigned int)div_u64(stat.nsecs[STAT_WRITE], NSEC_PER_MSEC), inflight, jiffies_to_msecs(stat.io_ticks), (unsigned int)div_u64(stat.nsecs[STAT_READ] + stat.nsecs[STAT_WRITE] + stat.nsecs[STAT_DISCARD] + stat.nsecs[STAT_FLUSH], NSEC_PER_MSEC), stat.ios[STAT_DISCARD], stat.merges[STAT_DISCARD], (unsigned long long)stat.sectors[STAT_DISCARD], (unsigned int)div_u64(stat.nsecs[STAT_DISCARD], NSEC_PER_MSEC), stat.ios[STAT_FLUSH], (unsigned int)div_u64(stat.nsecs[STAT_FLUSH], NSEC_PER_MSEC)); } ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr, char *buf) { struct block_device *bdev = dev_to_bdev(dev); struct request_queue *q = bdev_get_queue(bdev); unsigned int inflight[2]; if (queue_is_mq(q)) blk_mq_in_flight_rw(q, bdev, inflight); else part_in_flight_rw(bdev, inflight); return sysfs_emit(buf, "%8u %8u\n", inflight[0], inflight[1]); } static ssize_t disk_capability_show(struct device *dev, struct device_attribute *attr, char *buf) { dev_warn_once(dev, "the capability attribute has been deprecated.\n"); return sysfs_emit(buf, "0\n"); } static ssize_t disk_alignment_offset_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sysfs_emit(buf, "%d\n", bdev_alignment_offset(disk->part0)); } static ssize_t disk_discard_alignment_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sysfs_emit(buf, "%d\n", bdev_alignment_offset(disk->part0)); } static ssize_t diskseq_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sysfs_emit(buf, "%llu\n", disk->diskseq); } static ssize_t partscan_show(struct device *dev, struct device_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", disk_has_partscan(dev_to_disk(dev))); } static DEVICE_ATTR(range, 0444, disk_range_show, NULL); static DEVICE_ATTR(ext_range, 0444, disk_ext_range_show, NULL); static 
static DEVICE_ATTR(range, 0444, disk_range_show, NULL);
static DEVICE_ATTR(ext_range, 0444, disk_ext_range_show, NULL);
static DEVICE_ATTR(removable, 0444, disk_removable_show, NULL);
static DEVICE_ATTR(hidden, 0444, disk_hidden_show, NULL);
static DEVICE_ATTR(ro, 0444, disk_ro_show, NULL);
static DEVICE_ATTR(size, 0444, part_size_show, NULL);
static DEVICE_ATTR(alignment_offset, 0444, disk_alignment_offset_show, NULL);
static DEVICE_ATTR(discard_alignment, 0444, disk_discard_alignment_show, NULL);
static DEVICE_ATTR(capability, 0444, disk_capability_show, NULL);
static DEVICE_ATTR(stat, 0444, part_stat_show, NULL);
static DEVICE_ATTR(inflight, 0444, part_inflight_show, NULL);
static DEVICE_ATTR(badblocks, 0644, disk_badblocks_show, disk_badblocks_store);
static DEVICE_ATTR(diskseq, 0444, diskseq_show, NULL);
static DEVICE_ATTR(partscan, 0444, partscan_show, NULL);

#ifdef CONFIG_FAIL_MAKE_REQUEST
ssize_t part_fail_show(struct device *dev,
		       struct device_attribute *attr, char *buf)
{
	return sysfs_emit(buf, "%d\n",
			  bdev_test_flag(dev_to_bdev(dev), BD_MAKE_IT_FAIL));
}

ssize_t part_fail_store(struct device *dev,
			struct device_attribute *attr,
			const char *buf, size_t count)
{
	int i;

	if (count > 0 && sscanf(buf, "%d", &i) > 0) {
		if (i)
			bdev_set_flag(dev_to_bdev(dev), BD_MAKE_IT_FAIL);
		else
			bdev_clear_flag(dev_to_bdev(dev), BD_MAKE_IT_FAIL);
	}
	return count;
}

static struct device_attribute dev_attr_fail =
	__ATTR(make-it-fail, 0644, part_fail_show, part_fail_store);
#endif /* CONFIG_FAIL_MAKE_REQUEST */

#ifdef CONFIG_FAIL_IO_TIMEOUT
static struct device_attribute dev_attr_fail_timeout =
	__ATTR(io-timeout-fail, 0644, part_timeout_show, part_timeout_store);
#endif

static struct attribute *disk_attrs[] = {
	&dev_attr_range.attr,
	&dev_attr_ext_range.attr,
	&dev_attr_removable.attr,
	&dev_attr_hidden.attr,
	&dev_attr_ro.attr,
	&dev_attr_size.attr,
	&dev_attr_alignment_offset.attr,
	&dev_attr_discard_alignment.attr,
	&dev_attr_capability.attr,
	&dev_attr_stat.attr,
	&dev_attr_inflight.attr,
	&dev_attr_badblocks.attr,
	&dev_attr_events.attr,
	&dev_attr_events_async.attr,
	&dev_attr_events_poll_msecs.attr,
	&dev_attr_diskseq.attr,
	&dev_attr_partscan.attr,
#ifdef CONFIG_FAIL_MAKE_REQUEST
	&dev_attr_fail.attr,
#endif
#ifdef CONFIG_FAIL_IO_TIMEOUT
	&dev_attr_fail_timeout.attr,
#endif
	NULL
};

static umode_t disk_visible(struct kobject *kobj, struct attribute *a, int n)
{
	struct device *dev = container_of(kobj, typeof(*dev), kobj);
	struct gendisk *disk = dev_to_disk(dev);

	if (a == &dev_attr_badblocks.attr && !disk->bb)
		return 0;
	return a->mode;
}

static struct attribute_group disk_attr_group = {
	.attrs = disk_attrs,
	.is_visible = disk_visible,
};

static const struct attribute_group *disk_attr_groups[] = {
	&disk_attr_group,
#ifdef CONFIG_BLK_DEV_IO_TRACE
	&blk_trace_attr_group,
#endif
#ifdef CONFIG_BLK_DEV_INTEGRITY
	&blk_integrity_attr_group,
#endif
	NULL
};
/**
 * disk_release - releases all allocated resources of the gendisk
 * @dev: the device representing this disk
 *
 * This function releases all allocated resources of the gendisk.
 *
 * Drivers which used __device_add_disk() have a gendisk with a request_queue
 * assigned. Since the request_queue sits on top of the gendisk for these
 * drivers we also call blk_put_queue() for them, and we expect the
 * request_queue refcount to reach 0 at this point, and so the request_queue
 * will also be freed prior to the disk.
 *
 * Context: can sleep
 */
static void disk_release(struct device *dev)
{
	struct gendisk *disk = dev_to_disk(dev);

	might_sleep();
	WARN_ON_ONCE(disk_live(disk));

	blk_trace_remove(disk->queue);

	/*
	 * To undo all the initialization from blk_mq_init_allocated_queue in
	 * case of a probe failure where add_disk is never called we have to
	 * call blk_mq_exit_queue here. We can't do this for the more common
	 * teardown case (yet) as the tagset can be gone by the time the disk
	 * is released once it was added.
	 */
	if (queue_is_mq(disk->queue) &&
	    test_bit(GD_OWNS_QUEUE, &disk->state) &&
	    !test_bit(GD_ADDED, &disk->state))
		blk_mq_exit_queue(disk->queue);

	blkcg_exit_disk(disk);

	bioset_exit(&disk->bio_split);

	disk_release_events(disk);
	kfree(disk->random);
	disk_free_zone_resources(disk);
	xa_destroy(&disk->part_tbl);

	disk->queue->disk = NULL;
	blk_put_queue(disk->queue);

	if (test_bit(GD_ADDED, &disk->state) && disk->fops->free_disk)
		disk->fops->free_disk(disk);

	bdev_drop(disk->part0);	/* frees the disk */
}

static int block_uevent(const struct device *dev, struct kobj_uevent_env *env)
{
	const struct gendisk *disk = dev_to_disk(dev);

	return add_uevent_var(env, "DISKSEQ=%llu", disk->diskseq);
}

const struct class block_class = {
	.name		= "block",
	.dev_uevent	= block_uevent,
};

static char *block_devnode(const struct device *dev, umode_t *mode,
			   kuid_t *uid, kgid_t *gid)
{
	struct gendisk *disk = dev_to_disk(dev);

	if (disk->fops->devnode)
		return disk->fops->devnode(disk, mode);
	return NULL;
}

const struct device_type disk_type = {
	.name		= "disk",
	.groups		= disk_attr_groups,
	.release	= disk_release,
	.devnode	= block_devnode,
};
#ifdef CONFIG_PROC_FS
/*
 * aggregate disk stat collector.  Uses the same stats that the sysfs
 * entries do, above, but makes them available through one seq_file.
 *
 * The output looks suspiciously like /proc/partitions with a bunch of
 * extra fields.
 */
static int diskstats_show(struct seq_file *seqf, void *v)
{
	struct gendisk *gp = v;
	struct block_device *hd;
	unsigned int inflight;
	struct disk_stats stat;
	unsigned long idx;

	/*
	if (&disk_to_dev(gp)->kobj.entry == block_class.devices.next)
		seq_puts(seqf,	"major minor name"
				"     rio rmerge rsect ruse wio wmerge "
				"wsect wuse running use aveq"
				"\n\n");
	*/

	rcu_read_lock();
	xa_for_each(&gp->part_tbl, idx, hd) {
		if (bdev_is_partition(hd) && !bdev_nr_sectors(hd))
			continue;

		inflight = part_in_flight(hd);
		if (inflight) {
			part_stat_lock();
			update_io_ticks(hd, jiffies, true);
			part_stat_unlock();
		}
		part_stat_read_all(hd, &stat);
		seq_put_decimal_ull_width(seqf, "", MAJOR(hd->bd_dev), 4);
		seq_put_decimal_ull_width(seqf, " ", MINOR(hd->bd_dev), 7);
		seq_printf(seqf, " %pg", hd);
		seq_put_decimal_ull(seqf, " ", stat.ios[STAT_READ]);
		seq_put_decimal_ull(seqf, " ", stat.merges[STAT_READ]);
		seq_put_decimal_ull(seqf, " ", stat.sectors[STAT_READ]);
		seq_put_decimal_ull(seqf, " ",
			(unsigned int)div_u64(stat.nsecs[STAT_READ],
					      NSEC_PER_MSEC));
		seq_put_decimal_ull(seqf, " ", stat.ios[STAT_WRITE]);
		seq_put_decimal_ull(seqf, " ", stat.merges[STAT_WRITE]);
		seq_put_decimal_ull(seqf, " ", stat.sectors[STAT_WRITE]);
		seq_put_decimal_ull(seqf, " ",
			(unsigned int)div_u64(stat.nsecs[STAT_WRITE],
					      NSEC_PER_MSEC));
		seq_put_decimal_ull(seqf, " ", inflight);
		seq_put_decimal_ull(seqf, " ", jiffies_to_msecs(stat.io_ticks));
		seq_put_decimal_ull(seqf, " ",
			(unsigned int)div_u64(stat.nsecs[STAT_READ] +
					      stat.nsecs[STAT_WRITE] +
					      stat.nsecs[STAT_DISCARD] +
					      stat.nsecs[STAT_FLUSH],
					      NSEC_PER_MSEC));
		seq_put_decimal_ull(seqf, " ", stat.ios[STAT_DISCARD]);
		seq_put_decimal_ull(seqf, " ", stat.merges[STAT_DISCARD]);
		seq_put_decimal_ull(seqf, " ", stat.sectors[STAT_DISCARD]);
		seq_put_decimal_ull(seqf, " ",
			(unsigned int)div_u64(stat.nsecs[STAT_DISCARD],
					      NSEC_PER_MSEC));
		seq_put_decimal_ull(seqf, " ", stat.ios[STAT_FLUSH]);
		seq_put_decimal_ull(seqf, " ",
			(unsigned int)div_u64(stat.nsecs[STAT_FLUSH],
					      NSEC_PER_MSEC));
		seq_putc(seqf, '\n');
	}
	rcu_read_unlock();

	return 0;
}

static const struct seq_operations diskstats_op = {
	.start	= disk_seqf_start,
	.next	= disk_seqf_next,
	.stop	= disk_seqf_stop,
	.show	= diskstats_show
};

static int __init proc_genhd_init(void)
{
	proc_create_seq("diskstats", 0, NULL, &diskstats_op);
	proc_create_seq("partitions", 0, NULL, &partitions_op);
	return 0;
}
module_init(proc_genhd_init);
#endif /* CONFIG_PROC_FS */

dev_t part_devt(struct gendisk *disk, u8 partno)
{
	struct block_device *part;
	dev_t devt = 0;

	rcu_read_lock();
	part = xa_load(&disk->part_tbl, partno);
	if (part)
		devt = part->bd_dev;
	rcu_read_unlock();

	return devt;
}

struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
		struct lock_class_key *lkclass)
{
	struct gendisk *disk;

	disk = kzalloc_node(sizeof(struct gendisk), GFP_KERNEL, node_id);
	if (!disk)
		return NULL;

	if (bioset_init(&disk->bio_split, BIO_POOL_SIZE, 0, 0))
		goto out_free_disk;

	disk->bdi = bdi_alloc(node_id);
	if (!disk->bdi)
		goto out_free_bioset;

	/* bdev_alloc() might need the queue, set before the first call */
	disk->queue = q;

	disk->part0 = bdev_alloc(disk, 0);
	if (!disk->part0)
		goto out_free_bdi;
	disk->node_id = node_id;
	mutex_init(&disk->open_mutex);
	xa_init(&disk->part_tbl);
	if (xa_insert(&disk->part_tbl, 0, disk->part0, GFP_KERNEL))
		goto out_destroy_part_tbl;
	if (blkcg_init_disk(disk))
		goto out_erase_part0;
	disk_init_zone_resources(disk);
	rand_initialize_disk(disk);
	disk_to_dev(disk)->class = &block_class;
	disk_to_dev(disk)->type = &disk_type;
	device_initialize(disk_to_dev(disk));
	inc_diskseq(disk);
	q->disk = disk;
	lockdep_init_map(&disk->lockdep_map, "(bio completion)", lkclass, 0);
#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
	INIT_LIST_HEAD(&disk->slave_bdevs);
#endif
	return disk;

out_erase_part0:
	xa_erase(&disk->part_tbl, 0);
out_destroy_part_tbl:
	xa_destroy(&disk->part_tbl);
	disk->part0->bd_disk = NULL;
	bdev_drop(disk->part0);
out_free_bdi:
	bdi_put(disk->bdi);
out_free_bioset:
	bioset_exit(&disk->bio_split);
out_free_disk:
	kfree(disk);
	return NULL;
}
struct gendisk *__blk_alloc_disk(struct queue_limits *lim, int node,
		struct lock_class_key *lkclass)
{
	struct queue_limits default_lim = { };
	struct request_queue *q;
	struct gendisk *disk;

	q = blk_alloc_queue(lim ? lim : &default_lim, node);
	if (IS_ERR(q))
		return ERR_CAST(q);

	disk = __alloc_disk_node(q, node, lkclass);
	if (!disk) {
		blk_put_queue(q);
		return ERR_PTR(-ENOMEM);
	}
	set_bit(GD_OWNS_QUEUE, &disk->state);
	return disk;
}
EXPORT_SYMBOL(__blk_alloc_disk);

/**
 * put_disk - decrements the gendisk refcount
 * @disk: the struct gendisk to decrement the refcount for
 *
 * This decrements the refcount for the struct gendisk. When this reaches 0
 * we'll have disk_release() called.
 *
 * Note: for blk-mq disk put_disk must be called before freeing the tag_set
 * when handling probe errors (that is before add_disk() is called).
 *
 * Context: Any context, but the last reference must not be dropped from
 * atomic context.
 */
void put_disk(struct gendisk *disk)
{
	if (disk)
		put_device(disk_to_dev(disk));
}
EXPORT_SYMBOL(put_disk);

static void set_disk_ro_uevent(struct gendisk *gd, int ro)
{
	char event[] = "DISK_RO=1";
	char *envp[] = { event, NULL };

	if (!ro)
		event[8] = '0';
	kobject_uevent_env(&disk_to_dev(gd)->kobj, KOBJ_CHANGE, envp);
}

/**
 * set_disk_ro - set a gendisk read-only
 * @disk: gendisk to operate on
 * @read_only: %true to set the disk read-only, %false to set the disk
 *	       read/write
 *
 * This function is used to indicate whether a given disk device should have
 * its read-only flag set. set_disk_ro() is typically used by device drivers
 * to indicate whether the underlying physical device is write-protected.
 */
void set_disk_ro(struct gendisk *disk, bool read_only)
{
	if (read_only) {
		if (test_and_set_bit(GD_READ_ONLY, &disk->state))
			return;
	} else {
		if (!test_and_clear_bit(GD_READ_ONLY, &disk->state))
			return;
	}
	set_disk_ro_uevent(disk, read_only);
}
EXPORT_SYMBOL(set_disk_ro);

void inc_diskseq(struct gendisk *disk)
{
	disk->diskseq = atomic64_inc_return(&diskseq);
}
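/*
 * [Editor's illustration, not part of genhd.c] The put_disk() ordering note
 * above, applied to a hypothetical blk-mq driver's probe-error path: the
 * gendisk reference must be dropped before the tag_set it points into is
 * freed.
 */
static void my_probe_error_unwind(struct gendisk *disk,
				  struct blk_mq_tag_set *set)
{
	put_disk(disk);			/* before freeing the tag_set */
	blk_mq_free_tag_set(set);
}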
// SPDX-License-Identifier: GPL-2.0-only
/*
 * vivid-radio-tx.c - radio transmitter support functions.
 *
 * Copyright 2014 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
 */

#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/sched/signal.h>
#include <linux/delay.h>
#include <linux/videodev2.h>
#include <linux/v4l2-dv-timings.h>
#include <media/v4l2-common.h>
#include <media/v4l2-event.h>
#include <media/v4l2-dv-timings.h>

#include "vivid-core.h"
#include "vivid-ctrls.h"
#include "vivid-radio-common.h"
#include "vivid-radio-tx.h"

ssize_t vivid_radio_tx_write(struct file *file, const char __user *buf,
			     size_t size, loff_t *offset)
{
	struct vivid_dev *dev = video_drvdata(file);
	struct v4l2_rds_data *data = dev->rds_gen.data;
	ktime_t timestamp;
	unsigned blk;
	int i;

	if (dev->radio_tx_rds_controls)
		return -EINVAL;

	if (size < sizeof(*data))
		return -EINVAL;
	size = sizeof(*data) * (size / sizeof(*data));

	if (mutex_lock_interruptible(&dev->mutex))
		return -ERESTARTSYS;
	if (dev->radio_tx_rds_owner &&
	    file->private_data != dev->radio_tx_rds_owner) {
		mutex_unlock(&dev->mutex);
		return -EBUSY;
	}
	dev->radio_tx_rds_owner = file->private_data;

retry:
	timestamp = ktime_sub(ktime_get(), dev->radio_rds_init_time);
	blk = ktime_divns(timestamp, VIVID_RDS_NSEC_PER_BLK);
	if (blk - VIVID_RDS_GEN_BLOCKS >= dev->radio_tx_rds_last_block)
		dev->radio_tx_rds_last_block = blk - VIVID_RDS_GEN_BLOCKS + 1;

	/*
	 * No data is available if there hasn't been time to get new data,
	 * or if the RDS receiver has been disabled, or if we use the data
	 * from the RDS transmitter and that RDS transmitter has been disabled,
	 * or if the signal quality is too weak.
	 */
	if (blk == dev->radio_tx_rds_last_block ||
	    !(dev->radio_tx_subchans & V4L2_TUNER_SUB_RDS)) {
		mutex_unlock(&dev->mutex);
		if (file->f_flags & O_NONBLOCK)
			return -EWOULDBLOCK;
		if (msleep_interruptible(20) && signal_pending(current))
			return -EINTR;
		if (mutex_lock_interruptible(&dev->mutex))
			return -ERESTARTSYS;
		goto retry;
	}

	for (i = 0; i < size && blk > dev->radio_tx_rds_last_block;
			dev->radio_tx_rds_last_block++) {
		unsigned data_blk = dev->radio_tx_rds_last_block %
						VIVID_RDS_GEN_BLOCKS;
		struct v4l2_rds_data rds;

		if (copy_from_user(&rds, buf + i, sizeof(rds))) {
			i = -EFAULT;
			break;
		}
		i += sizeof(rds);
		if (!dev->radio_rds_loop)
			continue;
		if ((rds.block & V4L2_RDS_BLOCK_MSK) == V4L2_RDS_BLOCK_INVALID ||
		    (rds.block & V4L2_RDS_BLOCK_ERROR))
			continue;
		rds.block &= V4L2_RDS_BLOCK_MSK;
		data[data_blk] = rds;
	}
	mutex_unlock(&dev->mutex);
	return i;
}

__poll_t vivid_radio_tx_poll(struct file *file, struct poll_table_struct *wait)
{
	return EPOLLOUT | EPOLLWRNORM | v4l2_ctrl_poll(file, wait);
}
int vidioc_g_modulator(struct file *file, void *fh, struct v4l2_modulator *a)
{
	struct vivid_dev *dev = video_drvdata(file);

	if (a->index > 0)
		return -EINVAL;

	strscpy(a->name, "AM/FM/SW Transmitter", sizeof(a->name));
	a->capability = V4L2_TUNER_CAP_LOW | V4L2_TUNER_CAP_STEREO |
			V4L2_TUNER_CAP_FREQ_BANDS | V4L2_TUNER_CAP_RDS |
			(dev->radio_tx_rds_controls ?
				V4L2_TUNER_CAP_RDS_CONTROLS :
				V4L2_TUNER_CAP_RDS_BLOCK_IO);
	a->rangelow = AM_FREQ_RANGE_LOW;
	a->rangehigh = FM_FREQ_RANGE_HIGH;
	a->txsubchans = dev->radio_tx_subchans;
	return 0;
}

int vidioc_s_modulator(struct file *file, void *fh,
		       const struct v4l2_modulator *a)
{
	struct vivid_dev *dev = video_drvdata(file);

	if (a->index)
		return -EINVAL;
	if (a->txsubchans & ~0x13)
		return -EINVAL;
	dev->radio_tx_subchans = a->txsubchans;
	return 0;
}
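/*
 * [Editor's illustration, userspace, not part of vivid-radio-tx.c] Feeding
 * RDS block I/O to the transmitter via write(), matching the size checks in
 * vivid_radio_tx_write() above.  /dev/radio0 is an assumed node; real code
 * would first confirm the modulator reports V4L2_TUNER_CAP_RDS_BLOCK_IO.
 */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <linux/videodev2.h>

int send_rds_group(int fd)
{
	struct v4l2_rds_data blks[4];	/* one RDS group = 4 blocks */
	ssize_t n;

	memset(blks, 0, sizeof(blks));
	blks[0].msb = 0x12;		/* example PI code */
	blks[0].lsb = 0x34;
	blks[0].block = V4L2_RDS_BLOCK_A;
	blks[1].block = V4L2_RDS_BLOCK_B;
	blks[2].block = V4L2_RDS_BLOCK_C;
	blks[3].block = V4L2_RDS_BLOCK_D;

	/* Writes must be a multiple of sizeof(struct v4l2_rds_data). */
	n = write(fd, blks, sizeof(blks));
	return n == sizeof(blks) ? 0 : -1;
}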
// SPDX-License-Identifier: GPL-2.0-or-later
/* Generic associative array implementation.
 *
 * See Documentation/core-api/assoc_array.rst for information.
 *
 * Copyright (C) 2013 Red Hat, Inc. All Rights Reserved.
 * Written by David Howells (dhowells@redhat.com)
 */
//#define DEBUG
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/err.h>
#include <linux/assoc_array_priv.h>

/*
 * Iterate over an associative array.  The caller must hold the RCU read lock
 * or better.
 */
static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
				       const struct assoc_array_ptr *stop,
				       int (*iterator)(const void *leaf,
						       void *iterator_data),
				       void *iterator_data)
{
	const struct assoc_array_shortcut *shortcut;
	const struct assoc_array_node *node;
	const struct assoc_array_ptr *cursor, *ptr, *parent;
	unsigned long has_meta;
	int slot, ret;

	cursor = root;

begin_node:
	if (assoc_array_ptr_is_shortcut(cursor)) {
		/* Descend through a shortcut */
		shortcut = assoc_array_ptr_to_shortcut(cursor);
		cursor = READ_ONCE(shortcut->next_node); /* Address dependency. */
	}

	node = assoc_array_ptr_to_node(cursor);
	slot = 0;

	/* We perform two passes of each node.
	 *
	 * The first pass does all the leaves in this node.  This means we
	 * don't miss any leaves if the node is split up by insertion whilst
	 * we're iterating over the branches rooted here (we may, however, see
	 * some leaves twice).
	 */
	has_meta = 0;
	for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
		has_meta |= (unsigned long)ptr;
		if (ptr && assoc_array_ptr_is_leaf(ptr)) {
			/* We need a barrier between the read of the pointer,
			 * which is supplied by the above READ_ONCE().
			 */
			/* Invoke the callback */
			ret = iterator(assoc_array_ptr_to_leaf(ptr),
				       iterator_data);
			if (ret)
				return ret;
		}
	}

	/* The second pass attends to all the metadata pointers.  If we follow
	 * one of these we may find that we don't come back here, but rather go
	 * back to a replacement node with the leaves in a different layout.
	 *
	 * We are guaranteed to make progress, however, as the slot number for
	 * a particular portion of the key space cannot change - and we
	 * continue at the back pointer + 1.
	 */
	if (!(has_meta & ASSOC_ARRAY_PTR_META_TYPE))
		goto finished_node;

	slot = 0;
continue_node:
	node = assoc_array_ptr_to_node(cursor);
	for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
		if (assoc_array_ptr_is_meta(ptr)) {
			cursor = ptr;
			goto begin_node;
		}
	}

finished_node:
	/* Move up to the parent (may need to skip back over a shortcut) */
	parent = READ_ONCE(node->back_pointer); /* Address dependency. */
	slot = node->parent_slot;
	if (parent == stop)
		return 0;

	if (assoc_array_ptr_is_shortcut(parent)) {
		shortcut = assoc_array_ptr_to_shortcut(parent);
		cursor = parent;
		parent = READ_ONCE(shortcut->back_pointer); /* Address dependency. */
		slot = shortcut->parent_slot;
		if (parent == stop)
			return 0;
	}

	/* Ascend to next slot in parent node */
	cursor = parent;
	slot++;
	goto continue_node;
}
/**
 * assoc_array_iterate - Pass all objects in the array to a callback
 * @array: The array to iterate over.
 * @iterator: The callback function.
 * @iterator_data: Private data for the callback function.
 *
 * Iterate over all the objects in an associative array.  Each one will be
 * presented to the iterator function.
 *
 * If the array is being modified concurrently with the iteration then it is
 * possible that some objects in the array will be passed to the iterator
 * callback more than once - though every object should be passed at least
 * once.  If this is undesirable then the caller must lock against modification
 * for the duration of this function.
 *
 * The function will return 0 if no objects were in the array or else it will
 * return the result of the last iterator function called.  Iteration stops
 * immediately if any call to the iteration function results in a non-zero
 * return.
 *
 * The caller should hold the RCU read lock or better if concurrent
 * modification is possible.
 */
int assoc_array_iterate(const struct assoc_array *array,
			int (*iterator)(const void *object,
					void *iterator_data),
			void *iterator_data)
{
	struct assoc_array_ptr *root = READ_ONCE(array->root); /* Address dependency. */

	if (!root)
		return 0;
	return assoc_array_subtree_iterate(root, NULL, iterator, iterator_data);
}
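/*
 * [Editor's illustration, not part of assoc_array.c] A minimal
 * assoc_array_iterate() callback that counts the objects in an array, per
 * the contract documented above (a non-zero return stops iteration, and
 * duplicates are possible under concurrent modification).
 */
static int my_count_one(const void *leaf, void *iterator_data)
{
	size_t *count = iterator_data;

	(*count)++;
	return 0;		/* keep going */
}

static size_t my_count_objects(const struct assoc_array *array)
{
	size_t count = 0;

	rcu_read_lock();	/* caller contract: RCU read lock or better */
	assoc_array_iterate(array, my_count_one, &count);
	rcu_read_unlock();
	return count;
}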
enum assoc_array_walk_status {
	assoc_array_walk_tree_empty,
	assoc_array_walk_found_terminal_node,
	assoc_array_walk_found_wrong_shortcut,
};

struct assoc_array_walk_result {
	struct {
		struct assoc_array_node *node;	/* Node in which leaf might be found */
		int		level;
		int		slot;
	} terminal_node;
	struct {
		struct assoc_array_shortcut *shortcut;
		int		level;
		int		sc_level;
		unsigned long	sc_segments;
		unsigned long	dissimilarity;
	} wrong_shortcut;
};

/*
 * Navigate through the internal tree looking for the closest node to the key.
 */
static enum assoc_array_walk_status
assoc_array_walk(const struct assoc_array *array,
		 const struct assoc_array_ops *ops,
		 const void *index_key,
		 struct assoc_array_walk_result *result)
{
	struct assoc_array_shortcut *shortcut;
	struct assoc_array_node *node;
	struct assoc_array_ptr *cursor, *ptr;
	unsigned long sc_segments, dissimilarity;
	unsigned long segments;
	int level, sc_level, next_sc_level;
	int slot;

	pr_devel("-->%s()\n", __func__);

	cursor = READ_ONCE(array->root);  /* Address dependency. */
	if (!cursor)
		return assoc_array_walk_tree_empty;

	level = 0;

	/* Use segments from the key for the new leaf to navigate through the
	 * internal tree, skipping through nodes and shortcuts that are on
	 * route to the destination.  Eventually we'll come to a slot that is
	 * either empty or contains a leaf at which point we've found a node in
	 * which the leaf we're looking for might be found or into which it
	 * should be inserted.
	 */
jumped:
	segments = ops->get_key_chunk(index_key, level);
	pr_devel("segments[%d]: %lx\n", level, segments);

	if (assoc_array_ptr_is_shortcut(cursor))
		goto follow_shortcut;

consider_node:
	node = assoc_array_ptr_to_node(cursor);
	slot = segments >> (level & ASSOC_ARRAY_KEY_CHUNK_MASK);
	slot &= ASSOC_ARRAY_FAN_MASK;
	ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */

	pr_devel("consider slot %x [ix=%d type=%lu]\n",
		 slot, level, (unsigned long)ptr & 3);

	if (!assoc_array_ptr_is_meta(ptr)) {
		/* The node doesn't have a node/shortcut pointer in the slot
		 * corresponding to the index key that we have to follow.
		 */
		result->terminal_node.node = node;
		result->terminal_node.level = level;
		result->terminal_node.slot = slot;
		pr_devel("<--%s() = terminal_node\n", __func__);
		return assoc_array_walk_found_terminal_node;
	}

	if (assoc_array_ptr_is_node(ptr)) {
		/* There is a pointer to a node in the slot corresponding to
		 * this index key segment, so we need to follow it.
		 */
		cursor = ptr;
		level += ASSOC_ARRAY_LEVEL_STEP;
		if ((level & ASSOC_ARRAY_KEY_CHUNK_MASK) != 0)
			goto consider_node;
		goto jumped;
	}

	/* There is a shortcut in the slot corresponding to the index key
	 * segment.  We follow the shortcut if its partial index key matches
	 * this leaf's.  Otherwise we need to split the shortcut.
	 */
	cursor = ptr;
follow_shortcut:
	shortcut = assoc_array_ptr_to_shortcut(cursor);
	pr_devel("shortcut to %d\n", shortcut->skip_to_level);
	sc_level = level + ASSOC_ARRAY_LEVEL_STEP;
	BUG_ON(sc_level > shortcut->skip_to_level);

	do {
		/* Check the leaf against the shortcut's index key a word at a
		 * time, trimming the final word (the shortcut stores the index
		 * key completely from the root to the shortcut's target).
		 */
		if ((sc_level & ASSOC_ARRAY_KEY_CHUNK_MASK) == 0)
			segments = ops->get_key_chunk(index_key, sc_level);

		sc_segments = shortcut->index_key[sc_level >> ASSOC_ARRAY_KEY_CHUNK_SHIFT];
		dissimilarity = segments ^ sc_segments;

		if (round_up(sc_level, ASSOC_ARRAY_KEY_CHUNK_SIZE) > shortcut->skip_to_level) {
			/* Trim segments that are beyond the shortcut */
			int shift = shortcut->skip_to_level & ASSOC_ARRAY_KEY_CHUNK_MASK;
			dissimilarity &= ~(ULONG_MAX << shift);
			next_sc_level = shortcut->skip_to_level;
		} else {
			next_sc_level = sc_level + ASSOC_ARRAY_KEY_CHUNK_SIZE;
			next_sc_level = round_down(next_sc_level, ASSOC_ARRAY_KEY_CHUNK_SIZE);
		}

		if (dissimilarity != 0) {
			/* This shortcut points elsewhere */
			result->wrong_shortcut.shortcut = shortcut;
			result->wrong_shortcut.level = level;
			result->wrong_shortcut.sc_level = sc_level;
			result->wrong_shortcut.sc_segments = sc_segments;
			result->wrong_shortcut.dissimilarity = dissimilarity;
			return assoc_array_walk_found_wrong_shortcut;
		}

		sc_level = next_sc_level;
	} while (sc_level < shortcut->skip_to_level);

	/* The shortcut matches the leaf's index to this point. */
	cursor = READ_ONCE(shortcut->next_node); /* Address dependency. */
	if (((level ^ sc_level) & ~ASSOC_ARRAY_KEY_CHUNK_MASK) != 0) {
		level = sc_level;
		goto jumped;
	} else {
		level = sc_level;
		goto consider_node;
	}
}

/**
 * assoc_array_find - Find an object by index key
 * @array: The associative array to search.
 * @ops: The operations to use.
 * @index_key: The key to the object.
 *
 * Find an object in an associative array by walking through the internal tree
 * to the node that should contain the object and then searching the leaves
 * there.  NULL is returned if the requested object was not found in the array.
 *
 * The caller must hold the RCU read lock or better.
 */
void *assoc_array_find(const struct assoc_array *array,
		       const struct assoc_array_ops *ops,
		       const void *index_key)
{
	struct assoc_array_walk_result result;
	const struct assoc_array_node *node;
	const struct assoc_array_ptr *ptr;
	const void *leaf;
	int slot;

	if (assoc_array_walk(array, ops, index_key, &result) !=
	    assoc_array_walk_found_terminal_node)
		return NULL;

	node = result.terminal_node.node;

	/* If the target key is available to us, it has to be pointed to by
	 * the terminal node.
	 */
	for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
		if (ptr && assoc_array_ptr_is_leaf(ptr)) {
			/* We need a barrier between the read of the pointer
			 * and dereferencing the pointer - but only if we are
			 * actually going to dereference it.
			 */
			leaf = assoc_array_ptr_to_leaf(ptr);
			if (ops->compare_object(leaf, index_key))
				return (void *)leaf;
		}
	}

	return NULL;
}
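/*
 * [Editor's illustration, not part of assoc_array.c] A minimal
 * assoc_array_ops for objects keyed by a single unsigned long, enough for
 * assoc_array_find() above.  The my_obj layout is hypothetical; a real user
 * would also fill in get_object_key_chunk/diff_objects/free_object for
 * insertion, editing and destruction.
 */
struct my_obj {
	unsigned long key;
	/* payload ... */
};

static unsigned long my_get_key_chunk(const void *index_key, int level)
{
	/* Single-word keys: levels beyond the first word read as zero. */
	return level == 0 ? *(const unsigned long *)index_key : 0;
}

static bool my_compare_object(const void *object, const void *index_key)
{
	return ((const struct my_obj *)object)->key ==
		*(const unsigned long *)index_key;
}

static const struct assoc_array_ops my_ops = {
	.get_key_chunk	= my_get_key_chunk,
	.compare_object	= my_compare_object,
};

/* Lookup; the caller must hold the RCU read lock or better. */
static struct my_obj *my_lookup(const struct assoc_array *array,
				unsigned long key)
{
	return assoc_array_find(array, &my_ops, &key);
}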
/**
 * assoc_array_find - Find an object by index key
 * @array: The associative array to search.
 * @ops: The operations to use.
 * @index_key: The key to the object.
 *
 * Find an object in an associative array by walking through the internal tree
 * to the node that should contain the object and then searching the leaves
 * there.  NULL is returned if the requested object was not found in the array.
 *
 * The caller must hold the RCU read lock or better.
 */
void *assoc_array_find(const struct assoc_array *array,
		       const struct assoc_array_ops *ops,
		       const void *index_key)
{
	struct assoc_array_walk_result result;
	const struct assoc_array_node *node;
	const struct assoc_array_ptr *ptr;
	const void *leaf;
	int slot;

	if (assoc_array_walk(array, ops, index_key, &result) !=
	    assoc_array_walk_found_terminal_node)
		return NULL;

	node = result.terminal_node.node;

	/* If the target key is available to us, it has to be pointed to by
	 * the terminal node.
	 */
	for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
		if (ptr && assoc_array_ptr_is_leaf(ptr)) {
			/* We need a barrier between the read of the pointer
			 * and dereferencing the pointer - but only if we are
			 * actually going to dereference it.
			 */
			leaf = assoc_array_ptr_to_leaf(ptr);
			if (ops->compare_object(leaf, index_key))
				return (void *)leaf;
		}
	}

	return NULL;
}

/*
 * Destructively iterate over an associative array.  The caller must prevent
 * other simultaneous accesses.
 */
static void assoc_array_destroy_subtree(struct assoc_array_ptr *root,
					const struct assoc_array_ops *ops)
{
	struct assoc_array_shortcut *shortcut;
	struct assoc_array_node *node;
	struct assoc_array_ptr *cursor, *parent = NULL;
	int slot = -1;

	pr_devel("-->%s()\n", __func__);

	cursor = root;
	if (!cursor) {
		pr_devel("empty\n");
		return;
	}

move_to_meta:
	if (assoc_array_ptr_is_shortcut(cursor)) {
		/* Descend through a shortcut */
		pr_devel("[%d] shortcut\n", slot);
		BUG_ON(!assoc_array_ptr_is_shortcut(cursor));
		shortcut = assoc_array_ptr_to_shortcut(cursor);
		BUG_ON(shortcut->back_pointer != parent);
		BUG_ON(slot != -1 && shortcut->parent_slot != slot);
		parent = cursor;
		cursor = shortcut->next_node;
		slot = -1;
		BUG_ON(!assoc_array_ptr_is_node(cursor));
	}

	pr_devel("[%d] node\n", slot);
	node = assoc_array_ptr_to_node(cursor);
	BUG_ON(node->back_pointer != parent);
	BUG_ON(slot != -1 && node->parent_slot != slot);
	slot = 0;

continue_node:
	pr_devel("Node %p [back=%p]\n", node, node->back_pointer);
	for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		struct assoc_array_ptr *ptr = node->slots[slot];
		if (!ptr)
			continue;
		if (assoc_array_ptr_is_meta(ptr)) {
			parent = cursor;
			cursor = ptr;
			goto move_to_meta;
		}

		if (ops) {
			pr_devel("[%d] free leaf\n", slot);
			ops->free_object(assoc_array_ptr_to_leaf(ptr));
		}
	}

	parent = node->back_pointer;
	slot = node->parent_slot;
	pr_devel("free node\n");
	kfree(node);
	if (!parent)
		return; /* Done */

	/* Move back up to the parent (may need to free a shortcut on
	 * the way up)
	 */
	if (assoc_array_ptr_is_shortcut(parent)) {
		shortcut = assoc_array_ptr_to_shortcut(parent);
		BUG_ON(shortcut->next_node != cursor);
		cursor = parent;
		parent = shortcut->back_pointer;
		slot = shortcut->parent_slot;
		pr_devel("free shortcut\n");
		kfree(shortcut);
		if (!parent)
			return;

		BUG_ON(!assoc_array_ptr_is_node(parent));
	}

	/* Ascend to next slot in parent node */
	pr_devel("ascend to %p[%d]\n", parent, slot);
	cursor = parent;
	node = assoc_array_ptr_to_node(cursor);
	slot++;
	goto continue_node;
}

/**
 * assoc_array_destroy - Destroy an associative array
 * @array: The array to destroy.
 * @ops: The operations to use.
 *
 * Discard all metadata and free all objects in an associative array.  The
 * array will be empty and ready to use again upon completion.  This function
 * cannot fail.
 *
 * The caller must prevent all other accesses whilst this takes place as no
 * attempt is made to adjust pointers gracefully to permit RCU readlock-holding
 * accesses to continue.  On the other hand, no memory allocation is required.
 */
void assoc_array_destroy(struct assoc_array *array,
			 const struct assoc_array_ops *ops)
{
	assoc_array_destroy_subtree(array->root, ops);
	array->root = NULL;
}
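/*
 * Illustrative sketch (not part of the original file): a lookup as a caller
 * might perform it.  The ops table and key are hypothetical caller-side
 * details; the RCU read lock satisfies assoc_array_find()'s requirement.
 */
#if 0
void *example_lookup(struct assoc_array *array,
		     const struct assoc_array_ops *my_ops, const void *key)
{
	void *object;

	rcu_read_lock();
	object = assoc_array_find(array, my_ops, key);
	/* The caller must take its own reference (or otherwise pin the
	 * object) before dropping the read lock.
	 */
	rcu_read_unlock();
	return object;
}
#endif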
/*
 * Handle insertion into an empty tree.
 */
static bool assoc_array_insert_in_empty_tree(struct assoc_array_edit *edit)
{
	struct assoc_array_node *new_n0;

	pr_devel("-->%s()\n", __func__);

	new_n0 = kzalloc(sizeof(struct assoc_array_node), GFP_KERNEL);
	if (!new_n0)
		return false;

	edit->new_meta[0] = assoc_array_node_to_ptr(new_n0);
	edit->leaf_p = &new_n0->slots[0];
	edit->adjust_count_on = new_n0;
	edit->set[0].ptr = &edit->array->root;
	edit->set[0].to = assoc_array_node_to_ptr(new_n0);

	pr_devel("<--%s() = ok [no root]\n", __func__);
	return true;
}

/*
 * Handle insertion into a terminal node.
 */
static bool assoc_array_insert_into_terminal_node(struct assoc_array_edit *edit,
						  const struct assoc_array_ops *ops,
						  const void *index_key,
						  struct assoc_array_walk_result *result)
{
	struct assoc_array_shortcut *shortcut, *new_s0;
	struct assoc_array_node *node, *new_n0, *new_n1, *side;
	struct assoc_array_ptr *ptr;
	unsigned long dissimilarity, base_seg, blank;
	size_t keylen;
	bool have_meta;
	int level, diff;
	int slot, next_slot, free_slot, i, j;

	node	= result->terminal_node.node;
	level	= result->terminal_node.level;
	edit->segment_cache[ASSOC_ARRAY_FAN_OUT] = result->terminal_node.slot;

	pr_devel("-->%s()\n", __func__);

	/* We arrived at a node which doesn't have an onward node or shortcut
	 * pointer that we have to follow.  This means that (a) the leaf we
	 * want must go here (either by insertion or replacement) or (b) we
	 * need to split this node and insert in one of the fragments.
	 */
	free_slot = -1;

	/* Firstly, we have to check the leaves in this node to see if there's
	 * a matching one we should replace in place.
	 */
	for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) {
		ptr = node->slots[i];
		if (!ptr) {
			free_slot = i;
			continue;
		}
		if (assoc_array_ptr_is_leaf(ptr) &&
		    ops->compare_object(assoc_array_ptr_to_leaf(ptr),
					index_key)) {
			pr_devel("replace in slot %d\n", i);
			edit->leaf_p = &node->slots[i];
			edit->dead_leaf = node->slots[i];
			pr_devel("<--%s() = ok [replace]\n", __func__);
			return true;
		}
	}

	/* If there is a free slot in this node then we can just insert the
	 * leaf here.
	 */
	if (free_slot >= 0) {
		pr_devel("insert in free slot %d\n", free_slot);
		edit->leaf_p = &node->slots[free_slot];
		edit->adjust_count_on = node;
		pr_devel("<--%s() = ok [insert]\n", __func__);
		return true;
	}

	/* The node has no spare slots - so we're either going to have to split
	 * it or insert another node before it.
	 *
	 * Whatever, we're going to need at least two new nodes - so allocate
	 * those now.  We may also need a new shortcut, but we deal with that
	 * when we need it.
	 */
	new_n0 = kzalloc(sizeof(struct assoc_array_node), GFP_KERNEL);
	if (!new_n0)
		return false;
	edit->new_meta[0] = assoc_array_node_to_ptr(new_n0);
	new_n1 = kzalloc(sizeof(struct assoc_array_node), GFP_KERNEL);
	if (!new_n1)
		return false;
	edit->new_meta[1] = assoc_array_node_to_ptr(new_n1);

	/* We need to find out how similar the leaves are. */
*/ pr_devel("no spare slots\n"); have_meta = false; for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) { ptr = node->slots[i]; if (assoc_array_ptr_is_meta(ptr)) { edit->segment_cache[i] = 0xff; have_meta = true; continue; } base_seg = ops->get_object_key_chunk( assoc_array_ptr_to_leaf(ptr), level); base_seg >>= level & ASSOC_ARRAY_KEY_CHUNK_MASK; edit->segment_cache[i] = base_seg & ASSOC_ARRAY_FAN_MASK; } if (have_meta) { pr_devel("have meta\n"); goto split_node; } /* The node contains only leaves */ dissimilarity = 0; base_seg = edit->segment_cache[0]; for (i = 1; i < ASSOC_ARRAY_FAN_OUT; i++) dissimilarity |= edit->segment_cache[i] ^ base_seg; pr_devel("only leaves; dissimilarity=%lx\n", dissimilarity); if ((dissimilarity & ASSOC_ARRAY_FAN_MASK) == 0) { /* The old leaves all cluster in the same slot. We will need * to insert a shortcut if the new node wants to cluster with them. */ if ((edit->segment_cache[ASSOC_ARRAY_FAN_OUT] ^ base_seg) == 0) goto all_leaves_cluster_together; /* Otherwise all the old leaves cluster in the same slot, but * the new leaf wants to go into a different slot - so we * create a new node (n0) to hold the new leaf and a pointer to * a new node (n1) holding all the old leaves. * * This can be done by falling through to the node splitting * path. */ pr_devel("present leaves cluster but not new leaf\n"); } split_node: pr_devel("split node\n"); /* We need to split the current node. The node must contain anything * from a single leaf (in the one leaf case, this leaf will cluster * with the new leaf) and the rest meta-pointers, to all leaves, some * of which may cluster. * * It won't contain the case in which all the current leaves plus the * new leaves want to cluster in the same slot. * * We need to expel at least two leaves out of a set consisting of the * leaves in the node and the new leaf. The current meta pointers can * just be copied as they shouldn't cluster with any of the leaves. * * We need a new node (n0) to replace the current one and a new node to * take the expelled nodes (n1). */ edit->set[0].to = assoc_array_node_to_ptr(new_n0); new_n0->back_pointer = node->back_pointer; new_n0->parent_slot = node->parent_slot; new_n1->back_pointer = assoc_array_node_to_ptr(new_n0); new_n1->parent_slot = -1; /* Need to calculate this */ do_split_node: pr_devel("do_split_node\n"); new_n0->nr_leaves_on_branch = node->nr_leaves_on_branch; new_n1->nr_leaves_on_branch = 0; /* Begin by finding two matching leaves. There have to be at least two * that match - even if there are meta pointers - because any leaf that * would match a slot with a meta pointer in it must be somewhere * behind that meta pointer and cannot be here. Further, given N * remaining leaf slots, we now have N+1 leaves to go in them. 
	for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) {
		slot = edit->segment_cache[i];
		if (slot != 0xff)
			for (j = i + 1; j < ASSOC_ARRAY_FAN_OUT + 1; j++)
				if (edit->segment_cache[j] == slot)
					goto found_slot_for_multiple_occupancy;
	}
found_slot_for_multiple_occupancy:
	pr_devel("same slot: %x %x [%02x]\n", i, j, slot);
	BUG_ON(i >= ASSOC_ARRAY_FAN_OUT);
	BUG_ON(j >= ASSOC_ARRAY_FAN_OUT + 1);
	BUG_ON(slot >= ASSOC_ARRAY_FAN_OUT);

	new_n1->parent_slot = slot;

	/* Metadata pointers cannot change slot */
	for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++)
		if (assoc_array_ptr_is_meta(node->slots[i]))
			new_n0->slots[i] = node->slots[i];
		else
			new_n0->slots[i] = NULL;
	BUG_ON(new_n0->slots[slot] != NULL);
	new_n0->slots[slot] = assoc_array_node_to_ptr(new_n1);

	/* Filter the leaf pointers between the new nodes */
	free_slot = -1;
	next_slot = 0;
	for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) {
		if (assoc_array_ptr_is_meta(node->slots[i]))
			continue;
		if (edit->segment_cache[i] == slot) {
			new_n1->slots[next_slot++] = node->slots[i];
			new_n1->nr_leaves_on_branch++;
		} else {
			do {
				free_slot++;
			} while (new_n0->slots[free_slot] != NULL);
			new_n0->slots[free_slot] = node->slots[i];
		}
	}

	pr_devel("filtered: f=%x n=%x\n", free_slot, next_slot);

	if (edit->segment_cache[ASSOC_ARRAY_FAN_OUT] != slot) {
		do {
			free_slot++;
		} while (new_n0->slots[free_slot] != NULL);
		edit->leaf_p = &new_n0->slots[free_slot];
		edit->adjust_count_on = new_n0;
	} else {
		edit->leaf_p = &new_n1->slots[next_slot++];
		edit->adjust_count_on = new_n1;
	}

	BUG_ON(next_slot <= 1);

	edit->set_backpointers_to = assoc_array_node_to_ptr(new_n0);
	for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) {
		if (edit->segment_cache[i] == 0xff) {
			ptr = node->slots[i];
			BUG_ON(assoc_array_ptr_is_leaf(ptr));
			if (assoc_array_ptr_is_node(ptr)) {
				side = assoc_array_ptr_to_node(ptr);
				edit->set_backpointers[i] = &side->back_pointer;
			} else {
				shortcut = assoc_array_ptr_to_shortcut(ptr);
				edit->set_backpointers[i] = &shortcut->back_pointer;
			}
		}
	}

	ptr = node->back_pointer;
	if (!ptr)
		edit->set[0].ptr = &edit->array->root;
	else if (assoc_array_ptr_is_node(ptr))
		edit->set[0].ptr = &assoc_array_ptr_to_node(ptr)->slots[node->parent_slot];
	else
		edit->set[0].ptr = &assoc_array_ptr_to_shortcut(ptr)->next_node;
	edit->excised_meta[0] = assoc_array_node_to_ptr(node);
	pr_devel("<--%s() = ok [split node]\n", __func__);
	return true;

all_leaves_cluster_together:
	/* All the leaves, new and old, want to cluster together in this node
	 * in the same slot, so we have to replace this node with a shortcut to
	 * skip over the identical parts of the key and then place a pair of
	 * nodes, one inside the other, at the end of the shortcut and
	 * distribute the keys between them.
	 *
	 * Firstly we need to work out where the leaves start diverging as a
	 * bit position into their keys so that we know how big the shortcut
	 * needs to be.
	 *
	 * We only need to make a single pass of N of the N+1 leaves because if
	 * any keys differ between themselves at bit X then at least one of
	 * them must also differ with the base key at bit X or before.
	 */
*/ pr_devel("all leaves cluster together\n"); diff = INT_MAX; for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) { int x = ops->diff_objects(assoc_array_ptr_to_leaf(node->slots[i]), index_key); if (x < diff) { BUG_ON(x < 0); diff = x; } } BUG_ON(diff == INT_MAX); BUG_ON(diff < level + ASSOC_ARRAY_LEVEL_STEP); keylen = round_up(diff, ASSOC_ARRAY_KEY_CHUNK_SIZE); keylen >>= ASSOC_ARRAY_KEY_CHUNK_SHIFT; new_s0 = kzalloc(struct_size(new_s0, index_key, keylen), GFP_KERNEL); if (!new_s0) return false; edit->new_meta[2] = assoc_array_shortcut_to_ptr(new_s0); edit->set[0].to = assoc_array_shortcut_to_ptr(new_s0); new_s0->back_pointer = node->back_pointer; new_s0->parent_slot = node->parent_slot; new_s0->next_node = assoc_array_node_to_ptr(new_n0); new_n0->back_pointer = assoc_array_shortcut_to_ptr(new_s0); new_n0->parent_slot = 0; new_n1->back_pointer = assoc_array_node_to_ptr(new_n0); new_n1->parent_slot = -1; /* Need to calculate this */ new_s0->skip_to_level = level = diff & ~ASSOC_ARRAY_LEVEL_STEP_MASK; pr_devel("skip_to_level = %d [diff %d]\n", level, diff); BUG_ON(level <= 0); for (i = 0; i < keylen; i++) new_s0->index_key[i] = ops->get_key_chunk(index_key, i * ASSOC_ARRAY_KEY_CHUNK_SIZE); if (level & ASSOC_ARRAY_KEY_CHUNK_MASK) { blank = ULONG_MAX << (level & ASSOC_ARRAY_KEY_CHUNK_MASK); pr_devel("blank off [%zu] %d: %lx\n", keylen - 1, level, blank); new_s0->index_key[keylen - 1] &= ~blank; } /* This now reduces to a node splitting exercise for which we'll need * to regenerate the disparity table. */ for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) { ptr = node->slots[i]; base_seg = ops->get_object_key_chunk(assoc_array_ptr_to_leaf(ptr), level); base_seg >>= level & ASSOC_ARRAY_KEY_CHUNK_MASK; edit->segment_cache[i] = base_seg & ASSOC_ARRAY_FAN_MASK; } base_seg = ops->get_key_chunk(index_key, level); base_seg >>= level & ASSOC_ARRAY_KEY_CHUNK_MASK; edit->segment_cache[ASSOC_ARRAY_FAN_OUT] = base_seg & ASSOC_ARRAY_FAN_MASK; goto do_split_node; } /* * Handle insertion into the middle of a shortcut. */ static bool assoc_array_insert_mid_shortcut(struct assoc_array_edit *edit, const struct assoc_array_ops *ops, struct assoc_array_walk_result *result) { struct assoc_array_shortcut *shortcut, *new_s0, *new_s1; struct assoc_array_node *node, *new_n0, *side; unsigned long sc_segments, dissimilarity, blank; size_t keylen; int level, sc_level, diff; int sc_slot; shortcut = result->wrong_shortcut.shortcut; level = result->wrong_shortcut.level; sc_level = result->wrong_shortcut.sc_level; sc_segments = result->wrong_shortcut.sc_segments; dissimilarity = result->wrong_shortcut.dissimilarity; pr_devel("-->%s(ix=%d dis=%lx scix=%d)\n", __func__, level, dissimilarity, sc_level); /* We need to split a shortcut and insert a node between the two * pieces. Zero-length pieces will be dispensed with entirely. * * First of all, we need to find out in which level the first * difference was. 
	diff = __ffs(dissimilarity);
	diff &= ~ASSOC_ARRAY_LEVEL_STEP_MASK;
	diff += sc_level & ~ASSOC_ARRAY_KEY_CHUNK_MASK;
	pr_devel("diff=%d\n", diff);

	if (!shortcut->back_pointer) {
		edit->set[0].ptr = &edit->array->root;
	} else if (assoc_array_ptr_is_node(shortcut->back_pointer)) {
		node = assoc_array_ptr_to_node(shortcut->back_pointer);
		edit->set[0].ptr = &node->slots[shortcut->parent_slot];
	} else {
		BUG();
	}

	edit->excised_meta[0] = assoc_array_shortcut_to_ptr(shortcut);

	/* Create a new node now since we're going to need it anyway */
	new_n0 = kzalloc(sizeof(struct assoc_array_node), GFP_KERNEL);
	if (!new_n0)
		return false;
	edit->new_meta[0] = assoc_array_node_to_ptr(new_n0);
	edit->adjust_count_on = new_n0;

	/* Insert a new shortcut before the new node if this segment isn't of
	 * zero length - otherwise we just connect the new node directly to the
	 * parent.
	 */
	level += ASSOC_ARRAY_LEVEL_STEP;
	if (diff > level) {
		pr_devel("pre-shortcut %d...%d\n", level, diff);
		keylen = round_up(diff, ASSOC_ARRAY_KEY_CHUNK_SIZE);
		keylen >>= ASSOC_ARRAY_KEY_CHUNK_SHIFT;

		new_s0 = kzalloc(struct_size(new_s0, index_key, keylen),
				 GFP_KERNEL);
		if (!new_s0)
			return false;
		edit->new_meta[1] = assoc_array_shortcut_to_ptr(new_s0);
		edit->set[0].to = assoc_array_shortcut_to_ptr(new_s0);
		new_s0->back_pointer = shortcut->back_pointer;
		new_s0->parent_slot = shortcut->parent_slot;
		new_s0->next_node = assoc_array_node_to_ptr(new_n0);
		new_s0->skip_to_level = diff;

		new_n0->back_pointer = assoc_array_shortcut_to_ptr(new_s0);
		new_n0->parent_slot = 0;

		memcpy(new_s0->index_key, shortcut->index_key,
		       flex_array_size(new_s0, index_key, keylen));

		blank = ULONG_MAX << (diff & ASSOC_ARRAY_KEY_CHUNK_MASK);
		pr_devel("blank off [%zu] %d: %lx\n", keylen - 1, diff, blank);
		new_s0->index_key[keylen - 1] &= ~blank;
	} else {
		pr_devel("no pre-shortcut\n");
		edit->set[0].to = assoc_array_node_to_ptr(new_n0);
		new_n0->back_pointer = shortcut->back_pointer;
		new_n0->parent_slot = shortcut->parent_slot;
	}

	side = assoc_array_ptr_to_node(shortcut->next_node);
	new_n0->nr_leaves_on_branch = side->nr_leaves_on_branch;

	/* We need to know which slot in the new node is going to take a
	 * metadata pointer.
	 */
	sc_slot = sc_segments >> (diff & ASSOC_ARRAY_KEY_CHUNK_MASK);
	sc_slot &= ASSOC_ARRAY_FAN_MASK;

	pr_devel("new slot %lx >> %d -> %d\n",
		 sc_segments, diff & ASSOC_ARRAY_KEY_CHUNK_MASK, sc_slot);

	/* Determine whether we need to follow the new node with a replacement
	 * for the current shortcut.  We could in theory reuse the current
	 * shortcut if its parent slot number doesn't change - but that's a
	 * 1-in-16 chance so not worth expending the code upon.
	 */
	level = diff + ASSOC_ARRAY_LEVEL_STEP;
	if (level < shortcut->skip_to_level) {
		pr_devel("post-shortcut %d...%d\n", level, shortcut->skip_to_level);
		keylen = round_up(shortcut->skip_to_level, ASSOC_ARRAY_KEY_CHUNK_SIZE);
		keylen >>= ASSOC_ARRAY_KEY_CHUNK_SHIFT;

		new_s1 = kzalloc(struct_size(new_s1, index_key, keylen),
				 GFP_KERNEL);
		if (!new_s1)
			return false;
		edit->new_meta[2] = assoc_array_shortcut_to_ptr(new_s1);

		new_s1->back_pointer = assoc_array_node_to_ptr(new_n0);
		new_s1->parent_slot = sc_slot;
		new_s1->next_node = shortcut->next_node;
		new_s1->skip_to_level = shortcut->skip_to_level;

		new_n0->slots[sc_slot] = assoc_array_shortcut_to_ptr(new_s1);

		memcpy(new_s1->index_key, shortcut->index_key,
		       flex_array_size(new_s1, index_key, keylen));

		edit->set[1].ptr = &side->back_pointer;
		edit->set[1].to = assoc_array_shortcut_to_ptr(new_s1);
	} else {
		pr_devel("no post-shortcut\n");

		/* We don't have to replace the pointed-to node as long as we
		 * use memory barriers to make sure the parent slot number is
		 * changed before the back pointer (the parent slot number is
		 * irrelevant to the old parent shortcut).
		 */
		new_n0->slots[sc_slot] = shortcut->next_node;
		edit->set_parent_slot[0].p = &side->parent_slot;
		edit->set_parent_slot[0].to = sc_slot;
		edit->set[1].ptr = &side->back_pointer;
		edit->set[1].to = assoc_array_node_to_ptr(new_n0);
	}

	/* Install the new leaf in a spare slot in the new node. */
	if (sc_slot == 0)
		edit->leaf_p = &new_n0->slots[1];
	else
		edit->leaf_p = &new_n0->slots[0];

	pr_devel("<--%s() = ok [split shortcut]\n", __func__);
	return true;
}
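/*
 * Illustrative sketch (not part of the original file): the edit-script
 * lifecycle around the insertion API below, as described in its kernel-doc.
 * The lock and ops table are hypothetical caller-side details.
 */
#if 0
int example_insert(struct assoc_array *array,
		   const struct assoc_array_ops *my_ops,
		   const void *key, void *object, struct mutex *my_lock)
{
	struct assoc_array_edit *edit;

	mutex_lock(my_lock);
	edit = assoc_array_insert(array, my_ops, key, object);
	if (IS_ERR(edit)) {
		mutex_unlock(my_lock);
		return PTR_ERR(edit);	/* -ENOMEM; the array is unchanged. */
	}
	assoc_array_apply_edit(edit);	/* Cannot fail; commits the script. */
	mutex_unlock(my_lock);
	return 0;
}
#endif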
/**
 * assoc_array_insert - Script insertion of an object into an associative array
 * @array: The array to insert into.
 * @ops: The operations to use.
 * @index_key: The key to insert at.
 * @object: The object to insert.
 *
 * Precalculate and preallocate a script for the insertion or replacement of an
 * object in an associative array.  This results in an edit script that can
 * either be applied or cancelled.
 *
 * The function returns a pointer to an edit script or -ENOMEM.
 *
 * The caller should lock against other modifications and must continue to hold
 * the lock until assoc_array_apply_edit() has been called.
 *
 * Accesses to the tree may take place concurrently with this function,
 * provided they hold the RCU read lock.
 */
struct assoc_array_edit *assoc_array_insert(struct assoc_array *array,
					    const struct assoc_array_ops *ops,
					    const void *index_key,
					    void *object)
{
	struct assoc_array_walk_result result;
	struct assoc_array_edit *edit;

	pr_devel("-->%s()\n", __func__);

	/* The leaf pointer we're given must not have the bottom bit set as we
	 * use those for type-marking the pointer.  NULL pointers are also not
	 * allowed as they indicate an empty slot but we have to allow them
	 * here as they can be updated later.
	 */
	BUG_ON(assoc_array_ptr_is_meta(object));

	edit = kzalloc(sizeof(struct assoc_array_edit), GFP_KERNEL);
	if (!edit)
		return ERR_PTR(-ENOMEM);
	edit->array = array;
	edit->ops = ops;
	edit->leaf = assoc_array_leaf_to_ptr(object);
	edit->adjust_count_by = 1;

	switch (assoc_array_walk(array, ops, index_key, &result)) {
	case assoc_array_walk_tree_empty:
		/* Allocate a root node if there isn't one yet */
		if (!assoc_array_insert_in_empty_tree(edit))
			goto enomem;
		return edit;

	case assoc_array_walk_found_terminal_node:
		/* We found a node that doesn't have a node/shortcut pointer in
		 * the slot corresponding to the index key that we have to
		 * follow.
		 */
		if (!assoc_array_insert_into_terminal_node(edit, ops, index_key,
							   &result))
			goto enomem;
		return edit;

	case assoc_array_walk_found_wrong_shortcut:
		/* We found a shortcut that didn't match our key in a slot we
		 * needed to follow.
		 */
		if (!assoc_array_insert_mid_shortcut(edit, ops, &result))
			goto enomem;
		return edit;
	}

enomem:
	/* Clean up after an out of memory error */
	pr_devel("enomem\n");
	assoc_array_cancel_edit(edit);
	return ERR_PTR(-ENOMEM);
}

/**
 * assoc_array_insert_set_object - Set the new object pointer in an edit script
 * @edit: The edit script to modify.
 * @object: The object pointer to set.
 *
 * Change the object to be inserted in an edit script.  The object pointed to
 * by the old object is not freed.  This must be done prior to applying the
 * script.
 */
void assoc_array_insert_set_object(struct assoc_array_edit *edit, void *object)
{
	BUG_ON(!object);
	edit->leaf = assoc_array_leaf_to_ptr(object);
}

struct assoc_array_delete_collapse_context {
	struct assoc_array_node	*node;
	const void		*skip_leaf;
	int			slot;
};

/*
 * Subtree collapse to node iterator.
 */
static int assoc_array_delete_collapse_iterator(const void *leaf,
						void *iterator_data)
{
	struct assoc_array_delete_collapse_context *collapse = iterator_data;

	if (leaf == collapse->skip_leaf)
		return 0;

	BUG_ON(collapse->slot >= ASSOC_ARRAY_FAN_OUT);

	collapse->node->slots[collapse->slot++] = assoc_array_leaf_to_ptr(leaf);
	return 0;
}

/**
 * assoc_array_delete - Script deletion of an object from an associative array
 * @array: The array to search.
 * @ops: The operations to use.
 * @index_key: The key to the object.
 *
 * Precalculate and preallocate a script for the deletion of an object from an
 * associative array.  This results in an edit script that can either be
 * applied or cancelled.
 *
 * The function returns a pointer to an edit script if the object was found,
 * NULL if the object was not found or -ENOMEM.
 *
 * The caller should lock against other modifications and must continue to hold
 * the lock until assoc_array_apply_edit() has been called.
 *
 * Accesses to the tree may take place concurrently with this function,
 * provided they hold the RCU read lock.
 */
struct assoc_array_edit *assoc_array_delete(struct assoc_array *array,
					    const struct assoc_array_ops *ops,
					    const void *index_key)
{
	struct assoc_array_delete_collapse_context collapse;
	struct assoc_array_walk_result result;
	struct assoc_array_node *node, *new_n0;
	struct assoc_array_edit *edit;
	struct assoc_array_ptr *ptr;
	bool has_meta;
	int slot, i;

	pr_devel("-->%s()\n", __func__);

	edit = kzalloc(sizeof(struct assoc_array_edit), GFP_KERNEL);
	if (!edit)
		return ERR_PTR(-ENOMEM);
	edit->array = array;
	edit->ops = ops;
	edit->adjust_count_by = -1;

	switch (assoc_array_walk(array, ops, index_key, &result)) {
	case assoc_array_walk_found_terminal_node:
		/* We found a node that should contain the leaf we've been
		 * asked to remove - *if* it's in the tree.
		 */
		pr_devel("terminal_node\n");
		node = result.terminal_node.node;

		for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
			ptr = node->slots[slot];
			if (ptr &&
			    assoc_array_ptr_is_leaf(ptr) &&
			    ops->compare_object(assoc_array_ptr_to_leaf(ptr),
						index_key))
				goto found_leaf;
		}
		fallthrough;
	case assoc_array_walk_tree_empty:
	case assoc_array_walk_found_wrong_shortcut:
	default:
		assoc_array_cancel_edit(edit);
		pr_devel("not found\n");
		return NULL;
	}

found_leaf:
	BUG_ON(array->nr_leaves_on_tree <= 0);

	/* In the simplest form of deletion we just clear the slot and release
	 * the leaf after a suitable interval.
	 */
	edit->dead_leaf = node->slots[slot];
	edit->set[0].ptr = &node->slots[slot];
	edit->set[0].to = NULL;
	edit->adjust_count_on = node;

	/* If that concludes erasure of the last leaf, then delete the entire
	 * internal array.
	 */
	if (array->nr_leaves_on_tree == 1) {
		edit->set[1].ptr = &array->root;
		edit->set[1].to = NULL;
		edit->adjust_count_on = NULL;
		edit->excised_subtree = array->root;
		pr_devel("all gone\n");
		return edit;
	}

	/* However, we'd also like to clear up some metadata blocks if we
	 * possibly can.
	 *
	 * We go for a simple algorithm of: if this node has FAN_OUT or fewer
	 * leaves in it, then attempt to collapse it - and attempt to
	 * recursively collapse up the tree.
	 *
	 * We could also try and collapse in partially filled subtrees to take
	 * up space in this node.
	 */
	if (node->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT + 1) {
		struct assoc_array_node *parent, *grandparent;
		struct assoc_array_ptr *ptr;

		/* First of all, we need to know if this node has metadata so
		 * that we don't try collapsing if all the leaves are already
		 * here.
		 */
		has_meta = false;
		for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) {
			ptr = node->slots[i];
			if (assoc_array_ptr_is_meta(ptr)) {
				has_meta = true;
				break;
			}
		}

		pr_devel("leaves: %ld [m=%d]\n",
			 node->nr_leaves_on_branch - 1, has_meta);

		/* Look further up the tree to see if we can collapse this node
		 * into a more proximal node too.
		 */
		parent = node;
	collapse_up:
		pr_devel("collapse subtree: %ld\n", parent->nr_leaves_on_branch);

		ptr = parent->back_pointer;
		if (!ptr)
			goto do_collapse;
		if (assoc_array_ptr_is_shortcut(ptr)) {
			struct assoc_array_shortcut *s = assoc_array_ptr_to_shortcut(ptr);
			ptr = s->back_pointer;
			if (!ptr)
				goto do_collapse;
		}

		grandparent = assoc_array_ptr_to_node(ptr);
		if (grandparent->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT + 1) {
			parent = grandparent;
			goto collapse_up;
		}

	do_collapse:
		/* There's no point collapsing if the original node has no meta
		 * pointers to discard and if we didn't merge into one of that
		 * node's ancestry.
		 */
		if (has_meta || parent != node) {
			node = parent;

			/* Create a new node to collapse into */
			new_n0 = kzalloc(sizeof(struct assoc_array_node),
					 GFP_KERNEL);
			if (!new_n0)
				goto enomem;
			edit->new_meta[0] = assoc_array_node_to_ptr(new_n0);

			new_n0->back_pointer = node->back_pointer;
			new_n0->parent_slot = node->parent_slot;
			new_n0->nr_leaves_on_branch = node->nr_leaves_on_branch;
			edit->adjust_count_on = new_n0;

			collapse.node = new_n0;
			collapse.skip_leaf = assoc_array_ptr_to_leaf(edit->dead_leaf);
			collapse.slot = 0;
			assoc_array_subtree_iterate(assoc_array_node_to_ptr(node),
						    node->back_pointer,
						    assoc_array_delete_collapse_iterator,
						    &collapse);
			pr_devel("collapsed %d,%lu\n",
				 collapse.slot, new_n0->nr_leaves_on_branch);
			BUG_ON(collapse.slot != new_n0->nr_leaves_on_branch - 1);

			if (!node->back_pointer) {
				edit->set[1].ptr = &array->root;
			} else if (assoc_array_ptr_is_leaf(node->back_pointer)) {
				BUG();
			} else if (assoc_array_ptr_is_node(node->back_pointer)) {
				struct assoc_array_node *p =
					assoc_array_ptr_to_node(node->back_pointer);
				edit->set[1].ptr = &p->slots[node->parent_slot];
			} else if (assoc_array_ptr_is_shortcut(node->back_pointer)) {
				struct assoc_array_shortcut *s =
					assoc_array_ptr_to_shortcut(node->back_pointer);
				edit->set[1].ptr = &s->next_node;
			}
			edit->set[1].to = assoc_array_node_to_ptr(new_n0);
			edit->excised_subtree = assoc_array_node_to_ptr(node);
		}
	}

	return edit;

enomem:
	/* Clean up after an out of memory error */
	pr_devel("enomem\n");
	assoc_array_cancel_edit(edit);
	return ERR_PTR(-ENOMEM);
}

/**
 * assoc_array_clear - Script deletion of all objects from an associative array
 * @array: The array to clear.
 * @ops: The operations to use.
 *
 * Precalculate and preallocate a script for the deletion of all the objects
 * from an associative array.  This results in an edit script that can either
 * be applied or cancelled.
 *
 * The function returns a pointer to an edit script if there are objects to be
 * deleted, NULL if there are no objects in the array or -ENOMEM.
 *
 * The caller should lock against other modifications and must continue to hold
 * the lock until assoc_array_apply_edit() has been called.
 *
 * Accesses to the tree may take place concurrently with this function,
 * provided they hold the RCU read lock.
 */
struct assoc_array_edit *assoc_array_clear(struct assoc_array *array,
					   const struct assoc_array_ops *ops)
{
	struct assoc_array_edit *edit;

	pr_devel("-->%s()\n", __func__);

	if (!array->root)
		return NULL;

	edit = kzalloc(sizeof(struct assoc_array_edit), GFP_KERNEL);
	if (!edit)
		return ERR_PTR(-ENOMEM);
	edit->array = array;
	edit->ops = ops;
	edit->set[1].ptr = &array->root;
	edit->set[1].to = NULL;
	edit->excised_subtree = array->root;
	edit->ops_for_excised_subtree = ops;
	pr_devel("all gone\n");
	return edit;
}
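/*
 * Illustrative sketch (not part of the original file): emptying an array via
 * an edit script.  Note that assoc_array_clear() returning NULL (nothing to
 * delete) is not an error.
 */
#if 0
int example_clear(struct assoc_array *array, const struct assoc_array_ops *my_ops)
{
	struct assoc_array_edit *edit;

	edit = assoc_array_clear(array, my_ops);
	if (IS_ERR(edit))
		return PTR_ERR(edit);
	if (edit)
		assoc_array_apply_edit(edit); /* Objects freed after an RCU grace period. */
	return 0;
}
#endif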
/*
 * Handle the deferred destruction after an applied edit.
 */
static void assoc_array_rcu_cleanup(struct rcu_head *head)
{
	struct assoc_array_edit *edit =
		container_of(head, struct assoc_array_edit, rcu);
	int i;

	pr_devel("-->%s()\n", __func__);

	if (edit->dead_leaf)
		edit->ops->free_object(assoc_array_ptr_to_leaf(edit->dead_leaf));
	for (i = 0; i < ARRAY_SIZE(edit->excised_meta); i++)
		if (edit->excised_meta[i])
			kfree(assoc_array_ptr_to_node(edit->excised_meta[i]));

	if (edit->excised_subtree) {
		BUG_ON(assoc_array_ptr_is_leaf(edit->excised_subtree));
		if (assoc_array_ptr_is_node(edit->excised_subtree)) {
			struct assoc_array_node *n =
				assoc_array_ptr_to_node(edit->excised_subtree);
			n->back_pointer = NULL;
		} else {
			struct assoc_array_shortcut *s =
				assoc_array_ptr_to_shortcut(edit->excised_subtree);
			s->back_pointer = NULL;
		}
		assoc_array_destroy_subtree(edit->excised_subtree,
					    edit->ops_for_excised_subtree);
	}

	kfree(edit);
}

/**
 * assoc_array_apply_edit - Apply an edit script to an associative array
 * @edit: The script to apply.
 *
 * Apply an edit script to an associative array to effect an insertion,
 * deletion or clearance.  As the edit script includes preallocated memory,
 * this is guaranteed not to fail.
 *
 * The edit script, dead objects and dead metadata will be scheduled for
 * destruction after an RCU grace period to permit those doing read-only
 * accesses on the array to continue to do so under the RCU read lock whilst
 * the edit is taking place.
 */
void assoc_array_apply_edit(struct assoc_array_edit *edit)
{
	struct assoc_array_shortcut *shortcut;
	struct assoc_array_node *node;
	struct assoc_array_ptr *ptr;
	int i;

	pr_devel("-->%s()\n", __func__);

	smp_wmb();
	if (edit->leaf_p)
		*edit->leaf_p = edit->leaf;

	smp_wmb();
	for (i = 0; i < ARRAY_SIZE(edit->set_parent_slot); i++)
		if (edit->set_parent_slot[i].p)
			*edit->set_parent_slot[i].p = edit->set_parent_slot[i].to;

	smp_wmb();
	for (i = 0; i < ARRAY_SIZE(edit->set_backpointers); i++)
		if (edit->set_backpointers[i])
			*edit->set_backpointers[i] = edit->set_backpointers_to;

	smp_wmb();
	for (i = 0; i < ARRAY_SIZE(edit->set); i++)
		if (edit->set[i].ptr)
			*edit->set[i].ptr = edit->set[i].to;

	if (edit->array->root == NULL) {
		edit->array->nr_leaves_on_tree = 0;
	} else if (edit->adjust_count_on) {
		node = edit->adjust_count_on;
		for (;;) {
			node->nr_leaves_on_branch += edit->adjust_count_by;

			ptr = node->back_pointer;
			if (!ptr)
				break;
			if (assoc_array_ptr_is_shortcut(ptr)) {
				shortcut = assoc_array_ptr_to_shortcut(ptr);
				ptr = shortcut->back_pointer;
				if (!ptr)
					break;
			}
			BUG_ON(!assoc_array_ptr_is_node(ptr));
			node = assoc_array_ptr_to_node(ptr);
		}

		edit->array->nr_leaves_on_tree += edit->adjust_count_by;
	}

	call_rcu(&edit->rcu, assoc_array_rcu_cleanup);
}

/**
 * assoc_array_cancel_edit - Discard an edit script.
 * @edit: The script to discard.
 *
 * Free an edit script and all the preallocated data it holds without making
 * any changes to the associative array it was intended for.
 *
 * NOTE!  In the case of an insertion script, this does _not_ release the leaf
 * that was to be inserted.  That is left to the caller.
 */
void assoc_array_cancel_edit(struct assoc_array_edit *edit)
{
	struct assoc_array_ptr *ptr;
	int i;

	pr_devel("-->%s()\n", __func__);

	/* Clean up after an out of memory error */
	for (i = 0; i < ARRAY_SIZE(edit->new_meta); i++) {
		ptr = edit->new_meta[i];
		if (ptr) {
			if (assoc_array_ptr_is_node(ptr))
				kfree(assoc_array_ptr_to_node(ptr));
			else
				kfree(assoc_array_ptr_to_shortcut(ptr));
		}
	}

	kfree(edit);
}
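/*
 * Illustrative note (not part of the original file): the smp_wmb() calls in
 * assoc_array_apply_edit() pair with the address-dependency ordering on the
 * reader side - lookups load each pointer with READ_ONCE() and only then
 * dereference it, as in assoc_array_walk() and assoc_array_find() above:
 *
 *	ptr = READ_ONCE(node->slots[slot]);	// Address dependency.
 *	if (ptr && assoc_array_ptr_is_leaf(ptr))
 *		leaf = assoc_array_ptr_to_leaf(ptr);
 *
 * This ensures that a reader which sees a newly installed pointer also sees
 * the fully initialised structure it points to.
 */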
/**
 * assoc_array_gc - Garbage collect an associative array.
 * @array: The array to clean.
 * @ops: The operations to use.
 * @iterator: A callback function to pass judgement on each object.
 * @iterator_data: Private data for the callback function.
 *
 * Collect garbage from an associative array and pack down the internal tree to
 * save memory.
 *
 * The iterator function is asked to pass judgement upon each object in the
 * array.  If it returns false, the object is discarded; if it returns true,
 * the object is kept and the iterator must increment the object's usage count
 * (or do whatever else it needs to do to retain it) before returning.
 *
 * This function returns 0 if successful or -ENOMEM if out of memory.  In the
 * latter case, the array is not changed.
 *
 * The caller should lock against other modifications and must continue to hold
 * the lock until assoc_array_apply_edit() has been called.
 *
 * Accesses to the tree may take place concurrently with this function,
 * provided they hold the RCU read lock.
 */
int assoc_array_gc(struct assoc_array *array,
		   const struct assoc_array_ops *ops,
		   bool (*iterator)(void *object, void *iterator_data),
		   void *iterator_data)
{
	struct assoc_array_shortcut *shortcut, *new_s;
	struct assoc_array_node *node, *new_n;
	struct assoc_array_edit *edit;
	struct assoc_array_ptr *cursor, *ptr;
	struct assoc_array_ptr *new_root, *new_parent, **new_ptr_pp;
	unsigned long nr_leaves_on_tree;
	bool retained;
	int keylen, slot, nr_free, next_slot, i;

	pr_devel("-->%s()\n", __func__);

	if (!array->root)
		return 0;

	edit = kzalloc(sizeof(struct assoc_array_edit), GFP_KERNEL);
	if (!edit)
		return -ENOMEM;
	edit->array = array;
	edit->ops = ops;
	edit->ops_for_excised_subtree = ops;
	edit->set[0].ptr = &array->root;
	edit->excised_subtree = array->root;

	new_root = new_parent = NULL;
	new_ptr_pp = &new_root;
	cursor = array->root;

descend:
	/* If this point is a shortcut, then we need to duplicate it and
	 * advance the target cursor.
	 */
	if (assoc_array_ptr_is_shortcut(cursor)) {
		shortcut = assoc_array_ptr_to_shortcut(cursor);
		keylen = round_up(shortcut->skip_to_level,
				  ASSOC_ARRAY_KEY_CHUNK_SIZE);
		keylen >>= ASSOC_ARRAY_KEY_CHUNK_SHIFT;
		new_s = kmalloc(struct_size(new_s, index_key, keylen),
				GFP_KERNEL);
		if (!new_s)
			goto enomem;
		pr_devel("dup shortcut %p -> %p\n", shortcut, new_s);
		memcpy(new_s, shortcut, struct_size(new_s, index_key, keylen));
		new_s->back_pointer = new_parent;
		new_s->parent_slot = shortcut->parent_slot;
		*new_ptr_pp = new_parent = assoc_array_shortcut_to_ptr(new_s);
		new_ptr_pp = &new_s->next_node;
		cursor = shortcut->next_node;
	}

	/* Duplicate the node at this position */
	node = assoc_array_ptr_to_node(cursor);
	new_n = kzalloc(sizeof(struct assoc_array_node), GFP_KERNEL);
	if (!new_n)
		goto enomem;
	pr_devel("dup node %p -> %p\n", node, new_n);
	new_n->back_pointer = new_parent;
	new_n->parent_slot = node->parent_slot;
	*new_ptr_pp = new_parent = assoc_array_node_to_ptr(new_n);
	new_ptr_pp = NULL;
	slot = 0;

continue_node:
	/* Filter across any leaves and gc any subtrees */
	for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		ptr = node->slots[slot];
		if (!ptr)
			continue;

		if (assoc_array_ptr_is_leaf(ptr)) {
			if (iterator(assoc_array_ptr_to_leaf(ptr),
				     iterator_data))
				/* The iterator will have done any reference
				 * counting on the object for us.
				 */
				new_n->slots[slot] = ptr;
			continue;
		}

		new_ptr_pp = &new_n->slots[slot];
		cursor = ptr;
		goto descend;
	}

retry_compress:
	pr_devel("-- compress node %p --\n", new_n);

	/* Count up the number of empty slots in this node and work out the
	 * subtree leaf count.
	 */
	new_n->nr_leaves_on_branch = 0;
	nr_free = 0;
	for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		ptr = new_n->slots[slot];
		if (!ptr)
			nr_free++;
		else if (assoc_array_ptr_is_leaf(ptr))
			new_n->nr_leaves_on_branch++;
	}
	pr_devel("free=%d, leaves=%lu\n", nr_free, new_n->nr_leaves_on_branch);

	/* See what we can fold in */
	retained = false;
	next_slot = 0;
	for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
		struct assoc_array_shortcut *s;
		struct assoc_array_node *child;

		ptr = new_n->slots[slot];
		if (!ptr || assoc_array_ptr_is_leaf(ptr))
			continue;

		s = NULL;
		if (assoc_array_ptr_is_shortcut(ptr)) {
			s = assoc_array_ptr_to_shortcut(ptr);
			ptr = s->next_node;
		}

		child = assoc_array_ptr_to_node(ptr);
		new_n->nr_leaves_on_branch += child->nr_leaves_on_branch;

		if (child->nr_leaves_on_branch <= nr_free + 1) {
			/* Fold the child node into this one */
			pr_devel("[%d] fold node %lu/%d [nx %d]\n",
				 slot, child->nr_leaves_on_branch, nr_free + 1,
				 next_slot);

			/* We would already have reaped an intervening shortcut
			 * on the way back up the tree.
			 */
			BUG_ON(s);
			new_n->slots[slot] = NULL;
			nr_free++;
			if (slot < next_slot)
				next_slot = slot;
			for (i = 0; i < ASSOC_ARRAY_FAN_OUT; i++) {
				struct assoc_array_ptr *p = child->slots[i];
				if (!p)
					continue;
				BUG_ON(assoc_array_ptr_is_meta(p));
				while (new_n->slots[next_slot])
					next_slot++;
				BUG_ON(next_slot >= ASSOC_ARRAY_FAN_OUT);
				new_n->slots[next_slot++] = p;
				nr_free--;
			}
			kfree(child);
		} else {
			pr_devel("[%d] retain node %lu/%d [nx %d]\n",
				 slot, child->nr_leaves_on_branch, nr_free + 1,
				 next_slot);
			retained = true;
		}
	}

	if (retained && new_n->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT) {
		pr_devel("internal nodes remain despite enough space, retrying\n");
		goto retry_compress;
	}
	pr_devel("after: %lu\n", new_n->nr_leaves_on_branch);

	nr_leaves_on_tree = new_n->nr_leaves_on_branch;

	/* Excise this node if it is singly occupied by a shortcut */
	if (nr_free == ASSOC_ARRAY_FAN_OUT - 1) {
		for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++)
			if ((ptr = new_n->slots[slot]))
				break;

		if (assoc_array_ptr_is_meta(ptr) &&
		    assoc_array_ptr_is_shortcut(ptr)) {
			pr_devel("excise node %p with 1 shortcut\n", new_n);
			new_s = assoc_array_ptr_to_shortcut(ptr);
			new_parent = new_n->back_pointer;
			slot = new_n->parent_slot;
			kfree(new_n);
			if (!new_parent) {
				new_s->back_pointer = NULL;
				new_s->parent_slot = 0;
				new_root = ptr;
				goto gc_complete;
			}

			if (assoc_array_ptr_is_shortcut(new_parent)) {
				/* We can discard any preceding shortcut also */
				struct assoc_array_shortcut *s =
					assoc_array_ptr_to_shortcut(new_parent);

				pr_devel("excise preceding shortcut\n");

				new_parent = new_s->back_pointer = s->back_pointer;
				slot = new_s->parent_slot = s->parent_slot;
				kfree(s);
				if (!new_parent) {
					new_s->back_pointer = NULL;
					new_s->parent_slot = 0;
					new_root = ptr;
					goto gc_complete;
				}
			}

			new_s->back_pointer = new_parent;
			new_s->parent_slot = slot;
			new_n = assoc_array_ptr_to_node(new_parent);
			new_n->slots[slot] = ptr;
			goto ascend_old_tree;
		}
	}

	/* Excise any shortcuts we might encounter that point to nodes that
	 * only contain leaves.
	 */
	ptr = new_n->back_pointer;
	if (!ptr)
		goto gc_complete;

	if (assoc_array_ptr_is_shortcut(ptr)) {
		new_s = assoc_array_ptr_to_shortcut(ptr);
		new_parent = new_s->back_pointer;
		slot = new_s->parent_slot;

		if (new_n->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT) {
			struct assoc_array_node *n;

			pr_devel("excise shortcut\n");
			new_n->back_pointer = new_parent;
			new_n->parent_slot = slot;
			kfree(new_s);
			if (!new_parent) {
				new_root = assoc_array_node_to_ptr(new_n);
				goto gc_complete;
			}

			n = assoc_array_ptr_to_node(new_parent);
			n->slots[slot] = assoc_array_node_to_ptr(new_n);
		}
	} else {
		new_parent = ptr;
	}
	new_n = assoc_array_ptr_to_node(new_parent);

ascend_old_tree:
	ptr = node->back_pointer;
	if (assoc_array_ptr_is_shortcut(ptr)) {
		shortcut = assoc_array_ptr_to_shortcut(ptr);
		slot = shortcut->parent_slot;
		cursor = shortcut->back_pointer;
		if (!cursor)
			goto gc_complete;
	} else {
		slot = node->parent_slot;
		cursor = ptr;
	}
	BUG_ON(!cursor);
	node = assoc_array_ptr_to_node(cursor);
	slot++;
	goto continue_node;

gc_complete:
	edit->set[0].to = new_root;
	assoc_array_apply_edit(edit);
	array->nr_leaves_on_tree = nr_leaves_on_tree;
	return 0;

enomem:
	pr_devel("enomem\n");
	assoc_array_destroy_subtree(new_root, edit->ops);
	kfree(edit);
	return -ENOMEM;
}
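/*
 * Illustrative sketch (not part of the original file): a GC judgement callback
 * as a caller might pass to assoc_array_gc().  The object type and its
 * dead/usage fields are hypothetical.
 */
#if 0
static bool example_gc_keep(void *object, void *iterator_data)
{
	struct my_object *obj = object;

	if (obj->dead)
		return false;		/* Discard: dropped from the new tree. */
	refcount_inc(&obj->usage);	/* Keep: take the reference GC requires. */
	return true;
}
#endif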
// SPDX-License-Identifier: GPL-2.0
/* Copyright (C) B.A.T.M.A.N. contributors:
 *
 * Antonio Quartulli
 */

#include "distributed-arp-table.h"
#include "main.h"

#include <linux/atomic.h>
#include <linux/bitops.h>
#include <linux/byteorder/generic.h>
#include <linux/container_of.h>
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/etherdevice.h>
#include <linux/gfp.h>
#include <linux/if_arp.h>
#include <linux/if_ether.h>
#include <linux/if_vlan.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/jiffies.h>
#include <linux/kref.h>
#include <linux/list.h>
#include <linux/netlink.h>
#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/stddef.h>
#include <linux/string.h>
#include <linux/udp.h>
#include <linux/unaligned.h>
#include <linux/workqueue.h>
#include <net/arp.h>
#include <net/genetlink.h>
#include <net/netlink.h>
#include <uapi/linux/batman_adv.h>

#include "bridge_loop_avoidance.h"
#include "hard-interface.h"
#include "hash.h"
#include "log.h"
#include "netlink.h"
#include "originator.h"
#include "send.h"
#include "translation-table.h"
#include "tvlv.h"

enum batadv_bootpop {
	BATADV_BOOTREPLY	= 2,
};

enum batadv_boothtype {
	BATADV_HTYPE_ETHERNET	= 1,
};

enum batadv_dhcpoptioncode {
	BATADV_DHCP_OPT_PAD		= 0,
	BATADV_DHCP_OPT_MSG_TYPE	= 53,
	BATADV_DHCP_OPT_END		= 255,
};

enum batadv_dhcptype {
	BATADV_DHCPACK	= 5,
};

/* { 99, 130, 83, 99 } */
#define BATADV_DHCP_MAGIC 1669485411

struct batadv_dhcp_packet {
	__u8 op;
	__u8 htype;
	__u8 hlen;
	__u8 hops;
	__be32 xid;
	__be16 secs;
	__be16 flags;
	__be32 ciaddr;
	__be32 yiaddr;
	__be32 siaddr;
	__be32 giaddr;
	__u8 chaddr[16];
	__u8 sname[64];
	__u8 file[128];
	__be32 magic;
	/* __u8 options[]; */
};

#define BATADV_DHCP_YIADDR_LEN sizeof(((struct batadv_dhcp_packet *)0)->yiaddr)
#define BATADV_DHCP_CHADDR_LEN sizeof(((struct batadv_dhcp_packet *)0)->chaddr)

static void batadv_dat_purge(struct work_struct *work);

/**
 * batadv_dat_start_timer() - initialise the DAT periodic worker
 * @bat_priv: the bat priv with all the mesh interface information
 */
static void batadv_dat_start_timer(struct batadv_priv *bat_priv)
{
	queue_delayed_work(batadv_event_workqueue, &bat_priv->dat.work,
			   msecs_to_jiffies(10000));
}

/**
 * batadv_dat_entry_release() - release dat_entry from lists and queue for free
 *  after rcu grace period
 * @ref: kref pointer of the dat_entry
 */
static void batadv_dat_entry_release(struct kref *ref)
{
	struct batadv_dat_entry *dat_entry;

	dat_entry = container_of(ref, struct batadv_dat_entry, refcount);

	kfree_rcu(dat_entry, rcu);
}
/**
 * batadv_dat_entry_put() - decrement the dat_entry refcounter and possibly
 *  release it
 * @dat_entry: dat_entry to be free'd
 */
static void batadv_dat_entry_put(struct batadv_dat_entry *dat_entry)
{
	if (!dat_entry)
		return;

	kref_put(&dat_entry->refcount, batadv_dat_entry_release);
}

/**
 * batadv_dat_to_purge() - check whether a dat_entry has to be purged or not
 * @dat_entry: the entry to check
 *
 * Return: true if the entry has to be purged now, false otherwise.
 */
static bool batadv_dat_to_purge(struct batadv_dat_entry *dat_entry)
{
	return batadv_has_timed_out(dat_entry->last_update,
				    BATADV_DAT_ENTRY_TIMEOUT);
}

/**
 * __batadv_dat_purge() - delete entries from the DAT local storage
 * @bat_priv: the bat priv with all the mesh interface information
 * @to_purge: function in charge of deciding whether an entry has to be purged
 *  or not. This function takes the dat_entry as argument and has to return a
 *  boolean value: true if the entry has to be deleted, false otherwise
 *
 * Loops over each entry in the DAT local storage and deletes it if and only if
 * the to_purge function passed as argument returns true.
 */
static void __batadv_dat_purge(struct batadv_priv *bat_priv,
			       bool (*to_purge)(struct batadv_dat_entry *))
{
	spinlock_t *list_lock; /* protects write access to the hash lists */
	struct batadv_dat_entry *dat_entry;
	struct hlist_node *node_tmp;
	struct hlist_head *head;
	u32 i;

	if (!bat_priv->dat.hash)
		return;

	for (i = 0; i < bat_priv->dat.hash->size; i++) {
		head = &bat_priv->dat.hash->table[i];
		list_lock = &bat_priv->dat.hash->list_locks[i];

		spin_lock_bh(list_lock);
		hlist_for_each_entry_safe(dat_entry, node_tmp, head,
					  hash_entry) {
			/* if a helper function has been passed as parameter,
			 * ask it if the entry has to be purged or not
			 */
			if (to_purge && !to_purge(dat_entry))
				continue;

			hlist_del_rcu(&dat_entry->hash_entry);
			batadv_dat_entry_put(dat_entry);
		}
		spin_unlock_bh(list_lock);
	}
}

/**
 * batadv_dat_purge() - periodic task that deletes old entries from the local
 *  DAT hash table
 * @work: kernel work struct
 */
static void batadv_dat_purge(struct work_struct *work)
{
	struct delayed_work *delayed_work;
	struct batadv_priv_dat *priv_dat;
	struct batadv_priv *bat_priv;

	delayed_work = to_delayed_work(work);
	priv_dat = container_of(delayed_work, struct batadv_priv_dat, work);
	bat_priv = container_of(priv_dat, struct batadv_priv, dat);

	__batadv_dat_purge(bat_priv, batadv_dat_to_purge);
	batadv_dat_start_timer(bat_priv);
}

/**
 * batadv_compare_dat() - comparing function used in the local DAT hash table
 * @node: node in the local table
 * @data2: second object to compare the node to
 *
 * Return: true if the two entries are the same, false otherwise.
 */
static bool batadv_compare_dat(const struct hlist_node *node, const void *data2)
{
	const void *data1 = container_of(node, struct batadv_dat_entry,
					 hash_entry);

	return memcmp(data1, data2, sizeof(__be32)) == 0;
}
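/*
 * Illustrative note (not part of the original file): because
 * __batadv_dat_purge() only skips an entry when a predicate is supplied and
 * returns false, passing NULL as @to_purge unconditionally flushes the whole
 * local DAT storage, e.g.:
 *
 *	__batadv_dat_purge(bat_priv, NULL);	// drop every entry
 *
 * while the periodic worker passes batadv_dat_to_purge so that only entries
 * whose last_update exceeded BATADV_DAT_ENTRY_TIMEOUT are dropped.
 */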
/**
 * batadv_arp_hw_src() - extract the hw_src field from an ARP packet
 * @skb: ARP packet
 * @hdr_size: size of the possible header before the ARP packet
 *
 * Return: the value of the hw_src field in the ARP packet.
 */
static u8 *batadv_arp_hw_src(struct sk_buff *skb, int hdr_size)
{
	u8 *addr;

	addr = (u8 *)(skb->data + hdr_size);
	addr += ETH_HLEN + sizeof(struct arphdr);

	return addr;
}

/**
 * batadv_arp_ip_src() - extract the ip_src field from an ARP packet
 * @skb: ARP packet
 * @hdr_size: size of the possible header before the ARP packet
 *
 * Return: the value of the ip_src field in the ARP packet.
 */
static __be32 batadv_arp_ip_src(struct sk_buff *skb, int hdr_size)
{
	return *(__force __be32 *)(batadv_arp_hw_src(skb, hdr_size) + ETH_ALEN);
}

/**
 * batadv_arp_hw_dst() - extract the hw_dst field from an ARP packet
 * @skb: ARP packet
 * @hdr_size: size of the possible header before the ARP packet
 *
 * Return: the value of the hw_dst field in the ARP packet.
 */
static u8 *batadv_arp_hw_dst(struct sk_buff *skb, int hdr_size)
{
	return batadv_arp_hw_src(skb, hdr_size) + ETH_ALEN + 4;
}

/**
 * batadv_arp_ip_dst() - extract the ip_dst field from an ARP packet
 * @skb: ARP packet
 * @hdr_size: size of the possible header before the ARP packet
 *
 * Return: the value of the ip_dst field in the ARP packet.
 */
static __be32 batadv_arp_ip_dst(struct sk_buff *skb, int hdr_size)
{
	u8 *dst = batadv_arp_hw_src(skb, hdr_size) + ETH_ALEN * 2 + 4;

	return *(__force __be32 *)dst;
}

/**
 * batadv_hash_dat() - compute the hash value for an IP address
 * @data: data to hash
 * @size: size of the hash table
 *
 * Return: the selected index in the hash table for the given data.
 */
static u32 batadv_hash_dat(const void *data, u32 size)
{
	u32 hash = 0;
	const struct batadv_dat_entry *dat = data;
	const unsigned char *key;
	__be16 vid;
	u32 i;

	key = (__force const unsigned char *)&dat->ip;
	for (i = 0; i < sizeof(dat->ip); i++) {
		hash += key[i];
		hash += (hash << 10);
		hash ^= (hash >> 6);
	}

	vid = htons(dat->vid);
	key = (__force const unsigned char *)&vid;
	for (i = 0; i < sizeof(dat->vid); i++) {
		hash += key[i];
		hash += (hash << 10);
		hash ^= (hash >> 6);
	}

	hash += (hash << 3);
	hash ^= (hash >> 11);
	hash += (hash << 15);

	return hash % size;
}
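/*
 * Illustrative note (not part of the original file): the fixed offsets used by
 * the extractors above correspond to an Ethernet/IPv4 ARP payload, which
 * follows struct arphdr as:
 *
 *	sender hw addr (hw_src, ETH_ALEN = 6)	<- batadv_arp_hw_src()
 *	sender IP addr (ip_src, 4 bytes)	<- hw_src + ETH_ALEN
 *	target hw addr (hw_dst, ETH_ALEN = 6)	<- hw_src + ETH_ALEN + 4
 *	target IP addr (ip_dst, 4 bytes)	<- hw_src + ETH_ALEN * 2 + 4
 *
 * batadv_hash_dat() itself is Bob Jenkins' one-at-a-time hash applied to the
 * entry's IP address and VLAN id.
 */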
/**
 * batadv_dat_entry_hash_find() - look for a given dat_entry in the local hash
 *  table
 * @bat_priv: the bat priv with all the mesh interface information
 * @ip: search key
 * @vid: VLAN identifier
 *
 * Return: the dat_entry if found, NULL otherwise.
 */
static struct batadv_dat_entry *
batadv_dat_entry_hash_find(struct batadv_priv *bat_priv, __be32 ip,
			   unsigned short vid)
{
	struct hlist_head *head;
	struct batadv_dat_entry to_find, *dat_entry, *dat_entry_tmp = NULL;
	struct batadv_hashtable *hash = bat_priv->dat.hash;
	u32 index;

	if (!hash)
		return NULL;

	to_find.ip = ip;
	to_find.vid = vid;

	index = batadv_hash_dat(&to_find, hash->size);
	head = &hash->table[index];

	rcu_read_lock();
	hlist_for_each_entry_rcu(dat_entry, head, hash_entry) {
		if (dat_entry->ip != ip)
			continue;

		if (!kref_get_unless_zero(&dat_entry->refcount))
			continue;

		dat_entry_tmp = dat_entry;
		break;
	}
	rcu_read_unlock();

	return dat_entry_tmp;
}

/**
 * batadv_dat_entry_add() - add a new dat entry or update it if it already
 *  exists
 * @bat_priv: the bat priv with all the mesh interface information
 * @ip: ipv4 to add/edit
 * @mac_addr: mac address to assign to the given ipv4
 * @vid: VLAN identifier
 */
static void batadv_dat_entry_add(struct batadv_priv *bat_priv, __be32 ip,
				 u8 *mac_addr, unsigned short vid)
{
	struct batadv_dat_entry *dat_entry;
	int hash_added;

	dat_entry = batadv_dat_entry_hash_find(bat_priv, ip, vid);
	/* if this entry is already known, just update it */
	if (dat_entry) {
		if (!batadv_compare_eth(dat_entry->mac_addr, mac_addr))
			ether_addr_copy(dat_entry->mac_addr, mac_addr);
		dat_entry->last_update = jiffies;
		batadv_dbg(BATADV_DBG_DAT, bat_priv,
			   "Entry updated: %pI4 %pM (vid: %d)\n",
			   &dat_entry->ip, dat_entry->mac_addr,
			   batadv_print_vid(vid));
		goto out;
	}

	dat_entry = kmalloc(sizeof(*dat_entry), GFP_ATOMIC);
	if (!dat_entry)
		goto out;

	dat_entry->ip = ip;
	dat_entry->vid = vid;
	ether_addr_copy(dat_entry->mac_addr, mac_addr);
	dat_entry->last_update = jiffies;
	kref_init(&dat_entry->refcount);

	kref_get(&dat_entry->refcount);
	hash_added = batadv_hash_add(bat_priv->dat.hash, batadv_compare_dat,
				     batadv_hash_dat, dat_entry,
				     &dat_entry->hash_entry);

	if (unlikely(hash_added != 0)) {
		/* remove the reference for the hash */
		batadv_dat_entry_put(dat_entry);
		goto out;
	}

	batadv_dbg(BATADV_DBG_DAT, bat_priv, "New entry added: %pI4 %pM (vid: %d)\n",
		   &dat_entry->ip, dat_entry->mac_addr, batadv_print_vid(vid));

out:
	batadv_dat_entry_put(dat_entry);
}

#ifdef CONFIG_BATMAN_ADV_DEBUG

/**
 * batadv_dbg_arp() - print a debug message containing all the ARP packet
 *  details
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: ARP packet
 * @hdr_size: size of the possible header before the ARP packet
 * @msg: message to print together with the debugging information
 */
static void batadv_dbg_arp(struct batadv_priv *bat_priv, struct sk_buff *skb,
			   int hdr_size, char *msg)
{
	struct batadv_unicast_4addr_packet *unicast_4addr_packet;
	struct batadv_bcast_packet *bcast_pkt;
	u8 *orig_addr;
	__be32 ip_src, ip_dst;

	if (msg)
		batadv_dbg(BATADV_DBG_DAT, bat_priv, "%s\n", msg);

	ip_src = batadv_arp_ip_src(skb, hdr_size);
	ip_dst = batadv_arp_ip_dst(skb, hdr_size);
	batadv_dbg(BATADV_DBG_DAT, bat_priv,
		   "ARP MSG = [src: %pM-%pI4 dst: %pM-%pI4]\n",
		   batadv_arp_hw_src(skb, hdr_size), &ip_src,
		   batadv_arp_hw_dst(skb, hdr_size), &ip_dst);

	if (hdr_size < sizeof(struct batadv_unicast_packet))
		return;

	unicast_4addr_packet = (struct batadv_unicast_4addr_packet *)skb->data;

	switch (unicast_4addr_packet->u.packet_type) {
	case BATADV_UNICAST:
		batadv_dbg(BATADV_DBG_DAT, bat_priv,
			   "* encapsulated within a UNICAST packet\n");
		break;
	case BATADV_UNICAST_4ADDR:
		batadv_dbg(BATADV_DBG_DAT, bat_priv,
			   "* encapsulated within a UNICAST_4ADDR packet (src: %pM)\n",
			   unicast_4addr_packet->src);
		switch (unicast_4addr_packet->subtype) {
case BATADV_P_DAT_DHT_PUT: batadv_dbg(BATADV_DBG_DAT, bat_priv, "* type: DAT_DHT_PUT\n"); break; case BATADV_P_DAT_DHT_GET: batadv_dbg(BATADV_DBG_DAT, bat_priv, "* type: DAT_DHT_GET\n"); break; case BATADV_P_DAT_CACHE_REPLY: batadv_dbg(BATADV_DBG_DAT, bat_priv, "* type: DAT_CACHE_REPLY\n"); break; case BATADV_P_DATA: batadv_dbg(BATADV_DBG_DAT, bat_priv, "* type: DATA\n"); break; default: batadv_dbg(BATADV_DBG_DAT, bat_priv, "* type: Unknown (%u)!\n", unicast_4addr_packet->subtype); } break; case BATADV_BCAST: bcast_pkt = (struct batadv_bcast_packet *)unicast_4addr_packet; orig_addr = bcast_pkt->orig; batadv_dbg(BATADV_DBG_DAT, bat_priv, "* encapsulated within a BCAST packet (src: %pM)\n", orig_addr); break; default: batadv_dbg(BATADV_DBG_DAT, bat_priv, "* encapsulated within an unknown packet type (0x%x)\n", unicast_4addr_packet->u.packet_type); } } #else static void batadv_dbg_arp(struct batadv_priv *bat_priv, struct sk_buff *skb, int hdr_size, char *msg) { } #endif /* CONFIG_BATMAN_ADV_DEBUG */ /** * batadv_is_orig_node_eligible() - check whether a node can be a DHT candidate * @res: the array with the already selected candidates * @select: number of already selected candidates * @tmp_max: address of the currently evaluated node * @max: current round max address * @last_max: address of the last selected candidate * @candidate: orig_node under evaluation * @max_orig_node: last selected candidate * * Return: true if the node has been elected as the next candidate or false * otherwise. */ static bool batadv_is_orig_node_eligible(struct batadv_dat_candidate *res, int select, batadv_dat_addr_t tmp_max, batadv_dat_addr_t max, batadv_dat_addr_t last_max, struct batadv_orig_node *candidate, struct batadv_orig_node *max_orig_node) { bool ret = false; int j; /* check if orig node candidate is running DAT */ if (!test_bit(BATADV_ORIG_CAPA_HAS_DAT, &candidate->capabilities)) goto out; /* Check if this node has already been selected... */ for (j = 0; j < select; j++) if (res[j].orig_node == candidate) break; /* ..and possibly skip it */ if (j < select) goto out; /* sanity check: a node with tmp_max > last_max would already have been * selected in an earlier round */ if (tmp_max > last_max) goto out; /* check if during this iteration an originator with a closer dht * address has already been found */ if (tmp_max < max) goto out; /* this is a hash collision with the temporary selected node.
Choose * the one with the lowest address */ if (tmp_max == max && max_orig_node && batadv_compare_eth(candidate->orig, max_orig_node->orig)) goto out; ret = true; out: return ret; } /** * batadv_choose_next_candidate() - select the next DHT candidate * @bat_priv: the bat priv with all the mesh interface information * @cands: candidates array * @select: number of candidates already present in the array * @ip_key: key to look up in the DHT * @last_max: pointer where the address of the selected candidate will be saved */ static void batadv_choose_next_candidate(struct batadv_priv *bat_priv, struct batadv_dat_candidate *cands, int select, batadv_dat_addr_t ip_key, batadv_dat_addr_t *last_max) { batadv_dat_addr_t max = 0; batadv_dat_addr_t tmp_max = 0; struct batadv_orig_node *orig_node, *max_orig_node = NULL; struct batadv_hashtable *hash = bat_priv->orig_hash; struct hlist_head *head; int i; /* if no node is eligible as candidate, leave the candidate type as * NOT_FOUND */ cands[select].type = BATADV_DAT_CANDIDATE_NOT_FOUND; /* iterate over the originator list and find the node with the closest * dat_address which has not been selected yet */ for (i = 0; i < hash->size; i++) { head = &hash->table[i]; rcu_read_lock(); hlist_for_each_entry_rcu(orig_node, head, hash_entry) { /* the dht space is a ring using unsigned addresses */ tmp_max = BATADV_DAT_ADDR_MAX - orig_node->dat_addr + ip_key; if (!batadv_is_orig_node_eligible(cands, select, tmp_max, max, *last_max, orig_node, max_orig_node)) continue; if (!kref_get_unless_zero(&orig_node->refcount)) continue; max = tmp_max; batadv_orig_node_put(max_orig_node); max_orig_node = orig_node; } rcu_read_unlock(); } if (max_orig_node) { cands[select].type = BATADV_DAT_CANDIDATE_ORIG; cands[select].orig_node = max_orig_node; batadv_dbg(BATADV_DBG_DAT, bat_priv, "dat_select_candidates() %d: selected %pM addr=%u dist=%u\n", select, max_orig_node->orig, max_orig_node->dat_addr, max); } *last_max = max; } /** * batadv_dat_select_candidates() - select the nodes which the DHT message has * to be sent to * @bat_priv: the bat priv with all the mesh interface information * @ip_dst: ipv4 to look up in the DHT * @vid: VLAN identifier * * An originator O is selected if and only if its DHT_ID value is one of the * three values closest (from the LEFT, with wrap around if needed) to the * hash value of the key. ip_dst is the key. * * Return: the candidate array of size BATADV_DAT_CANDIDATES_NUM.
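 *
 * Editorial worked example (editor-added, assuming batadv_dat_addr_t is u16,
 * so that BATADV_DAT_ADDR_MAX == 0xFFFF): the per-node distance computed in
 * batadv_choose_next_candidate() above is
 * tmp_max = BATADV_DAT_ADDR_MAX - dat_addr + ip_key, so for ip_key == 0x1000:
 *
 *	dat_addr 0x1000: tmp_max = 0xFFFF (best, the key itself)
 *	dat_addr 0x1005: tmp_max = 0xFFFA
 *	dat_addr 0x0FFF: tmp_max = 0x10000, truncated to 0x0000 (worst)
 *
 * i.e. maximising tmp_max picks the key's position and then its successors
 * on the ring, matching the "closest from the LEFT" rule above.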
*/ static struct batadv_dat_candidate * batadv_dat_select_candidates(struct batadv_priv *bat_priv, __be32 ip_dst, unsigned short vid) { int select; batadv_dat_addr_t last_max = BATADV_DAT_ADDR_MAX, ip_key; struct batadv_dat_candidate *res; struct batadv_dat_entry dat; if (!bat_priv->orig_hash) return NULL; res = kmalloc_array(BATADV_DAT_CANDIDATES_NUM, sizeof(*res), GFP_ATOMIC); if (!res) return NULL; dat.ip = ip_dst; dat.vid = vid; ip_key = (batadv_dat_addr_t)batadv_hash_dat(&dat, BATADV_DAT_ADDR_MAX); batadv_dbg(BATADV_DBG_DAT, bat_priv, "%s(): IP=%pI4 hash(IP)=%u\n", __func__, &ip_dst, ip_key); for (select = 0; select < BATADV_DAT_CANDIDATES_NUM; select++) batadv_choose_next_candidate(bat_priv, res, select, ip_key, &last_max); return res; } /** * batadv_dat_forward_data() - copy and send payload to the selected candidates * @bat_priv: the bat priv with all the mesh interface information * @skb: payload to send * @ip: the DHT key * @vid: VLAN identifier * @packet_subtype: unicast4addr packet subtype to use * * This function copies the skb with pskb_copy_for_clone() and sends one copy * as a unicast 4addr packet to each of the selected candidates. * * Return: true if the packet is sent to at least one candidate, false * otherwise. */ static bool batadv_dat_forward_data(struct batadv_priv *bat_priv, struct sk_buff *skb, __be32 ip, unsigned short vid, int packet_subtype) { int i; bool ret = false; int send_status; struct batadv_neigh_node *neigh_node = NULL; struct sk_buff *tmp_skb; struct batadv_dat_candidate *cand; cand = batadv_dat_select_candidates(bat_priv, ip, vid); if (!cand) return ret; batadv_dbg(BATADV_DBG_DAT, bat_priv, "DHT_SEND for %pI4\n", &ip); for (i = 0; i < BATADV_DAT_CANDIDATES_NUM; i++) { if (cand[i].type == BATADV_DAT_CANDIDATE_NOT_FOUND) continue; neigh_node = batadv_orig_router_get(cand[i].orig_node, BATADV_IF_DEFAULT); if (!neigh_node) goto free_orig; tmp_skb = pskb_copy_for_clone(skb, GFP_ATOMIC); if (!batadv_send_skb_prepare_unicast_4addr(bat_priv, tmp_skb, cand[i].orig_node, packet_subtype)) { kfree_skb(tmp_skb); goto free_neigh; } send_status = batadv_send_unicast_skb(tmp_skb, neigh_node); if (send_status == NET_XMIT_SUCCESS) { /* count the sent packet */ switch (packet_subtype) { case BATADV_P_DAT_DHT_GET: batadv_inc_counter(bat_priv, BATADV_CNT_DAT_GET_TX); break; case BATADV_P_DAT_DHT_PUT: batadv_inc_counter(bat_priv, BATADV_CNT_DAT_PUT_TX); break; } /* packet sent to a candidate: return true */ ret = true; } free_neigh: batadv_neigh_node_put(neigh_node); free_orig: batadv_orig_node_put(cand[i].orig_node); } kfree(cand); return ret; } /** * batadv_dat_tvlv_container_update() - update the dat tvlv container after dat * setting change * @bat_priv: the bat priv with all the mesh interface information */ static void batadv_dat_tvlv_container_update(struct batadv_priv *bat_priv) { char dat_mode; dat_mode = atomic_read(&bat_priv->distributed_arp_table); switch (dat_mode) { case 0: batadv_tvlv_container_unregister(bat_priv, BATADV_TVLV_DAT, 1); break; case 1: batadv_tvlv_container_register(bat_priv, BATADV_TVLV_DAT, 1, NULL, 0); break; } } /** * batadv_dat_status_update() - update the dat tvlv container after dat * setting change * @net_dev: the mesh interface net device */ void batadv_dat_status_update(struct net_device *net_dev) { struct batadv_priv *bat_priv = netdev_priv(net_dev); batadv_dat_tvlv_container_update(bat_priv); } /** * batadv_dat_tvlv_ogm_handler_v1() - process incoming dat tvlv container * @bat_priv: the bat priv with all the mesh interface information * @orig: the
orig_node of the ogm * @flags: flags indicating the tvlv state (see batadv_tvlv_handler_flags) * @tvlv_value: tvlv buffer containing the DAT data * @tvlv_value_len: tvlv buffer length */ static void batadv_dat_tvlv_ogm_handler_v1(struct batadv_priv *bat_priv, struct batadv_orig_node *orig, u8 flags, void *tvlv_value, u16 tvlv_value_len) { if (flags & BATADV_TVLV_HANDLER_OGM_CIFNOTFND) clear_bit(BATADV_ORIG_CAPA_HAS_DAT, &orig->capabilities); else set_bit(BATADV_ORIG_CAPA_HAS_DAT, &orig->capabilities); } /** * batadv_dat_hash_free() - free the local DAT hash table * @bat_priv: the bat priv with all the mesh interface information */ static void batadv_dat_hash_free(struct batadv_priv *bat_priv) { if (!bat_priv->dat.hash) return; __batadv_dat_purge(bat_priv, NULL); batadv_hash_destroy(bat_priv->dat.hash); bat_priv->dat.hash = NULL; } /** * batadv_dat_init() - initialise the DAT internals * @bat_priv: the bat priv with all the mesh interface information * * Return: 0 in case of success, a negative error code otherwise */ int batadv_dat_init(struct batadv_priv *bat_priv) { if (bat_priv->dat.hash) return 0; bat_priv->dat.hash = batadv_hash_new(1024); if (!bat_priv->dat.hash) return -ENOMEM; INIT_DELAYED_WORK(&bat_priv->dat.work, batadv_dat_purge); batadv_dat_start_timer(bat_priv); batadv_tvlv_handler_register(bat_priv, batadv_dat_tvlv_ogm_handler_v1, NULL, NULL, BATADV_TVLV_DAT, 1, BATADV_TVLV_HANDLER_OGM_CIFNOTFND); batadv_dat_tvlv_container_update(bat_priv); return 0; } /** * batadv_dat_free() - free the DAT internals * @bat_priv: the bat priv with all the mesh interface information */ void batadv_dat_free(struct batadv_priv *bat_priv) { batadv_tvlv_container_unregister(bat_priv, BATADV_TVLV_DAT, 1); batadv_tvlv_handler_unregister(bat_priv, BATADV_TVLV_DAT, 1); cancel_delayed_work_sync(&bat_priv->dat.work); batadv_dat_hash_free(bat_priv); } /** * batadv_dat_cache_dump_entry() - dump one entry of the DAT cache table to a * netlink socket * @msg: buffer for the message * @portid: netlink port * @cb: Control block containing additional options * @dat_entry: entry to dump * * Return: 0 or error code. */ static int batadv_dat_cache_dump_entry(struct sk_buff *msg, u32 portid, struct netlink_callback *cb, struct batadv_dat_entry *dat_entry) { int msecs; void *hdr; hdr = genlmsg_put(msg, portid, cb->nlh->nlmsg_seq, &batadv_netlink_family, NLM_F_MULTI, BATADV_CMD_GET_DAT_CACHE); if (!hdr) return -ENOBUFS; genl_dump_check_consistent(cb, hdr); msecs = jiffies_to_msecs(jiffies - dat_entry->last_update); if (nla_put_in_addr(msg, BATADV_ATTR_DAT_CACHE_IP4ADDRESS, dat_entry->ip) || nla_put(msg, BATADV_ATTR_DAT_CACHE_HWADDRESS, ETH_ALEN, dat_entry->mac_addr) || nla_put_u16(msg, BATADV_ATTR_DAT_CACHE_VID, dat_entry->vid) || nla_put_u32(msg, BATADV_ATTR_LAST_SEEN_MSECS, msecs)) { genlmsg_cancel(msg, hdr); return -EMSGSIZE; } genlmsg_end(msg, hdr); return 0; } /** * batadv_dat_cache_dump_bucket() - dump one bucket of the DAT cache table to * a netlink socket * @msg: buffer for the message * @portid: netlink port * @cb: Control block containing additional options * @hash: hash to dump * @bucket: bucket index to dump * @idx_skip: How many entries to skip * * Return: 0 or error code.
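 *
 * Editorial note (editor-added): the "atomic_read(&hash->generation) << 1 | 1"
 * assignment to cb->seq below pairs with genl_dump_check_consistent() in
 * batadv_dat_cache_dump_entry() above: if the hash table mutates between dump
 * passes, the changed sequence number lets the netlink core mark the affected
 * messages with NLM_F_DUMP_INTR. Keeping the low bit set guarantees a
 * non-zero value, since a zero sequence number would disable the consistency
 * check entirely.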
*/ static int batadv_dat_cache_dump_bucket(struct sk_buff *msg, u32 portid, struct netlink_callback *cb, struct batadv_hashtable *hash, unsigned int bucket, int *idx_skip) { struct batadv_dat_entry *dat_entry; int idx = 0; spin_lock_bh(&hash->list_locks[bucket]); cb->seq = atomic_read(&hash->generation) << 1 | 1; hlist_for_each_entry(dat_entry, &hash->table[bucket], hash_entry) { if (idx < *idx_skip) goto skip; if (batadv_dat_cache_dump_entry(msg, portid, cb, dat_entry)) { spin_unlock_bh(&hash->list_locks[bucket]); *idx_skip = idx; return -EMSGSIZE; } skip: idx++; } spin_unlock_bh(&hash->list_locks[bucket]); return 0; } /** * batadv_dat_cache_dump() - dump DAT cache table to a netlink socket * @msg: buffer for the message * @cb: callback structure containing arguments * * Return: message length. */ int batadv_dat_cache_dump(struct sk_buff *msg, struct netlink_callback *cb) { struct batadv_hard_iface *primary_if = NULL; int portid = NETLINK_CB(cb->skb).portid; struct net_device *mesh_iface; struct batadv_hashtable *hash; struct batadv_priv *bat_priv; int bucket = cb->args[0]; int idx = cb->args[1]; int ret = 0; mesh_iface = batadv_netlink_get_meshif(cb); if (IS_ERR(mesh_iface)) return PTR_ERR(mesh_iface); bat_priv = netdev_priv(mesh_iface); hash = bat_priv->dat.hash; primary_if = batadv_primary_if_get_selected(bat_priv); if (!primary_if || primary_if->if_status != BATADV_IF_ACTIVE) { ret = -ENOENT; goto out; } while (bucket < hash->size) { if (batadv_dat_cache_dump_bucket(msg, portid, cb, hash, bucket, &idx)) break; bucket++; idx = 0; } cb->args[0] = bucket; cb->args[1] = idx; ret = msg->len; out: batadv_hardif_put(primary_if); dev_put(mesh_iface); return ret; } /** * batadv_arp_get_type() - parse an ARP packet and get its type * @bat_priv: the bat priv with all the mesh interface information * @skb: packet to analyse * @hdr_size: size of the possible header before the ARP packet in the skb * * Return: the ARP type if the skb contains a valid ARP packet, 0 otherwise. */ static u16 batadv_arp_get_type(struct batadv_priv *bat_priv, struct sk_buff *skb, int hdr_size) { struct arphdr *arphdr; struct ethhdr *ethhdr; __be32 ip_src, ip_dst; u8 *hw_src, *hw_dst; u16 type = 0; /* pull the ethernet header */ if (unlikely(!pskb_may_pull(skb, hdr_size + ETH_HLEN))) goto out; ethhdr = (struct ethhdr *)(skb->data + hdr_size); if (ethhdr->h_proto != htons(ETH_P_ARP)) goto out; /* pull the ARP payload */ if (unlikely(!pskb_may_pull(skb, hdr_size + ETH_HLEN + arp_hdr_len(skb->dev)))) goto out; arphdr = (struct arphdr *)(skb->data + hdr_size + ETH_HLEN); /* check whether the ARP packet carries valid IP information */ if (arphdr->ar_hrd != htons(ARPHRD_ETHER)) goto out; if (arphdr->ar_pro != htons(ETH_P_IP)) goto out; if (arphdr->ar_hln != ETH_ALEN) goto out; if (arphdr->ar_pln != 4) goto out; /* Check for bad reply/request.
If the ARP message is not sane, DAT * will simply ignore it */ ip_src = batadv_arp_ip_src(skb, hdr_size); ip_dst = batadv_arp_ip_dst(skb, hdr_size); if (ipv4_is_loopback(ip_src) || ipv4_is_multicast(ip_src) || ipv4_is_loopback(ip_dst) || ipv4_is_multicast(ip_dst) || ipv4_is_zeronet(ip_src) || ipv4_is_lbcast(ip_src) || ipv4_is_zeronet(ip_dst) || ipv4_is_lbcast(ip_dst)) goto out; hw_src = batadv_arp_hw_src(skb, hdr_size); if (is_zero_ether_addr(hw_src) || is_multicast_ether_addr(hw_src)) goto out; /* don't care about the destination MAC address in ARP requests */ if (arphdr->ar_op != htons(ARPOP_REQUEST)) { hw_dst = batadv_arp_hw_dst(skb, hdr_size); if (is_zero_ether_addr(hw_dst) || is_multicast_ether_addr(hw_dst)) goto out; } type = ntohs(arphdr->ar_op); out: return type; } /** * batadv_dat_get_vid() - extract the VLAN identifier from skb if any * @skb: the buffer containing the packet to extract the VID from * @hdr_size: the size of the batman-adv header encapsulating the packet * * Return: If the packet embedded in the skb is vlan tagged this function * returns the VID with the BATADV_VLAN_HAS_TAG flag. Otherwise BATADV_NO_FLAGS * is returned. */ static unsigned short batadv_dat_get_vid(struct sk_buff *skb, int *hdr_size) { unsigned short vid; vid = batadv_get_vid(skb, *hdr_size); /* The ARP parsing functions skip ahead by hdr_size + ETH_HLEN. * If the packet carries a VLAN header (which is longer), hdr_size is * updated so that those functions still skip the correct number of * bytes. */ if (vid & BATADV_VLAN_HAS_TAG) *hdr_size += VLAN_HLEN; return vid; } /** * batadv_dat_arp_create_reply() - create an ARP Reply * @bat_priv: the bat priv with all the mesh interface information * @ip_src: ARP sender IP * @ip_dst: ARP target IP * @hw_src: Ethernet source and ARP sender MAC * @hw_dst: Ethernet destination and ARP target MAC * @vid: VLAN identifier (optional, set to zero otherwise) * * Creates an ARP Reply from the given values, optionally encapsulated in a * VLAN header. * * Return: An skb containing an ARP Reply. */ static struct sk_buff * batadv_dat_arp_create_reply(struct batadv_priv *bat_priv, __be32 ip_src, __be32 ip_dst, u8 *hw_src, u8 *hw_dst, unsigned short vid) { struct sk_buff *skb; skb = arp_create(ARPOP_REPLY, ETH_P_ARP, ip_dst, bat_priv->mesh_iface, ip_src, hw_dst, hw_src, hw_dst); if (!skb) return NULL; skb_reset_mac_header(skb); if (vid & BATADV_VLAN_HAS_TAG) skb = vlan_insert_tag(skb, htons(ETH_P_8021Q), vid & VLAN_VID_MASK); return skb; } /** * batadv_dat_snoop_outgoing_arp_request() - snoop the ARP request and try to * answer using DAT * @bat_priv: the bat priv with all the mesh interface information * @skb: packet to check * * Return: true if the message has been sent to the dht candidates, false * otherwise. In case of a positive return value the message has to be enqueued * to permit the fallback.
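 *
 * Editorial note (editor-added): the "fallback" is handled by the caller, not
 * here. In batman-adv's transmit path the original request is still queued as
 * a (delayed) broadcast even when this function returns true; if the DHT
 * answer arrives first, batadv_dat_drop_broadcast_packet() further below can
 * then cancel the flood. The exact caller wiring is assumed, as it lives
 * outside this file.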
*/ bool batadv_dat_snoop_outgoing_arp_request(struct batadv_priv *bat_priv, struct sk_buff *skb) { u16 type = 0; __be32 ip_dst, ip_src; u8 *hw_src; bool ret = false; struct batadv_dat_entry *dat_entry = NULL; struct sk_buff *skb_new; struct net_device *mesh_iface = bat_priv->mesh_iface; int hdr_size = 0; unsigned short vid; if (!atomic_read(&bat_priv->distributed_arp_table)) goto out; vid = batadv_dat_get_vid(skb, &hdr_size); type = batadv_arp_get_type(bat_priv, skb, hdr_size); /* If the node gets an ARP_REQUEST it has to send a DHT_GET unicast * message to the selected DHT candidates */ if (type != ARPOP_REQUEST) goto out; batadv_dbg_arp(bat_priv, skb, hdr_size, "Parsing outgoing ARP REQUEST"); ip_src = batadv_arp_ip_src(skb, hdr_size); hw_src = batadv_arp_hw_src(skb, hdr_size); ip_dst = batadv_arp_ip_dst(skb, hdr_size); batadv_dat_entry_add(bat_priv, ip_src, hw_src, vid); dat_entry = batadv_dat_entry_hash_find(bat_priv, ip_dst, vid); if (dat_entry) { /* If the ARP request is destined for a local client the local * client will answer itself. DAT would only generate a * duplicate packet. * * Moreover, if the mesh-interface is enslaved in a bridge, an * additional DAT answer may trigger kernel warnings about * a packet coming from the wrong port. */ if (batadv_is_my_client(bat_priv, dat_entry->mac_addr, vid)) { ret = true; goto out; } /* If BLA is enabled, only send ARP replies if we have claimed * the destination for the ARP request or if no other backbone * gw belonging to our backbone has claimed the destination. */ if (!batadv_bla_check_claim(bat_priv, dat_entry->mac_addr, vid)) { batadv_dbg(BATADV_DBG_DAT, bat_priv, "Device %pM claimed by another backbone gw. Don't send ARP reply!\n", dat_entry->mac_addr); ret = true; goto out; } skb_new = batadv_dat_arp_create_reply(bat_priv, ip_dst, ip_src, dat_entry->mac_addr, hw_src, vid); if (!skb_new) goto out; skb_new->protocol = eth_type_trans(skb_new, mesh_iface); batadv_inc_counter(bat_priv, BATADV_CNT_RX); batadv_add_counter(bat_priv, BATADV_CNT_RX_BYTES, skb->len + ETH_HLEN + hdr_size); netif_rx(skb_new); batadv_dbg(BATADV_DBG_DAT, bat_priv, "ARP request replied locally\n"); ret = true; } else { /* Send the request to the DHT */ ret = batadv_dat_forward_data(bat_priv, skb, ip_dst, vid, BATADV_P_DAT_DHT_GET); } out: batadv_dat_entry_put(dat_entry); return ret; } /** * batadv_dat_snoop_incoming_arp_request() - snoop the ARP request and try to * answer using the local DAT storage * @bat_priv: the bat priv with all the mesh interface information * @skb: packet to check * @hdr_size: size of the encapsulation header * * Return: true if the request has been answered, false otherwise.
*/ bool batadv_dat_snoop_incoming_arp_request(struct batadv_priv *bat_priv, struct sk_buff *skb, int hdr_size) { u16 type; __be32 ip_src, ip_dst; u8 *hw_src; struct sk_buff *skb_new; struct batadv_dat_entry *dat_entry = NULL; bool ret = false; unsigned short vid; int err; if (!atomic_read(&bat_priv->distributed_arp_table)) goto out; vid = batadv_dat_get_vid(skb, &hdr_size); type = batadv_arp_get_type(bat_priv, skb, hdr_size); if (type != ARPOP_REQUEST) goto out; hw_src = batadv_arp_hw_src(skb, hdr_size); ip_src = batadv_arp_ip_src(skb, hdr_size); ip_dst = batadv_arp_ip_dst(skb, hdr_size); batadv_dbg_arp(bat_priv, skb, hdr_size, "Parsing incoming ARP REQUEST"); batadv_dat_entry_add(bat_priv, ip_src, hw_src, vid); dat_entry = batadv_dat_entry_hash_find(bat_priv, ip_dst, vid); if (!dat_entry) goto out; skb_new = batadv_dat_arp_create_reply(bat_priv, ip_dst, ip_src, dat_entry->mac_addr, hw_src, vid); if (!skb_new) goto out; /* To preserve backwards compatibility, the node has to choose the * outgoing format based on the incoming request packet type. The * assumption is that a node not using the 4addr packet format doesn't * support it. */ if (hdr_size == sizeof(struct batadv_unicast_4addr_packet)) err = batadv_send_skb_via_tt_4addr(bat_priv, skb_new, BATADV_P_DAT_CACHE_REPLY, NULL, vid); else err = batadv_send_skb_via_tt(bat_priv, skb_new, NULL, vid); if (err != NET_XMIT_DROP) { batadv_inc_counter(bat_priv, BATADV_CNT_DAT_CACHED_REPLY_TX); ret = true; } out: batadv_dat_entry_put(dat_entry); if (ret) kfree_skb(skb); return ret; } /** * batadv_dat_snoop_outgoing_arp_reply() - snoop the ARP reply and fill the DHT * @bat_priv: the bat priv with all the mesh interface information * @skb: packet to check */ void batadv_dat_snoop_outgoing_arp_reply(struct batadv_priv *bat_priv, struct sk_buff *skb) { u16 type; __be32 ip_src, ip_dst; u8 *hw_src, *hw_dst; int hdr_size = 0; unsigned short vid; if (!atomic_read(&bat_priv->distributed_arp_table)) return; vid = batadv_dat_get_vid(skb, &hdr_size); type = batadv_arp_get_type(bat_priv, skb, hdr_size); if (type != ARPOP_REPLY) return; batadv_dbg_arp(bat_priv, skb, hdr_size, "Parsing outgoing ARP REPLY"); hw_src = batadv_arp_hw_src(skb, hdr_size); ip_src = batadv_arp_ip_src(skb, hdr_size); hw_dst = batadv_arp_hw_dst(skb, hdr_size); ip_dst = batadv_arp_ip_dst(skb, hdr_size); batadv_dat_entry_add(bat_priv, ip_src, hw_src, vid); batadv_dat_entry_add(bat_priv, ip_dst, hw_dst, vid); /* Send the ARP reply to the candidates for both the IP addresses that * the node obtained from the ARP reply */ batadv_dat_forward_data(bat_priv, skb, ip_src, vid, BATADV_P_DAT_DHT_PUT); batadv_dat_forward_data(bat_priv, skb, ip_dst, vid, BATADV_P_DAT_DHT_PUT); } /** * batadv_dat_snoop_incoming_arp_reply() - snoop the ARP reply and fill the * local DAT storage only * @bat_priv: the bat priv with all the mesh interface information * @skb: packet to check * @hdr_size: size of the encapsulation header * * Return: true if the packet was snooped and consumed by DAT.
False if the * packet has to be delivered to the interface */ bool batadv_dat_snoop_incoming_arp_reply(struct batadv_priv *bat_priv, struct sk_buff *skb, int hdr_size) { struct batadv_dat_entry *dat_entry = NULL; u16 type; __be32 ip_src, ip_dst; u8 *hw_src, *hw_dst; bool dropped = false; unsigned short vid; if (!atomic_read(&bat_priv->distributed_arp_table)) goto out; vid = batadv_dat_get_vid(skb, &hdr_size); type = batadv_arp_get_type(bat_priv, skb, hdr_size); if (type != ARPOP_REPLY) goto out; batadv_dbg_arp(bat_priv, skb, hdr_size, "Parsing incoming ARP REPLY"); hw_src = batadv_arp_hw_src(skb, hdr_size); ip_src = batadv_arp_ip_src(skb, hdr_size); hw_dst = batadv_arp_hw_dst(skb, hdr_size); ip_dst = batadv_arp_ip_dst(skb, hdr_size); /* If ip_src is already in the cache with the right mac address, drop * this frame: it is most probably an ARP reply generated by another * node of the DHT, and we have most probably already received the * real reply earlier. Delivering this frame as well would result in * the ARP reply being received twice. */ dat_entry = batadv_dat_entry_hash_find(bat_priv, ip_src, vid); if (dat_entry && batadv_compare_eth(hw_src, dat_entry->mac_addr)) { batadv_dbg(BATADV_DBG_DAT, bat_priv, "Doubled ARP reply removed: ARP MSG = [src: %pM-%pI4 dst: %pM-%pI4]; dat_entry: %pM-%pI4\n", hw_src, &ip_src, hw_dst, &ip_dst, dat_entry->mac_addr, &dat_entry->ip); dropped = true; } /* Update our internal cache with both the IP addresses the node got * within the ARP reply */ batadv_dat_entry_add(bat_priv, ip_src, hw_src, vid); batadv_dat_entry_add(bat_priv, ip_dst, hw_dst, vid); if (dropped) goto out; /* If BLA is enabled, only forward ARP replies if we have claimed the * source of the ARP reply or if no one else of the same backbone has * already claimed that client. This prevents different gateways of the * same backbone from all forwarding the ARP reply, which would lead to * multiple replies in the backbone. */ if (!batadv_bla_check_claim(bat_priv, hw_src, vid)) { batadv_dbg(BATADV_DBG_DAT, bat_priv, "Device %pM claimed by another backbone gw. Drop ARP reply.\n", hw_src); dropped = true; goto out; } /* if this REPLY is directed to a client of mine, let's deliver the * packet to the interface */ dropped = !batadv_is_my_client(bat_priv, hw_dst, vid); /* if this REPLY is sent on behalf of a client of mine, let's drop the * packet because the client will reply by itself */ dropped |= batadv_is_my_client(bat_priv, hw_src, vid); out: if (dropped) kfree_skb(skb); batadv_dat_entry_put(dat_entry); /* if dropped == false -> deliver to the interface */ return dropped; } /** * batadv_dat_check_dhcp_ipudp() - check skb for IP+UDP headers valid for DHCP * @skb: the packet to check * @ip_src: a buffer to store the IPv4 source address in * * Checks whether the given skb has an IP and UDP header valid for a DHCP * message from a DHCP server and, if so, stores the IPv4 source address in * the provided buffer. * * Return: True if valid, false otherwise.
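 *
 * Editorial note (editor-added): the udphdr->source == htons(67) test below
 * keys on the well-known BOOTP/DHCP server port, so only server-originated
 * messages pass. skb_header_pointer() either returns a pointer straight into
 * the linear skb data or copies the requested bytes into the caller-supplied
 * backup storage (_iphdr/_udphdr here) and returns that, which makes the
 * parsing safe on non-linear skbs as well.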
*/ static bool batadv_dat_check_dhcp_ipudp(struct sk_buff *skb, __be32 *ip_src) { unsigned int offset = skb_network_offset(skb); struct udphdr *udphdr, _udphdr; struct iphdr *iphdr, _iphdr; iphdr = skb_header_pointer(skb, offset, sizeof(_iphdr), &_iphdr); if (!iphdr || iphdr->version != 4 || iphdr->ihl * 4 < sizeof(_iphdr)) return false; if (iphdr->protocol != IPPROTO_UDP) return false; offset += iphdr->ihl * 4; skb_set_transport_header(skb, offset); udphdr = skb_header_pointer(skb, offset, sizeof(_udphdr), &_udphdr); if (!udphdr || udphdr->source != htons(67)) return false; *ip_src = get_unaligned(&iphdr->saddr); return true; } /** * batadv_dat_check_dhcp() - examine packet for valid DHCP message * @skb: the packet to check * @proto: ethernet protocol hint (behind a potential vlan) * @ip_src: a buffer to store the IPv4 source address in * * Checks whether the given skb is a valid DHCP packet. And if so, stores the * IPv4 source address in the provided buffer. * * Caller needs to ensure that the skb network header is set correctly. * * Return: If skb is a valid DHCP packet, then returns its op code * (e.g. BOOTREPLY vs. BOOTREQUEST). Otherwise returns -EINVAL. */ static int batadv_dat_check_dhcp(struct sk_buff *skb, __be16 proto, __be32 *ip_src) { __be32 *magic, _magic; unsigned int offset; struct { __u8 op; __u8 htype; __u8 hlen; __u8 hops; } *dhcp_h, _dhcp_h; if (proto != htons(ETH_P_IP)) return -EINVAL; if (!batadv_dat_check_dhcp_ipudp(skb, ip_src)) return -EINVAL; offset = skb_transport_offset(skb) + sizeof(struct udphdr); if (skb->len < offset + sizeof(struct batadv_dhcp_packet)) return -EINVAL; dhcp_h = skb_header_pointer(skb, offset, sizeof(_dhcp_h), &_dhcp_h); if (!dhcp_h || dhcp_h->htype != BATADV_HTYPE_ETHERNET || dhcp_h->hlen != ETH_ALEN) return -EINVAL; offset += offsetof(struct batadv_dhcp_packet, magic); magic = skb_header_pointer(skb, offset, sizeof(_magic), &_magic); if (!magic || get_unaligned(magic) != htonl(BATADV_DHCP_MAGIC)) return -EINVAL; return dhcp_h->op; } /** * batadv_dat_get_dhcp_message_type() - get message type of a DHCP packet * @skb: the DHCP packet to parse * * Iterates over the DHCP options of the given DHCP packet to find a * DHCP Message Type option and parse it. * * Caller needs to ensure that the given skb is a valid DHCP packet and * that the skb transport header is set correctly. * * Return: The found DHCP message type value, if found. -EINVAL otherwise. */ static int batadv_dat_get_dhcp_message_type(struct sk_buff *skb) { unsigned int offset = skb_transport_offset(skb) + sizeof(struct udphdr); u8 *type, _type; struct { u8 type; u8 len; } *tl, _tl; offset += sizeof(struct batadv_dhcp_packet); while ((tl = skb_header_pointer(skb, offset, sizeof(_tl), &_tl))) { if (tl->type == BATADV_DHCP_OPT_MSG_TYPE) break; if (tl->type == BATADV_DHCP_OPT_END) break; if (tl->type == BATADV_DHCP_OPT_PAD) offset++; else offset += tl->len + sizeof(_tl); } /* Option Overload Code not supported */ if (!tl || tl->type != BATADV_DHCP_OPT_MSG_TYPE || tl->len != sizeof(_type)) return -EINVAL; offset += sizeof(_tl); type = skb_header_pointer(skb, offset, sizeof(_type), &_type); if (!type) return -EINVAL; return *type; } /** * batadv_dat_dhcp_get_yiaddr() - get yiaddr from a DHCP packet * @skb: the DHCP packet to parse * @buf: a buffer to store the yiaddr in * * Caller needs to ensure that the given skb is a valid DHCP packet and * that the skb transport header is set correctly. * * Return: True on success, false otherwise. 
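 *
 * Editorial note (editor-added): skb_header_pointer() may return either a
 * pointer into the skb or the @buf passed to it; the "if (yiaddr != buf)"
 * check below copies via get_unaligned() only in the former case, because
 * the in-skb DHCP payload carries no alignment guarantee.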
*/ static bool batadv_dat_dhcp_get_yiaddr(struct sk_buff *skb, __be32 *buf) { unsigned int offset = skb_transport_offset(skb) + sizeof(struct udphdr); __be32 *yiaddr; offset += offsetof(struct batadv_dhcp_packet, yiaddr); yiaddr = skb_header_pointer(skb, offset, BATADV_DHCP_YIADDR_LEN, buf); if (!yiaddr) return false; if (yiaddr != buf) *buf = get_unaligned(yiaddr); return true; } /** * batadv_dat_get_dhcp_chaddr() - get chaddr from a DHCP packet * @skb: the DHCP packet to parse * @buf: a buffer to store the chaddr in * * Caller needs to ensure that the given skb is a valid DHCP packet and * that the skb transport header is set correctly. * * Return: True on success, false otherwise */ static bool batadv_dat_get_dhcp_chaddr(struct sk_buff *skb, u8 *buf) { unsigned int offset = skb_transport_offset(skb) + sizeof(struct udphdr); u8 *chaddr; offset += offsetof(struct batadv_dhcp_packet, chaddr); chaddr = skb_header_pointer(skb, offset, BATADV_DHCP_CHADDR_LEN, buf); if (!chaddr) return false; if (chaddr != buf) memcpy(buf, chaddr, BATADV_DHCP_CHADDR_LEN); return true; } /** * batadv_dat_put_dhcp() - puts addresses from a DHCP packet into the DHT and * DAT cache * @bat_priv: the bat priv with all the mesh interface information * @chaddr: the DHCP client MAC address * @yiaddr: the DHCP client IP address * @hw_dst: the DHCP server MAC address * @ip_dst: the DHCP server IP address * @vid: VLAN identifier * * Adds given MAC/IP pairs to the local DAT cache and propagates them further * into the DHT. * * For the DHT propagation, client MAC + IP will appear as the ARP Reply * transmitter (and hw_dst/ip_dst as the target). */ static void batadv_dat_put_dhcp(struct batadv_priv *bat_priv, u8 *chaddr, __be32 yiaddr, u8 *hw_dst, __be32 ip_dst, unsigned short vid) { struct sk_buff *skb; skb = batadv_dat_arp_create_reply(bat_priv, yiaddr, ip_dst, chaddr, hw_dst, vid); if (!skb) return; skb_set_network_header(skb, ETH_HLEN); batadv_dat_entry_add(bat_priv, yiaddr, chaddr, vid); batadv_dat_entry_add(bat_priv, ip_dst, hw_dst, vid); batadv_dat_forward_data(bat_priv, skb, yiaddr, vid, BATADV_P_DAT_DHT_PUT); batadv_dat_forward_data(bat_priv, skb, ip_dst, vid, BATADV_P_DAT_DHT_PUT); consume_skb(skb); batadv_dbg(BATADV_DBG_DAT, bat_priv, "Snooped from outgoing DHCPACK (server address): %pI4, %pM (vid: %i)\n", &ip_dst, hw_dst, batadv_print_vid(vid)); batadv_dbg(BATADV_DBG_DAT, bat_priv, "Snooped from outgoing DHCPACK (client address): %pI4, %pM (vid: %i)\n", &yiaddr, chaddr, batadv_print_vid(vid)); } /** * batadv_dat_check_dhcp_ack() - examine packet for valid DHCP message * @skb: the packet to check * @proto: ethernet protocol hint (behind a potential vlan) * @ip_src: a buffer to store the IPv4 source address in * @chaddr: a buffer to store the DHCP Client Hardware Address in * @yiaddr: a buffer to store the DHCP Your IP Address in * * Checks whether the given skb is a valid DHCPACK. And if so, stores the * IPv4 server source address (ip_src), client MAC address (chaddr) and client * IPv4 address (yiaddr) in the provided buffers. * * Caller needs to ensure that the skb network header is set correctly. * * Return: True if the skb is a valid DHCPACK. False otherwise. 
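 *
 * Editorial note (editor-added): the validation runs in two stages. First
 * batadv_dat_check_dhcp() must report BOOTREPLY (the BOOTP op code 2 used by
 * all server-to-client messages), then the DHCP Message Type option must be
 * DHCPACK; only then are yiaddr and chaddr extracted.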
*/ static bool batadv_dat_check_dhcp_ack(struct sk_buff *skb, __be16 proto, __be32 *ip_src, u8 *chaddr, __be32 *yiaddr) { int type; type = batadv_dat_check_dhcp(skb, proto, ip_src); if (type != BATADV_BOOTREPLY) return false; type = batadv_dat_get_dhcp_message_type(skb); if (type != BATADV_DHCPACK) return false; if (!batadv_dat_dhcp_get_yiaddr(skb, yiaddr)) return false; if (!batadv_dat_get_dhcp_chaddr(skb, chaddr)) return false; return true; } /** * batadv_dat_snoop_outgoing_dhcp_ack() - snoop DHCPACK and fill DAT with it * @bat_priv: the bat priv with all the mesh interface information * @skb: the packet to snoop * @proto: ethernet protocol hint (behind a potential vlan) * @vid: VLAN identifier * * This function first checks whether the given skb is a valid DHCPACK. If * so then its source MAC and IP as well as its DHCP Client Hardware Address * field and DHCP Your IP Address field are added to the local DAT cache and * propagated into the DHT. * * Caller needs to ensure that the skb mac and network headers are set * correctly. */ void batadv_dat_snoop_outgoing_dhcp_ack(struct batadv_priv *bat_priv, struct sk_buff *skb, __be16 proto, unsigned short vid) { u8 chaddr[BATADV_DHCP_CHADDR_LEN]; __be32 ip_src, yiaddr; if (!atomic_read(&bat_priv->distributed_arp_table)) return; if (!batadv_dat_check_dhcp_ack(skb, proto, &ip_src, chaddr, &yiaddr)) return; batadv_dat_put_dhcp(bat_priv, chaddr, yiaddr, eth_hdr(skb)->h_source, ip_src, vid); } /** * batadv_dat_snoop_incoming_dhcp_ack() - snoop DHCPACK and fill DAT cache * @bat_priv: the bat priv with all the mesh interface information * @skb: the packet to snoop * @hdr_size: header size, up to the tail of the batman-adv header * * This function first checks whether the given skb is a valid DHCPACK. If * so then its source MAC and IP as well as its DHCP Client Hardware Address * field and DHCP Your IP Address field are added to the local DAT cache. */ void batadv_dat_snoop_incoming_dhcp_ack(struct batadv_priv *bat_priv, struct sk_buff *skb, int hdr_size) { u8 chaddr[BATADV_DHCP_CHADDR_LEN]; struct ethhdr *ethhdr; __be32 ip_src, yiaddr; unsigned short vid; __be16 proto; u8 *hw_src; if (!atomic_read(&bat_priv->distributed_arp_table)) return; if (unlikely(!pskb_may_pull(skb, hdr_size + ETH_HLEN))) return; ethhdr = (struct ethhdr *)(skb->data + hdr_size); skb_set_network_header(skb, hdr_size + ETH_HLEN); proto = ethhdr->h_proto; if (!batadv_dat_check_dhcp_ack(skb, proto, &ip_src, chaddr, &yiaddr)) return; hw_src = ethhdr->h_source; vid = batadv_dat_get_vid(skb, &hdr_size); batadv_dat_entry_add(bat_priv, yiaddr, chaddr, vid); batadv_dat_entry_add(bat_priv, ip_src, hw_src, vid); batadv_dbg(BATADV_DBG_DAT, bat_priv, "Snooped from incoming DHCPACK (server address): %pI4, %pM (vid: %i)\n", &ip_src, hw_src, batadv_print_vid(vid)); batadv_dbg(BATADV_DBG_DAT, bat_priv, "Snooped from incoming DHCPACK (client address): %pI4, %pM (vid: %i)\n", &yiaddr, chaddr, batadv_print_vid(vid)); } /** * batadv_dat_drop_broadcast_packet() - check if an ARP request has to be * dropped (because the node has already obtained the reply via DAT) or not * @bat_priv: the bat priv with all the mesh interface information * @forw_packet: the broadcast packet * * Return: true if the node can drop the packet, false otherwise. 
*/ bool batadv_dat_drop_broadcast_packet(struct batadv_priv *bat_priv, struct batadv_forw_packet *forw_packet) { u16 type; __be32 ip_dst; struct batadv_dat_entry *dat_entry = NULL; bool ret = false; int hdr_size = sizeof(struct batadv_bcast_packet); unsigned short vid; if (!atomic_read(&bat_priv->distributed_arp_table)) goto out; /* If this packet is an ARP_REQUEST and the node already has the * information it is going to ask for, then the packet can be dropped. * Only the initial transmission is eligible: rebroadcasts are left * alone */ if (batadv_forw_packet_is_rebroadcast(forw_packet)) goto out; vid = batadv_dat_get_vid(forw_packet->skb, &hdr_size); type = batadv_arp_get_type(bat_priv, forw_packet->skb, hdr_size); if (type != ARPOP_REQUEST) goto out; ip_dst = batadv_arp_ip_dst(forw_packet->skb, hdr_size); dat_entry = batadv_dat_entry_hash_find(bat_priv, ip_dst, vid); /* check if the node already got this entry */ if (!dat_entry) { batadv_dbg(BATADV_DBG_DAT, bat_priv, "ARP Request for %pI4: fallback\n", &ip_dst); goto out; } batadv_dbg(BATADV_DBG_DAT, bat_priv, "ARP Request for %pI4: fallback prevented\n", &ip_dst); ret = true; out: batadv_dat_entry_put(dat_entry); return ret; }
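/* Editorial aside (illustrative sketch, editor-added, not part of the original
 * source): the DAT hooks above are driven from batman-adv's transmit and
 * forwarding paths. The caller-side wiring is assumed here, not shown in this
 * file, but it looks roughly like:
 *
 *	if (batadv_dat_snoop_outgoing_arp_request(bat_priv, skb))
 *		delay = msecs_to_jiffies(ARP_REQ_DELAY); // give the DHT time
 *	batadv_add_bcast_packet_to_list(bat_priv, skb, delay, ...);
 *	...
 *	if (batadv_dat_drop_broadcast_packet(bat_priv, forw_packet))
 *		goto drop; // reply already obtained via DAT; cancel the flood
 */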
#ifndef __DRM_VMA_MANAGER_H__ #define __DRM_VMA_MANAGER_H__ /* * Copyright (c) 2013 David Herrmann <dh.herrmann@gmail.com> * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. */ #include <drm/drm_mm.h> #include <linux/mm.h> #include <linux/rbtree.h> #include <linux/spinlock.h> #include <linux/types.h> /* We make up offsets for buffer objects so we can recognize them at * mmap time.
pgoff in mmap is an unsigned long, so we need to make sure * that the faked up offset will fit */ #if BITS_PER_LONG == 64 #define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFFUL >> PAGE_SHIFT) + 1) #define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFFUL >> PAGE_SHIFT) * 256) #else #define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFUL >> PAGE_SHIFT) + 1) #define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFUL >> PAGE_SHIFT) * 16) #endif struct drm_file; struct drm_vma_offset_file { struct rb_node vm_rb; struct drm_file *vm_tag; unsigned long vm_count; }; struct drm_vma_offset_node { rwlock_t vm_lock; struct drm_mm_node vm_node; struct rb_root vm_files; void *driver_private; }; struct drm_vma_offset_manager { rwlock_t vm_lock; struct drm_mm vm_addr_space_mm; }; void drm_vma_offset_manager_init(struct drm_vma_offset_manager *mgr, unsigned long page_offset, unsigned long size); void drm_vma_offset_manager_destroy(struct drm_vma_offset_manager *mgr); struct drm_vma_offset_node *drm_vma_offset_lookup_locked(struct drm_vma_offset_manager *mgr, unsigned long start, unsigned long pages); int drm_vma_offset_add(struct drm_vma_offset_manager *mgr, struct drm_vma_offset_node *node, unsigned long pages); void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr, struct drm_vma_offset_node *node); int drm_vma_node_allow(struct drm_vma_offset_node *node, struct drm_file *tag); int drm_vma_node_allow_once(struct drm_vma_offset_node *node, struct drm_file *tag); void drm_vma_node_revoke(struct drm_vma_offset_node *node, struct drm_file *tag); bool drm_vma_node_is_allowed(struct drm_vma_offset_node *node, struct drm_file *tag); /** * drm_vma_offset_exact_lookup_locked() - Look up node by exact address * @mgr: Manager object * @start: Start address (page-based, not byte-based) * @pages: Size of object (page-based) * * Same as drm_vma_offset_lookup_locked() but does not allow any offset into the node. * It only returns the exact object with the given start address. * * RETURNS: * Node at exact start address @start. */ static inline struct drm_vma_offset_node * drm_vma_offset_exact_lookup_locked(struct drm_vma_offset_manager *mgr, unsigned long start, unsigned long pages) { struct drm_vma_offset_node *node; node = drm_vma_offset_lookup_locked(mgr, start, pages); return (node && node->vm_node.start == start) ? node : NULL; } /** * drm_vma_offset_lock_lookup() - Lock lookup for extended private use * @mgr: Manager object * * Lock VMA manager for extended lookups. Only locked VMA function calls * are allowed while holding this lock. All other contexts are blocked from VMA * until the lock is released via drm_vma_offset_unlock_lookup(). * * Use this if you need to take a reference to the objects returned by * drm_vma_offset_lookup_locked() before releasing this lock again. * * This lock must not be used for anything else than extended lookups. You must * not call any other VMA helpers while holding this lock. * * Note: You're in atomic-context while holding this lock! */ static inline void drm_vma_offset_lock_lookup(struct drm_vma_offset_manager *mgr) { read_lock(&mgr->vm_lock); } /** * drm_vma_offset_unlock_lookup() - Unlock lookup for extended private use * @mgr: Manager object * * Release lookup-lock. See drm_vma_offset_lock_lookup() for more information. */ static inline void drm_vma_offset_unlock_lookup(struct drm_vma_offset_manager *mgr) { read_unlock(&mgr->vm_lock); } /** * drm_vma_node_reset() - Initialize or reset node object * @node: Node to initialize or reset * * Reset a node to its initial state. 
This must be called before using it with * any VMA offset manager. * * This must not be called on an already allocated node, or you will leak * memory. */ static inline void drm_vma_node_reset(struct drm_vma_offset_node *node) { memset(node, 0, sizeof(*node)); node->vm_files = RB_ROOT; rwlock_init(&node->vm_lock); } /** * drm_vma_node_start() - Return start address for page-based addressing * @node: Node to inspect * * Return the start address of the given node. This can be used as offset into * the linear VM space that is provided by the VMA offset manager. Note that * this can only be used for page-based addressing. If you need a proper offset * for user-space mappings, you must apply "<< PAGE_SHIFT" or use the * drm_vma_node_offset_addr() helper instead. * * RETURNS: * Start address of @node for page-based addressing. 0 if the node does not * have an offset allocated. */ static inline unsigned long drm_vma_node_start(const struct drm_vma_offset_node *node) { return node->vm_node.start; } /** * drm_vma_node_size() - Return size (page-based) * @node: Node to inspect * * Return the size as number of pages for the given node. This is the same size * that was passed to drm_vma_offset_add(). If no offset is allocated for the * node, this is 0. * * RETURNS: * Size of @node as number of pages. 0 if the node does not have an offset * allocated. */ static inline unsigned long drm_vma_node_size(struct drm_vma_offset_node *node) { return node->vm_node.size; } /** * drm_vma_node_offset_addr() - Return sanitized offset for user-space mmaps * @node: Linked offset node * * Same as drm_vma_node_start() but returns the address as a valid offset that * can be used for user-space mappings during mmap(). * This must not be called on unlinked nodes. * * RETURNS: * Offset of @node for byte-based addressing. 0 if the node does not have an * offset allocated. */ static inline __u64 drm_vma_node_offset_addr(struct drm_vma_offset_node *node) { return ((__u64)node->vm_node.start) << PAGE_SHIFT; } /** * drm_vma_node_unmap() - Unmap offset node * @node: Offset node * @file_mapping: Address space to unmap @node from * * Unmap all userspace mappings for a given offset node. The mappings must be * associated with the @file_mapping address-space. If no offset exists * nothing is done. * * This call is unlocked. The caller must guarantee that drm_vma_offset_remove() * is not called on this node concurrently. */ static inline void drm_vma_node_unmap(struct drm_vma_offset_node *node, struct address_space *file_mapping) { if (drm_mm_node_allocated(&node->vm_node)) unmap_mapping_range(file_mapping, drm_vma_node_offset_addr(node), drm_vma_node_size(node) << PAGE_SHIFT, 1); } /** * drm_vma_node_verify_access() - Access verification helper for TTM * @node: Offset node * @tag: Tag of file to check * * This checks whether @tag is granted access to @node. It is the same as * drm_vma_node_is_allowed() but suitable as drop-in helper for TTM * verify_access() callbacks. * * RETURNS: * 0 if access is granted, -EACCES otherwise. */ static inline int drm_vma_node_verify_access(struct drm_vma_offset_node *node, struct drm_file *tag) { return drm_vma_node_is_allowed(node, tag) ? 0 : -EACCES; } #endif /* __DRM_VMA_MANAGER_H__ */
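/* Editorial aside (illustrative sketch, editor-added, not part of the original
 * header): typical driver-side usage of the offset manager, using only the
 * interfaces declared above; error handling is elided and the names
 * (mgr, node, obj_size, file_mapping) are hypothetical:
 *
 *	drm_vma_offset_manager_init(&mgr, DRM_FILE_PAGE_OFFSET_START,
 *				    DRM_FILE_PAGE_OFFSET_SIZE);
 *	drm_vma_node_reset(&node);
 *	if (drm_vma_offset_add(&mgr, &node, obj_size >> PAGE_SHIFT) == 0)
 *		offset = drm_vma_node_offset_addr(&node); // for user-space mmap()
 *	...
 *	drm_vma_node_unmap(&node, file_mapping);
 *	drm_vma_offset_remove(&mgr, &node);
 *	drm_vma_offset_manager_destroy(&mgr);
 */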
// SPDX-License-Identifier: GPL-2.0-only /* * The input core * * Copyright (c) 1999-2002 Vojtech Pavlik */ #define pr_fmt(fmt) KBUILD_BASENAME ": " fmt #include <linux/init.h> #include <linux/types.h> #include <linux/idr.h> #include <linux/input/mt.h> #include <linux/module.h> #include <linux/slab.h> #include <linux/random.h> #include <linux/major.h> #include <linux/proc_fs.h> #include <linux/sched.h> #include <linux/seq_file.h> #include
<linux/pm.h> #include <linux/poll.h> #include <linux/device.h> #include <linux/kstrtox.h> #include <linux/mutex.h> #include <linux/rcupdate.h> #include "input-compat.h" #include "input-core-private.h" #include "input-poller.h" MODULE_AUTHOR("Vojtech Pavlik <vojtech@suse.cz>"); MODULE_DESCRIPTION("Input core"); MODULE_LICENSE("GPL"); #define INPUT_MAX_CHAR_DEVICES 1024 #define INPUT_FIRST_DYNAMIC_DEV 256 static DEFINE_IDA(input_ida); static LIST_HEAD(input_dev_list); static LIST_HEAD(input_handler_list); /* * input_mutex protects access to both input_dev_list and input_handler_list. * This also causes input_[un]register_device and input_[un]register_handler * to be mutually exclusive which simplifies locking in drivers implementing * input handlers. */ static DEFINE_MUTEX(input_mutex); static const struct input_value input_value_sync = { EV_SYN, SYN_REPORT, 1 }; static const unsigned int input_max_code[EV_CNT] = { [EV_KEY] = KEY_MAX, [EV_REL] = REL_MAX, [EV_ABS] = ABS_MAX, [EV_MSC] = MSC_MAX, [EV_SW] = SW_MAX, [EV_LED] = LED_MAX, [EV_SND] = SND_MAX, [EV_FF] = FF_MAX, }; static inline int is_event_supported(unsigned int code, unsigned long *bm, unsigned int max) { return code <= max && test_bit(code, bm); } static int input_defuzz_abs_event(int value, int old_val, int fuzz) { if (fuzz) { if (value > old_val - fuzz / 2 && value < old_val + fuzz / 2) return old_val; if (value > old_val - fuzz && value < old_val + fuzz) return (old_val * 3 + value) / 4; if (value > old_val - fuzz * 2 && value < old_val + fuzz * 2) return (old_val + value) / 2; } return value; } static void input_start_autorepeat(struct input_dev *dev, int code) { if (test_bit(EV_REP, dev->evbit) && dev->rep[REP_PERIOD] && dev->rep[REP_DELAY] && dev->timer.function) { dev->repeat_key = code; mod_timer(&dev->timer, jiffies + msecs_to_jiffies(dev->rep[REP_DELAY])); } } static void input_stop_autorepeat(struct input_dev *dev) { timer_delete(&dev->timer); } /* * Pass values first through all filters and then, if event has not been * filtered out, through all open handles. This order is achieved by placing * filters at the head of the list of handles attached to the device, and * placing regular handles at the tail of the list. * * This function is called with dev->event_lock held and interrupts disabled. */ static void input_pass_values(struct input_dev *dev, struct input_value *vals, unsigned int count) { struct input_handle *handle; struct input_value *v; lockdep_assert_held(&dev->event_lock); scoped_guard(rcu) { handle = rcu_dereference(dev->grab); if (handle) { count = handle->handle_events(handle, vals, count); break; } list_for_each_entry_rcu(handle, &dev->h_list, d_node) { if (handle->open) { count = handle->handle_events(handle, vals, count); if (!count) break; } } } /* trigger auto repeat for key events */ if (test_bit(EV_REP, dev->evbit) && test_bit(EV_KEY, dev->evbit)) { for (v = vals; v != vals + count; v++) { if (v->type == EV_KEY && v->value != 2) { if (v->value) input_start_autorepeat(dev, v->code); else input_stop_autorepeat(dev); } } } } #define INPUT_IGNORE_EVENT 0 #define INPUT_PASS_TO_HANDLERS 1 #define INPUT_PASS_TO_DEVICE 2 #define INPUT_SLOT 4 #define INPUT_FLUSH 8 #define INPUT_PASS_TO_ALL (INPUT_PASS_TO_HANDLERS | INPUT_PASS_TO_DEVICE) static int input_handle_abs_event(struct input_dev *dev, unsigned int code, int *pval) { struct input_mt *mt = dev->mt; bool is_new_slot = false; bool is_mt_event; int *pold; if (code == ABS_MT_SLOT) { /* * "Stage" the event; we'll flush it later, when we * get actual touch data.
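 * The slot number is only remembered ("staged") at this point;
 * input_event_dispose() re-emits it via the INPUT_SLOT disposition
 * together with the next ABS_MT_* value for that slot.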
*/ if (mt && *pval >= 0 && *pval < mt->num_slots) mt->slot = *pval; return INPUT_IGNORE_EVENT; } is_mt_event = input_is_mt_value(code); if (!is_mt_event) { pold = &dev->absinfo[code].value; } else if (mt) { pold = &mt->slots[mt->slot].abs[code - ABS_MT_FIRST]; is_new_slot = mt->slot != dev->absinfo[ABS_MT_SLOT].value; } else { /* * Bypass filtering for multi-touch events when * not employing slots. */ pold = NULL; } if (pold) { *pval = input_defuzz_abs_event(*pval, *pold, dev->absinfo[code].fuzz); if (*pold == *pval) return INPUT_IGNORE_EVENT; *pold = *pval; } /* Flush pending "slot" event */ if (is_new_slot) { dev->absinfo[ABS_MT_SLOT].value = mt->slot; return INPUT_PASS_TO_HANDLERS | INPUT_SLOT; } return INPUT_PASS_TO_HANDLERS; } static int input_get_disposition(struct input_dev *dev, unsigned int type, unsigned int code, int *pval) { int disposition = INPUT_IGNORE_EVENT; int value = *pval; /* filter-out events from inhibited devices */ if (dev->inhibited) return INPUT_IGNORE_EVENT; switch (type) { case EV_SYN: switch (code) { case SYN_CONFIG: disposition = INPUT_PASS_TO_ALL; break; case SYN_REPORT: disposition = INPUT_PASS_TO_HANDLERS | INPUT_FLUSH; break; case SYN_MT_REPORT: disposition = INPUT_PASS_TO_HANDLERS; break; } break; case EV_KEY: if (is_event_supported(code, dev->keybit, KEY_MAX)) { /* auto-repeat bypasses state updates */ if (value == 2) { disposition = INPUT_PASS_TO_HANDLERS; break; } if (!!test_bit(code, dev->key) != !!value) { __change_bit(code, dev->key); disposition = INPUT_PASS_TO_HANDLERS; } } break; case EV_SW: if (is_event_supported(code, dev->swbit, SW_MAX) && !!test_bit(code, dev->sw) != !!value) { __change_bit(code, dev->sw); disposition = INPUT_PASS_TO_HANDLERS; } break; case EV_ABS: if (is_event_supported(code, dev->absbit, ABS_MAX)) disposition = input_handle_abs_event(dev, code, &value); break; case EV_REL: if (is_event_supported(code, dev->relbit, REL_MAX) && value) disposition = INPUT_PASS_TO_HANDLERS; break; case EV_MSC: if (is_event_supported(code, dev->mscbit, MSC_MAX)) disposition = INPUT_PASS_TO_ALL; break; case EV_LED: if (is_event_supported(code, dev->ledbit, LED_MAX) && !!test_bit(code, dev->led) != !!value) { __change_bit(code, dev->led); disposition = INPUT_PASS_TO_ALL; } break; case EV_SND: if (is_event_supported(code, dev->sndbit, SND_MAX)) { if (!!test_bit(code, dev->snd) != !!value) __change_bit(code, dev->snd); disposition = INPUT_PASS_TO_ALL; } break; case EV_REP: if (code <= REP_MAX && value >= 0 && dev->rep[code] != value) { dev->rep[code] = value; disposition = INPUT_PASS_TO_ALL; } break; case EV_FF: if (value >= 0) disposition = INPUT_PASS_TO_ALL; break; case EV_PWR: disposition = INPUT_PASS_TO_ALL; break; } *pval = value; return disposition; } static void input_event_dispose(struct input_dev *dev, int disposition, unsigned int type, unsigned int code, int value) { if ((disposition & INPUT_PASS_TO_DEVICE) && dev->event) dev->event(dev, type, code, value); if (disposition & INPUT_PASS_TO_HANDLERS) { struct input_value *v; if (disposition & INPUT_SLOT) { v = &dev->vals[dev->num_vals++]; v->type = EV_ABS; v->code = ABS_MT_SLOT; v->value = dev->mt->slot; } v = &dev->vals[dev->num_vals++]; v->type = type; v->code = code; v->value = value; } if (disposition & INPUT_FLUSH) { if (dev->num_vals >= 2) input_pass_values(dev, dev->vals, dev->num_vals); dev->num_vals = 0; /* * Reset the timestamp on flush so we won't end up * with a stale one. 
Note we only need to reset the * monotonic one as we use its presence when deciding * whether to generate a synthetic timestamp. */ dev->timestamp[INPUT_CLK_MONO] = ktime_set(0, 0); } else if (dev->num_vals >= dev->max_vals - 2) { dev->vals[dev->num_vals++] = input_value_sync; input_pass_values(dev, dev->vals, dev->num_vals); dev->num_vals = 0; } } void input_handle_event(struct input_dev *dev, unsigned int type, unsigned int code, int value) { int disposition; lockdep_assert_held(&dev->event_lock); disposition = input_get_disposition(dev, type, code, &value); if (disposition != INPUT_IGNORE_EVENT) { if (type != EV_SYN) add_input_randomness(type, code, value); input_event_dispose(dev, disposition, type, code, value); } } /** * input_event() - report new input event * @dev: device that generated the event * @type: type of the event * @code: event code * @value: value of the event * * This function should be used by drivers implementing various input * devices to report input events. See also input_inject_event(). * * NOTE: input_event() may be safely used right after input device was * allocated with input_allocate_device(), even before it is registered * with input_register_device(), but the event will not reach any of the * input handlers. Such early invocation of input_event() may be used * to 'seed' initial state of a switch or initial position of absolute * axis, etc. */ void input_event(struct input_dev *dev, unsigned int type, unsigned int code, int value) { if (is_event_supported(type, dev->evbit, EV_MAX)) { guard(spinlock_irqsave)(&dev->event_lock); input_handle_event(dev, type, code, value); } } EXPORT_SYMBOL(input_event); /** * input_inject_event() - send input event from input handler * @handle: input handle to send event through * @type: type of the event * @code: event code * @value: value of the event * * Similar to input_event() but will ignore event if device is * "grabbed" and handle injecting event is not the one that owns * the device. */ void input_inject_event(struct input_handle *handle, unsigned int type, unsigned int code, int value) { struct input_dev *dev = handle->dev; struct input_handle *grab; if (is_event_supported(type, dev->evbit, EV_MAX)) { guard(spinlock_irqsave)(&dev->event_lock); guard(rcu)(); grab = rcu_dereference(dev->grab); if (!grab || grab == handle) input_handle_event(dev, type, code, value); } } EXPORT_SYMBOL(input_inject_event); /** * input_alloc_absinfo - allocates array of input_absinfo structs * @dev: the input device emitting absolute events * * If the absinfo struct the caller asked for is already allocated, this * function will not do anything. */ void input_alloc_absinfo(struct input_dev *dev) { if (dev->absinfo) return; dev->absinfo = kcalloc(ABS_CNT, sizeof(*dev->absinfo), GFP_KERNEL); if (!dev->absinfo) { dev_err(dev->dev.parent ?: &dev->dev, "%s: unable to allocate memory\n", __func__); /* * We will handle this allocation failure in * input_register_device() when we refuse to register input * device with ABS bits but without absinfo.
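 *
 * Drivers rarely call input_alloc_absinfo() directly; both
 * input_set_abs_params() and input_set_capability(dev, EV_ABS, ...)
 * below invoke it as needed. An illustrative axis setup (the range
 * and fuzz/flat values are made up for the example):
 *
 *	input_set_abs_params(dev, ABS_X, 0, 1023, 4, 8);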
*/ } } EXPORT_SYMBOL(input_alloc_absinfo); void input_set_abs_params(struct input_dev *dev, unsigned int axis, int min, int max, int fuzz, int flat) { struct input_absinfo *absinfo; __set_bit(EV_ABS, dev->evbit); __set_bit(axis, dev->absbit); input_alloc_absinfo(dev); if (!dev->absinfo) return; absinfo = &dev->absinfo[axis]; absinfo->minimum = min; absinfo->maximum = max; absinfo->fuzz = fuzz; absinfo->flat = flat; } EXPORT_SYMBOL(input_set_abs_params); /** * input_copy_abs - Copy absinfo from one input_dev to another * @dst: Destination input device to copy the abs settings to * @dst_axis: ABS_* value selecting the destination axis * @src: Source input device to copy the abs settings from * @src_axis: ABS_* value selecting the source axis * * Set absinfo for the selected destination axis by copying it from * the specified source input device's source axis. * This is useful to e.g. set up a pen/stylus input device for combined * touchscreen/pen hardware where the pen uses the same coordinates as * the touchscreen. */ void input_copy_abs(struct input_dev *dst, unsigned int dst_axis, const struct input_dev *src, unsigned int src_axis) { /* src must have EV_ABS and src_axis set */ if (WARN_ON(!(test_bit(EV_ABS, src->evbit) && test_bit(src_axis, src->absbit)))) return; /* * input_alloc_absinfo() may have failed for the source. Our caller is * expected to catch this when registering the input devices, which may * happen after the input_copy_abs() call. */ if (!src->absinfo) return; input_set_capability(dst, EV_ABS, dst_axis); if (!dst->absinfo) return; dst->absinfo[dst_axis] = src->absinfo[src_axis]; } EXPORT_SYMBOL(input_copy_abs); /** * input_grab_device - grabs device for exclusive use * @handle: input handle that wants to own the device * * When a device is grabbed by an input handle all events generated by * the device are delivered only to this handle. Also events injected * by other input handles are ignored while device is grabbed. */ int input_grab_device(struct input_handle *handle) { struct input_dev *dev = handle->dev; scoped_cond_guard(mutex_intr, return -EINTR, &dev->mutex) { if (dev->grab) return -EBUSY; rcu_assign_pointer(dev->grab, handle); } return 0; } EXPORT_SYMBOL(input_grab_device); static void __input_release_device(struct input_handle *handle) { struct input_dev *dev = handle->dev; struct input_handle *grabber; grabber = rcu_dereference_protected(dev->grab, lockdep_is_held(&dev->mutex)); if (grabber == handle) { rcu_assign_pointer(dev->grab, NULL); /* Make sure input_pass_values() notices that grab is gone */ synchronize_rcu(); list_for_each_entry(handle, &dev->h_list, d_node) if (handle->open && handle->handler->start) handle->handler->start(handle); } } /** * input_release_device - release previously grabbed device * @handle: input handle that owns the device * * Releases previously grabbed device so that other input handles can * start receiving input events. Upon release all handlers attached * to the device have their start() method called so they have a chance * to synchronize device state with the rest of the system. */ void input_release_device(struct input_handle *handle) { struct input_dev *dev = handle->dev; guard(mutex)(&dev->mutex); __input_release_device(handle); } EXPORT_SYMBOL(input_release_device); /** * input_open_device - open input device * @handle: handle through which device is being accessed * * This function should be called by input handlers when they * want to start receiving events from the given input device.
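 *
 * A handler that wants to receive events opens the device after
 * registering its handle with the core. A minimal sketch (hypothetical
 * handler code; unwinding of the handle registration on failure is
 * elided):
 *
 *	error = input_register_handle(handle);
 *	if (!error)
 *		error = input_open_device(handle);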
*/ int input_open_device(struct input_handle *handle) { struct input_dev *dev = handle->dev; int error; scoped_cond_guard(mutex_intr, return -EINTR, &dev->mutex) { if (dev->going_away) return -ENODEV; handle->open++; if (handle->handler->passive_observer) return 0; if (dev->users++ || dev->inhibited) { /* * Device is already opened and/or inhibited, * so we can exit immediately and report success. */ return 0; } if (dev->open) { error = dev->open(dev); if (error) { dev->users--; handle->open--; /* * Make sure we are not delivering any more * events through this handle. */ synchronize_rcu(); return error; } } if (dev->poller) input_dev_poller_start(dev->poller); } return 0; } EXPORT_SYMBOL(input_open_device); int input_flush_device(struct input_handle *handle, struct file *file) { struct input_dev *dev = handle->dev; scoped_cond_guard(mutex_intr, return -EINTR, &dev->mutex) { if (dev->flush) return dev->flush(dev, file); } return 0; } EXPORT_SYMBOL(input_flush_device); /** * input_close_device - close input device * @handle: handle through which device is being accessed * * This function should be called by input handlers when they * want to stop receiving events from the given input device. */ void input_close_device(struct input_handle *handle) { struct input_dev *dev = handle->dev; guard(mutex)(&dev->mutex); __input_release_device(handle); if (!handle->handler->passive_observer) { if (!--dev->users && !dev->inhibited) { if (dev->poller) input_dev_poller_stop(dev->poller); if (dev->close) dev->close(dev); } } if (!--handle->open) { /* * synchronize_rcu() makes sure that input_pass_values() * completed and that no more input events are delivered * through this handle */ synchronize_rcu(); } } EXPORT_SYMBOL(input_close_device); /* * Simulate keyup events for all keys that are marked as pressed. * The function must be called with dev->event_lock held. */ static bool input_dev_release_keys(struct input_dev *dev) { bool need_sync = false; int code; lockdep_assert_held(&dev->event_lock); if (is_event_supported(EV_KEY, dev->evbit, EV_MAX)) { for_each_set_bit(code, dev->key, KEY_CNT) { input_handle_event(dev, EV_KEY, code, 0); need_sync = true; } } return need_sync; } /* * Prepare device for unregistering */ static void input_disconnect_device(struct input_dev *dev) { struct input_handle *handle; /* * Mark device as going away. Note that we take dev->mutex here * not to protect access to dev->going_away but rather to ensure * that there are no threads in the middle of input_open_device() */ scoped_guard(mutex, &dev->mutex) dev->going_away = true; guard(spinlock_irq)(&dev->event_lock); /* * Simulate keyup events for all pressed keys so that handlers * are not left with "stuck" keys. The driver may continue * to generate events even after we are done here but they will not * reach any handlers. */ if (input_dev_release_keys(dev)) input_handle_event(dev, EV_SYN, SYN_REPORT, 1); list_for_each_entry(handle, &dev->h_list, d_node) handle->open = 0; } /** * input_scancode_to_scalar() - converts scancode in &struct input_keymap_entry * @ke: keymap entry containing scancode to be converted. * @scancode: pointer to the location where converted scancode should * be stored. * * This function is used to convert scancode stored in &struct input_keymap_entry * into scalar form understood by legacy keymap handling methods. These * methods expect scancodes to be represented as 'unsigned int'.
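 *
 * For example, a two-byte scancode { 0x1e, 0x00 } with ke->len == 2 is
 * converted to the scalar 0x001e on a little-endian machine (the bytes
 * are reinterpreted in native byte order).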
*/ int input_scancode_to_scalar(const struct input_keymap_entry *ke, unsigned int *scancode) { switch (ke->len) { case 1: *scancode = *((u8 *)ke->scancode); break; case 2: *scancode = *((u16 *)ke->scancode); break; case 4: *scancode = *((u32 *)ke->scancode); break; default: return -EINVAL; } return 0; } EXPORT_SYMBOL(input_scancode_to_scalar); /* * Those routines handle the default case where no [gs]etkeycode() is * defined. In this case, an array indexed by the scancode is used. */ static unsigned int input_fetch_keycode(struct input_dev *dev, unsigned int index) { switch (dev->keycodesize) { case 1: return ((u8 *)dev->keycode)[index]; case 2: return ((u16 *)dev->keycode)[index]; default: return ((u32 *)dev->keycode)[index]; } } static int input_default_getkeycode(struct input_dev *dev, struct input_keymap_entry *ke) { unsigned int index; int error; if (!dev->keycodesize) return -EINVAL; if (ke->flags & INPUT_KEYMAP_BY_INDEX) index = ke->index; else { error = input_scancode_to_scalar(ke, &index); if (error) return error; } if (index >= dev->keycodemax) return -EINVAL; ke->keycode = input_fetch_keycode(dev, index); ke->index = index; ke->len = sizeof(index); memcpy(ke->scancode, &index, sizeof(index)); return 0; } static int input_default_setkeycode(struct input_dev *dev, const struct input_keymap_entry *ke, unsigned int *old_keycode) { unsigned int index; int error; int i; if (!dev->keycodesize) return -EINVAL; if (ke->flags & INPUT_KEYMAP_BY_INDEX) { index = ke->index; } else { error = input_scancode_to_scalar(ke, &index); if (error) return error; } if (index >= dev->keycodemax) return -EINVAL; if (dev->keycodesize < sizeof(ke->keycode) && (ke->keycode >> (dev->keycodesize * 8))) return -EINVAL; switch (dev->keycodesize) { case 1: { u8 *k = (u8 *)dev->keycode; *old_keycode = k[index]; k[index] = ke->keycode; break; } case 2: { u16 *k = (u16 *)dev->keycode; *old_keycode = k[index]; k[index] = ke->keycode; break; } default: { u32 *k = (u32 *)dev->keycode; *old_keycode = k[index]; k[index] = ke->keycode; break; } } if (*old_keycode <= KEY_MAX) { __clear_bit(*old_keycode, dev->keybit); for (i = 0; i < dev->keycodemax; i++) { if (input_fetch_keycode(dev, i) == *old_keycode) { __set_bit(*old_keycode, dev->keybit); /* Setting the bit twice is useless, so break */ break; } } } __set_bit(ke->keycode, dev->keybit); return 0; } /** * input_get_keycode - retrieve keycode currently mapped to a given scancode * @dev: input device which keymap is being queried * @ke: keymap entry * * This function should be called by anyone interested in retrieving current * keymap. Presently evdev handlers use it. */ int input_get_keycode(struct input_dev *dev, struct input_keymap_entry *ke) { guard(spinlock_irqsave)(&dev->event_lock); return dev->getkeycode(dev, ke); } EXPORT_SYMBOL(input_get_keycode); /** * input_set_keycode - attribute a keycode to a given scancode * @dev: input device which keymap is being updated * @ke: new keymap entry * * This function should be called by anyone needing to update current * keymap. Presently keyboard and evdev handlers use it. */ int input_set_keycode(struct input_dev *dev, const struct input_keymap_entry *ke) { unsigned int old_keycode; int error; if (ke->keycode > KEY_MAX) return -EINVAL; guard(spinlock_irqsave)(&dev->event_lock); error = dev->setkeycode(dev, ke, &old_keycode); if (error) return error; /* Make sure KEY_RESERVED did not get enabled. 
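 * KEY_RESERVED is never supposed to be transmitted to userspace (see
 * input_register_device()), so drop it in case the new keymap entry
 * mapped a scancode to it.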
*/ __clear_bit(KEY_RESERVED, dev->keybit); /* * Simulate keyup event if keycode is not present * in the keymap anymore */ if (old_keycode > KEY_MAX) { dev_warn(dev->dev.parent ?: &dev->dev, "%s: got too big old keycode %#x\n", __func__, old_keycode); } else if (test_bit(EV_KEY, dev->evbit) && !is_event_supported(old_keycode, dev->keybit, KEY_MAX) && __test_and_clear_bit(old_keycode, dev->key)) { /* * We have to use input_event_dispose() here directly instead * of input_handle_event() because the key we want to release * here is considered no longer supported by the device and * input_handle_event() will ignore it. */ input_event_dispose(dev, INPUT_PASS_TO_HANDLERS, EV_KEY, old_keycode, 0); input_event_dispose(dev, INPUT_PASS_TO_HANDLERS | INPUT_FLUSH, EV_SYN, SYN_REPORT, 1); } return 0; } EXPORT_SYMBOL(input_set_keycode); bool input_match_device_id(const struct input_dev *dev, const struct input_device_id *id) { if (id->flags & INPUT_DEVICE_ID_MATCH_BUS) if (id->bustype != dev->id.bustype) return false; if (id->flags & INPUT_DEVICE_ID_MATCH_VENDOR) if (id->vendor != dev->id.vendor) return false; if (id->flags & INPUT_DEVICE_ID_MATCH_PRODUCT) if (id->product != dev->id.product) return false; if (id->flags & INPUT_DEVICE_ID_MATCH_VERSION) if (id->version != dev->id.version) return false; if (!bitmap_subset(id->evbit, dev->evbit, EV_MAX) || !bitmap_subset(id->keybit, dev->keybit, KEY_MAX) || !bitmap_subset(id->relbit, dev->relbit, REL_MAX) || !bitmap_subset(id->absbit, dev->absbit, ABS_MAX) || !bitmap_subset(id->mscbit, dev->mscbit, MSC_MAX) || !bitmap_subset(id->ledbit, dev->ledbit, LED_MAX) || !bitmap_subset(id->sndbit, dev->sndbit, SND_MAX) || !bitmap_subset(id->ffbit, dev->ffbit, FF_MAX) || !bitmap_subset(id->swbit, dev->swbit, SW_MAX) || !bitmap_subset(id->propbit, dev->propbit, INPUT_PROP_MAX)) { return false; } return true; } EXPORT_SYMBOL(input_match_device_id); static const struct input_device_id *input_match_device(struct input_handler *handler, struct input_dev *dev) { const struct input_device_id *id; for (id = handler->id_table; id->flags || id->driver_info; id++) { if (input_match_device_id(dev, id) && (!handler->match || handler->match(handler, dev))) { return id; } } return NULL; } static int input_attach_handler(struct input_dev *dev, struct input_handler *handler) { const struct input_device_id *id; int error; id = input_match_device(handler, dev); if (!id) return -ENODEV; error = handler->connect(handler, dev, id); if (error && error != -ENODEV) pr_err("failed to attach handler %s to device %s, error: %d\n", handler->name, kobject_name(&dev->dev.kobj), error); return error; } #ifdef CONFIG_COMPAT static int input_bits_to_string(char *buf, int buf_size, unsigned long bits, bool skip_empty) { int len = 0; if (in_compat_syscall()) { u32 dword = bits >> 32; if (dword || !skip_empty) len += snprintf(buf, buf_size, "%x ", dword); dword = bits & 0xffffffffUL; if (dword || !skip_empty || len) len += snprintf(buf + len, max(buf_size - len, 0), "%x", dword); } else { if (bits || !skip_empty) len += snprintf(buf, buf_size, "%lx", bits); } return len; } #else /* !CONFIG_COMPAT */ static int input_bits_to_string(char *buf, int buf_size, unsigned long bits, bool skip_empty) { return bits || !skip_empty ? 
snprintf(buf, buf_size, "%lx", bits) : 0; } #endif #ifdef CONFIG_PROC_FS static struct proc_dir_entry *proc_bus_input_dir; static DECLARE_WAIT_QUEUE_HEAD(input_devices_poll_wait); static int input_devices_state; static inline void input_wakeup_procfs_readers(void) { input_devices_state++; wake_up(&input_devices_poll_wait); } struct input_seq_state { unsigned short pos; bool mutex_acquired; int input_devices_state; }; static __poll_t input_proc_devices_poll(struct file *file, poll_table *wait) { struct seq_file *seq = file->private_data; struct input_seq_state *state = seq->private; poll_wait(file, &input_devices_poll_wait, wait); if (state->input_devices_state != input_devices_state) { state->input_devices_state = input_devices_state; return EPOLLIN | EPOLLRDNORM; } return 0; } static void *input_devices_seq_start(struct seq_file *seq, loff_t *pos) { struct input_seq_state *state = seq->private; int error; error = mutex_lock_interruptible(&input_mutex); if (error) { state->mutex_acquired = false; return ERR_PTR(error); } state->mutex_acquired = true; return seq_list_start(&input_dev_list, *pos); } static void *input_devices_seq_next(struct seq_file *seq, void *v, loff_t *pos) { return seq_list_next(v, &input_dev_list, pos); } static void input_seq_stop(struct seq_file *seq, void *v) { struct input_seq_state *state = seq->private; if (state->mutex_acquired) mutex_unlock(&input_mutex); } static void input_seq_print_bitmap(struct seq_file *seq, const char *name, unsigned long *bitmap, int max) { int i; bool skip_empty = true; char buf[18]; seq_printf(seq, "B: %s=", name); for (i = BITS_TO_LONGS(max) - 1; i >= 0; i--) { if (input_bits_to_string(buf, sizeof(buf), bitmap[i], skip_empty)) { skip_empty = false; seq_printf(seq, "%s%s", buf, i > 0 ? " " : ""); } } /* * If no output was produced print a single 0. */ if (skip_empty) seq_putc(seq, '0'); seq_putc(seq, '\n'); } static int input_devices_seq_show(struct seq_file *seq, void *v) { struct input_dev *dev = container_of(v, struct input_dev, node); const char *path = kobject_get_path(&dev->dev.kobj, GFP_KERNEL); struct input_handle *handle; seq_printf(seq, "I: Bus=%04x Vendor=%04x Product=%04x Version=%04x\n", dev->id.bustype, dev->id.vendor, dev->id.product, dev->id.version); seq_printf(seq, "N: Name=\"%s\"\n", dev->name ? dev->name : ""); seq_printf(seq, "P: Phys=%s\n", dev->phys ? dev->phys : ""); seq_printf(seq, "S: Sysfs=%s\n", path ? path : ""); seq_printf(seq, "U: Uniq=%s\n", dev->uniq ? 
dev->uniq : ""); seq_puts(seq, "H: Handlers="); list_for_each_entry(handle, &dev->h_list, d_node) seq_printf(seq, "%s ", handle->name); seq_putc(seq, '\n'); input_seq_print_bitmap(seq, "PROP", dev->propbit, INPUT_PROP_MAX); input_seq_print_bitmap(seq, "EV", dev->evbit, EV_MAX); if (test_bit(EV_KEY, dev->evbit)) input_seq_print_bitmap(seq, "KEY", dev->keybit, KEY_MAX); if (test_bit(EV_REL, dev->evbit)) input_seq_print_bitmap(seq, "REL", dev->relbit, REL_MAX); if (test_bit(EV_ABS, dev->evbit)) input_seq_print_bitmap(seq, "ABS", dev->absbit, ABS_MAX); if (test_bit(EV_MSC, dev->evbit)) input_seq_print_bitmap(seq, "MSC", dev->mscbit, MSC_MAX); if (test_bit(EV_LED, dev->evbit)) input_seq_print_bitmap(seq, "LED", dev->ledbit, LED_MAX); if (test_bit(EV_SND, dev->evbit)) input_seq_print_bitmap(seq, "SND", dev->sndbit, SND_MAX); if (test_bit(EV_FF, dev->evbit)) input_seq_print_bitmap(seq, "FF", dev->ffbit, FF_MAX); if (test_bit(EV_SW, dev->evbit)) input_seq_print_bitmap(seq, "SW", dev->swbit, SW_MAX); seq_putc(seq, '\n'); kfree(path); return 0; } static const struct seq_operations input_devices_seq_ops = { .start = input_devices_seq_start, .next = input_devices_seq_next, .stop = input_seq_stop, .show = input_devices_seq_show, }; static int input_proc_devices_open(struct inode *inode, struct file *file) { return seq_open_private(file, &input_devices_seq_ops, sizeof(struct input_seq_state)); } static const struct proc_ops input_devices_proc_ops = { .proc_open = input_proc_devices_open, .proc_poll = input_proc_devices_poll, .proc_read = seq_read, .proc_lseek = seq_lseek, .proc_release = seq_release_private, }; static void *input_handlers_seq_start(struct seq_file *seq, loff_t *pos) { struct input_seq_state *state = seq->private; int error; error = mutex_lock_interruptible(&input_mutex); if (error) { state->mutex_acquired = false; return ERR_PTR(error); } state->mutex_acquired = true; state->pos = *pos; return seq_list_start(&input_handler_list, *pos); } static void *input_handlers_seq_next(struct seq_file *seq, void *v, loff_t *pos) { struct input_seq_state *state = seq->private; state->pos = *pos + 1; return seq_list_next(v, &input_handler_list, pos); } static int input_handlers_seq_show(struct seq_file *seq, void *v) { struct input_handler *handler = container_of(v, struct input_handler, node); struct input_seq_state *state = seq->private; seq_printf(seq, "N: Number=%u Name=%s", state->pos, handler->name); if (handler->filter) seq_puts(seq, " (filter)"); if (handler->legacy_minors) seq_printf(seq, " Minor=%d", handler->minor); seq_putc(seq, '\n'); return 0; } static const struct seq_operations input_handlers_seq_ops = { .start = input_handlers_seq_start, .next = input_handlers_seq_next, .stop = input_seq_stop, .show = input_handlers_seq_show, }; static int input_proc_handlers_open(struct inode *inode, struct file *file) { return seq_open_private(file, &input_handlers_seq_ops, sizeof(struct input_seq_state)); } static const struct proc_ops input_handlers_proc_ops = { .proc_open = input_proc_handlers_open, .proc_read = seq_read, .proc_lseek = seq_lseek, .proc_release = seq_release_private, }; static int __init input_proc_init(void) { struct proc_dir_entry *entry; proc_bus_input_dir = proc_mkdir("bus/input", NULL); if (!proc_bus_input_dir) return -ENOMEM; entry = proc_create("devices", 0, proc_bus_input_dir, &input_devices_proc_ops); if (!entry) goto fail1; entry = proc_create("handlers", 0, proc_bus_input_dir, &input_handlers_proc_ops); if (!entry) goto fail2; return 0; fail2: 
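/* creating "handlers" failed: remove "devices", then the directory via fail1 */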
remove_proc_entry("devices", proc_bus_input_dir); fail1: remove_proc_entry("bus/input", NULL); return -ENOMEM; } static void input_proc_exit(void) { remove_proc_entry("devices", proc_bus_input_dir); remove_proc_entry("handlers", proc_bus_input_dir); remove_proc_entry("bus/input", NULL); } #else /* !CONFIG_PROC_FS */ static inline void input_wakeup_procfs_readers(void) { } static inline int input_proc_init(void) { return 0; } static inline void input_proc_exit(void) { } #endif #define INPUT_DEV_STRING_ATTR_SHOW(name) \ static ssize_t input_dev_show_##name(struct device *dev, \ struct device_attribute *attr, \ char *buf) \ { \ struct input_dev *input_dev = to_input_dev(dev); \ \ return sysfs_emit(buf, "%s\n", \ input_dev->name ? input_dev->name : ""); \ } \ static DEVICE_ATTR(name, S_IRUGO, input_dev_show_##name, NULL) INPUT_DEV_STRING_ATTR_SHOW(name); INPUT_DEV_STRING_ATTR_SHOW(phys); INPUT_DEV_STRING_ATTR_SHOW(uniq); static int input_print_modalias_bits(char *buf, int size, char name, const unsigned long *bm, unsigned int min_bit, unsigned int max_bit) { int bit = min_bit; int len = 0; len += snprintf(buf, max(size, 0), "%c", name); for_each_set_bit_from(bit, bm, max_bit) len += snprintf(buf + len, max(size - len, 0), "%X,", bit); return len; } static int input_print_modalias_parts(char *buf, int size, int full_len, const struct input_dev *id) { int len, klen, remainder, space; len = snprintf(buf, max(size, 0), "input:b%04Xv%04Xp%04Xe%04X-", id->id.bustype, id->id.vendor, id->id.product, id->id.version); len += input_print_modalias_bits(buf + len, size - len, 'e', id->evbit, 0, EV_MAX); /* * Calculate the remaining space in the buffer making sure we * have room for the terminating 0. */ space = max(size - (len + 1), 0); klen = input_print_modalias_bits(buf + len, size - len, 'k', id->keybit, KEY_MIN_INTERESTING, KEY_MAX); len += klen; /* * If we have more data than we can fit in the buffer, check * if we can trim key data to fit in the rest. We will indicate * that key data is incomplete by adding "+" sign at the end, like * this: * "k1,2,3,45,+,". * * Note that the shortest key info (if present) is "k+," so we * can only try to trim if key data is longer than that. */ if (full_len && size < full_len + 1 && klen > 3) { remainder = full_len - len; /* * We can only trim if we have space for the remainder * and also for at least "k+," which is 3 more characters. */ if (remainder <= space - 3) { /* * We are guaranteed to have 'k' in the buffer, so * we need at least 3 additional bytes for storing * "+," in addition to the remainder.
*/ for (int i = size - 1 - remainder - 3; i >= 0; i--) { if (buf[i] == 'k' || buf[i] == ',') { strcpy(buf + i + 1, "+,"); len = i + 3; /* Not counting '\0' */ break; } } } } len += input_print_modalias_bits(buf + len, size - len, 'r', id->relbit, 0, REL_MAX); len += input_print_modalias_bits(buf + len, size - len, 'a', id->absbit, 0, ABS_MAX); len += input_print_modalias_bits(buf + len, size - len, 'm', id->mscbit, 0, MSC_MAX); len += input_print_modalias_bits(buf + len, size - len, 'l', id->ledbit, 0, LED_MAX); len += input_print_modalias_bits(buf + len, size - len, 's', id->sndbit, 0, SND_MAX); len += input_print_modalias_bits(buf + len, size - len, 'f', id->ffbit, 0, FF_MAX); len += input_print_modalias_bits(buf + len, size - len, 'w', id->swbit, 0, SW_MAX); return len; } static int input_print_modalias(char *buf, int size, const struct input_dev *id) { int full_len; /* * Printing is done in 2 passes: first one figures out total length * needed for the modalias string, second one will try to trim key * data in case when buffer is too small for the entire modalias. * If the buffer is too small regardless, it will fill as much as it * can (without trimming key data) into the buffer and leave it to * the caller to figure out what to do with the result. */ full_len = input_print_modalias_parts(NULL, 0, 0, id); return input_print_modalias_parts(buf, size, full_len, id); } static ssize_t input_dev_show_modalias(struct device *dev, struct device_attribute *attr, char *buf) { struct input_dev *id = to_input_dev(dev); ssize_t len; len = input_print_modalias(buf, PAGE_SIZE, id); if (len < PAGE_SIZE - 2) len += snprintf(buf + len, PAGE_SIZE - len, "\n"); return min_t(int, len, PAGE_SIZE); } static DEVICE_ATTR(modalias, S_IRUGO, input_dev_show_modalias, NULL); static int input_print_bitmap(char *buf, int buf_size, const unsigned long *bitmap, int max, int add_cr); static ssize_t input_dev_show_properties(struct device *dev, struct device_attribute *attr, char *buf) { struct input_dev *input_dev = to_input_dev(dev); int len = input_print_bitmap(buf, PAGE_SIZE, input_dev->propbit, INPUT_PROP_MAX, true); return min_t(int, len, PAGE_SIZE); } static DEVICE_ATTR(properties, S_IRUGO, input_dev_show_properties, NULL); static int input_inhibit_device(struct input_dev *dev); static int input_uninhibit_device(struct input_dev *dev); static ssize_t inhibited_show(struct device *dev, struct device_attribute *attr, char *buf) { struct input_dev *input_dev = to_input_dev(dev); return sysfs_emit(buf, "%d\n", input_dev->inhibited); } static ssize_t inhibited_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { struct input_dev *input_dev = to_input_dev(dev); ssize_t rv; bool inhibited; if (kstrtobool(buf, &inhibited)) return -EINVAL; if (inhibited) rv = input_inhibit_device(input_dev); else rv = input_uninhibit_device(input_dev); if (rv != 0) return rv; return len; } static DEVICE_ATTR_RW(inhibited); static struct attribute *input_dev_attrs[] = { &dev_attr_name.attr, &dev_attr_phys.attr, &dev_attr_uniq.attr, &dev_attr_modalias.attr, &dev_attr_properties.attr, &dev_attr_inhibited.attr, NULL }; static const struct attribute_group input_dev_attr_group = { .attrs = input_dev_attrs, }; #define INPUT_DEV_ID_ATTR(name) \ static ssize_t input_dev_show_id_##name(struct device *dev, \ struct device_attribute *attr, \ char *buf) \ { \ struct input_dev *input_dev = to_input_dev(dev); \ return sysfs_emit(buf, "%04x\n", input_dev->id.name); \ } \ static DEVICE_ATTR(name, S_IRUGO, 
input_dev_show_id_##name, NULL) INPUT_DEV_ID_ATTR(bustype); INPUT_DEV_ID_ATTR(vendor); INPUT_DEV_ID_ATTR(product); INPUT_DEV_ID_ATTR(version); static struct attribute *input_dev_id_attrs[] = { &dev_attr_bustype.attr, &dev_attr_vendor.attr, &dev_attr_product.attr, &dev_attr_version.attr, NULL }; static const struct attribute_group input_dev_id_attr_group = { .name = "id", .attrs = input_dev_id_attrs, }; static int input_print_bitmap(char *buf, int buf_size, const unsigned long *bitmap, int max, int add_cr) { int i; int len = 0; bool skip_empty = true; for (i = BITS_TO_LONGS(max) - 1; i >= 0; i--) { len += input_bits_to_string(buf + len, max(buf_size - len, 0), bitmap[i], skip_empty); if (len) { skip_empty = false; if (i > 0) len += snprintf(buf + len, max(buf_size - len, 0), " "); } } /* * If no output was produced print a single 0. */ if (len == 0) len = snprintf(buf, buf_size, "%d", 0); if (add_cr) len += snprintf(buf + len, max(buf_size - len, 0), "\n"); return len; } #define INPUT_DEV_CAP_ATTR(ev, bm) \ static ssize_t input_dev_show_cap_##bm(struct device *dev, \ struct device_attribute *attr, \ char *buf) \ { \ struct input_dev *input_dev = to_input_dev(dev); \ int len = input_print_bitmap(buf, PAGE_SIZE, \ input_dev->bm##bit, ev##_MAX, \ true); \ return min_t(int, len, PAGE_SIZE); \ } \ static DEVICE_ATTR(bm, S_IRUGO, input_dev_show_cap_##bm, NULL) INPUT_DEV_CAP_ATTR(EV, ev); INPUT_DEV_CAP_ATTR(KEY, key); INPUT_DEV_CAP_ATTR(REL, rel); INPUT_DEV_CAP_ATTR(ABS, abs); INPUT_DEV_CAP_ATTR(MSC, msc); INPUT_DEV_CAP_ATTR(LED, led); INPUT_DEV_CAP_ATTR(SND, snd); INPUT_DEV_CAP_ATTR(FF, ff); INPUT_DEV_CAP_ATTR(SW, sw); static struct attribute *input_dev_caps_attrs[] = { &dev_attr_ev.attr, &dev_attr_key.attr, &dev_attr_rel.attr, &dev_attr_abs.attr, &dev_attr_msc.attr, &dev_attr_led.attr, &dev_attr_snd.attr, &dev_attr_ff.attr, &dev_attr_sw.attr, NULL }; static const struct attribute_group input_dev_caps_attr_group = { .name = "capabilities", .attrs = input_dev_caps_attrs, }; static const struct attribute_group *input_dev_attr_groups[] = { &input_dev_attr_group, &input_dev_id_attr_group, &input_dev_caps_attr_group, &input_poller_attribute_group, NULL }; static void input_dev_release(struct device *device) { struct input_dev *dev = to_input_dev(device); input_ff_destroy(dev); input_mt_destroy_slots(dev); kfree(dev->poller); kfree(dev->absinfo); kfree(dev->vals); kfree(dev); module_put(THIS_MODULE); } /* * Input uevent interface - loading event handlers based on * device bitfields. */ static int input_add_uevent_bm_var(struct kobj_uevent_env *env, const char *name, const unsigned long *bitmap, int max) { int len; if (add_uevent_var(env, "%s", name)) return -ENOMEM; len = input_print_bitmap(&env->buf[env->buflen - 1], sizeof(env->buf) - env->buflen, bitmap, max, false); if (len >= (sizeof(env->buf) - env->buflen)) return -ENOMEM; env->buflen += len; return 0; } /* * This is a pretty gross hack. When building uevent data the driver core * may try adding more environment variables to kobj_uevent_env without * telling us, so we have no idea how much of the buffer we can use to * avoid overflows/-ENOMEM elsewhere. To work around this let's artificially * reduce amount of memory we will use for the modalias environment variable. * * The potential additions are: * * SEQNUM=18446744073709551615 - (%llu - 28 bytes) * HOME=/ (6 bytes) * PATH=/sbin:/bin:/usr/sbin:/usr/bin (34 bytes) * * 68 bytes total. 
Allow extra buffer - 96 bytes */ #define UEVENT_ENV_EXTRA_LEN 96 static int input_add_uevent_modalias_var(struct kobj_uevent_env *env, const struct input_dev *dev) { int len; if (add_uevent_var(env, "MODALIAS=")) return -ENOMEM; len = input_print_modalias(&env->buf[env->buflen - 1], (int)sizeof(env->buf) - env->buflen - UEVENT_ENV_EXTRA_LEN, dev); if (len >= ((int)sizeof(env->buf) - env->buflen - UEVENT_ENV_EXTRA_LEN)) return -ENOMEM; env->buflen += len; return 0; } #define INPUT_ADD_HOTPLUG_VAR(fmt, val...) \ do { \ int err = add_uevent_var(env, fmt, val); \ if (err) \ return err; \ } while (0) #define INPUT_ADD_HOTPLUG_BM_VAR(name, bm, max) \ do { \ int err = input_add_uevent_bm_var(env, name, bm, max); \ if (err) \ return err; \ } while (0) #define INPUT_ADD_HOTPLUG_MODALIAS_VAR(dev) \ do { \ int err = input_add_uevent_modalias_var(env, dev); \ if (err) \ return err; \ } while (0) static int input_dev_uevent(const struct device *device, struct kobj_uevent_env *env) { const struct input_dev *dev = to_input_dev(device); INPUT_ADD_HOTPLUG_VAR("PRODUCT=%x/%x/%x/%x", dev->id.bustype, dev->id.vendor, dev->id.product, dev->id.version); if (dev->name) INPUT_ADD_HOTPLUG_VAR("NAME=\"%s\"", dev->name); if (dev->phys) INPUT_ADD_HOTPLUG_VAR("PHYS=\"%s\"", dev->phys); if (dev->uniq) INPUT_ADD_HOTPLUG_VAR("UNIQ=\"%s\"", dev->uniq); INPUT_ADD_HOTPLUG_BM_VAR("PROP=", dev->propbit, INPUT_PROP_MAX); INPUT_ADD_HOTPLUG_BM_VAR("EV=", dev->evbit, EV_MAX); if (test_bit(EV_KEY, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("KEY=", dev->keybit, KEY_MAX); if (test_bit(EV_REL, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("REL=", dev->relbit, REL_MAX); if (test_bit(EV_ABS, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("ABS=", dev->absbit, ABS_MAX); if (test_bit(EV_MSC, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("MSC=", dev->mscbit, MSC_MAX); if (test_bit(EV_LED, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("LED=", dev->ledbit, LED_MAX); if (test_bit(EV_SND, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("SND=", dev->sndbit, SND_MAX); if (test_bit(EV_FF, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("FF=", dev->ffbit, FF_MAX); if (test_bit(EV_SW, dev->evbit)) INPUT_ADD_HOTPLUG_BM_VAR("SW=", dev->swbit, SW_MAX); INPUT_ADD_HOTPLUG_MODALIAS_VAR(dev); return 0; } #define INPUT_DO_TOGGLE(dev, type, bits, on) \ do { \ int i; \ bool active; \ \ if (!test_bit(EV_##type, dev->evbit)) \ break; \ \ for_each_set_bit(i, dev->bits##bit, type##_CNT) { \ active = test_bit(i, dev->bits); \ if (!active && !on) \ continue; \ \ dev->event(dev, EV_##type, i, on ? active : 0); \ } \ } while (0) static void input_dev_toggle(struct input_dev *dev, bool activate) { if (!dev->event) return; INPUT_DO_TOGGLE(dev, LED, led, activate); INPUT_DO_TOGGLE(dev, SND, snd, activate); if (activate && test_bit(EV_REP, dev->evbit)) { dev->event(dev, EV_REP, REP_PERIOD, dev->rep[REP_PERIOD]); dev->event(dev, EV_REP, REP_DELAY, dev->rep[REP_DELAY]); } } /** * input_reset_device() - reset/restore the state of input device * @dev: input device whose state needs to be reset * * This function tries to reset the state of an opened input device and * bring its internal state and the state of the hardware in sync with each * other. We mark all keys as released, restore LED state, repeat rate, etc.
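 *
 * Illustrative use (not taken from a particular caller): a driver that
 * re-initializes its hardware, e.g. after a firmware reload, could call
 * input_reset_device(dev) afterwards so that handlers see a consistent
 * key/LED state.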
*/ void input_reset_device(struct input_dev *dev) { guard(mutex)(&dev->mutex); guard(spinlock_irqsave)(&dev->event_lock); input_dev_toggle(dev, true); if (input_dev_release_keys(dev)) input_handle_event(dev, EV_SYN, SYN_REPORT, 1); } EXPORT_SYMBOL(input_reset_device); static int input_inhibit_device(struct input_dev *dev) { guard(mutex)(&dev->mutex); if (dev->inhibited) return 0; if (dev->users) { if (dev->close) dev->close(dev); if (dev->poller) input_dev_poller_stop(dev->poller); } scoped_guard(spinlock_irq, &dev->event_lock) { input_mt_release_slots(dev); input_dev_release_keys(dev); input_handle_event(dev, EV_SYN, SYN_REPORT, 1); input_dev_toggle(dev, false); } dev->inhibited = true; return 0; } static int input_uninhibit_device(struct input_dev *dev) { int error; guard(mutex)(&dev->mutex); if (!dev->inhibited) return 0; if (dev->users) { if (dev->open) { error = dev->open(dev); if (error) return error; } if (dev->poller) input_dev_poller_start(dev->poller); } dev->inhibited = false; scoped_guard(spinlock_irq, &dev->event_lock) input_dev_toggle(dev, true); return 0; } static int input_dev_suspend(struct device *dev) { struct input_dev *input_dev = to_input_dev(dev); guard(spinlock_irq)(&input_dev->event_lock); /* * Keys that are pressed now are unlikely to be * still pressed when we resume. */ if (input_dev_release_keys(input_dev)) input_handle_event(input_dev, EV_SYN, SYN_REPORT, 1); /* Turn off LEDs and sounds, if any are active. */ input_dev_toggle(input_dev, false); return 0; } static int input_dev_resume(struct device *dev) { struct input_dev *input_dev = to_input_dev(dev); guard(spinlock_irq)(&input_dev->event_lock); /* Restore state of LEDs and sounds, if any were active. */ input_dev_toggle(input_dev, true); return 0; } static int input_dev_freeze(struct device *dev) { struct input_dev *input_dev = to_input_dev(dev); guard(spinlock_irq)(&input_dev->event_lock); /* * Keys that are pressed now are unlikely to be * still pressed when we resume. */ if (input_dev_release_keys(input_dev)) input_handle_event(input_dev, EV_SYN, SYN_REPORT, 1); return 0; } static int input_dev_poweroff(struct device *dev) { struct input_dev *input_dev = to_input_dev(dev); guard(spinlock_irq)(&input_dev->event_lock); /* Turn off LEDs and sounds, if any are active. */ input_dev_toggle(input_dev, false); return 0; } static const struct dev_pm_ops input_dev_pm_ops = { .suspend = input_dev_suspend, .resume = input_dev_resume, .freeze = input_dev_freeze, .poweroff = input_dev_poweroff, .restore = input_dev_resume, }; static const struct device_type input_dev_type = { .groups = input_dev_attr_groups, .release = input_dev_release, .uevent = input_dev_uevent, .pm = pm_sleep_ptr(&input_dev_pm_ops), }; static char *input_devnode(const struct device *dev, umode_t *mode) { return kasprintf(GFP_KERNEL, "input/%s", dev_name(dev)); } const struct class input_class = { .name = "input", .devnode = input_devnode, }; EXPORT_SYMBOL_GPL(input_class); /** * input_allocate_device - allocate memory for new input device * * Returns prepared struct input_dev or %NULL. * * NOTE: Use input_free_device() to free devices that have not been * registered; input_unregister_device() should be used for already * registered devices. */ struct input_dev *input_allocate_device(void) { static atomic_t input_no = ATOMIC_INIT(-1); struct input_dev *dev; dev = kzalloc(sizeof(*dev), GFP_KERNEL); if (!dev) return NULL; /* * Start with space for SYN_REPORT + 7 EV_KEY/EV_MSC events + 2 spare, * see input_estimate_events_per_packet(). 
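 * (1 SYN_REPORT + 7 events + 2 spare = 10, hence max_vals below.)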
We will tune the number * when we register the device. */ dev->max_vals = 10; dev->vals = kcalloc(dev->max_vals, sizeof(*dev->vals), GFP_KERNEL); if (!dev->vals) { kfree(dev); return NULL; } mutex_init(&dev->mutex); spin_lock_init(&dev->event_lock); timer_setup(&dev->timer, NULL, 0); INIT_LIST_HEAD(&dev->h_list); INIT_LIST_HEAD(&dev->node); dev->dev.type = &input_dev_type; dev->dev.class = &input_class; device_initialize(&dev->dev); /* * From this point on we can no longer simply "kfree(dev)", we need * to use input_free_device() so that device core properly frees its * resources associated with the input device. */ dev_set_name(&dev->dev, "input%lu", (unsigned long)atomic_inc_return(&input_no)); __module_get(THIS_MODULE); return dev; } EXPORT_SYMBOL(input_allocate_device); struct input_devres { struct input_dev *input; }; static int devm_input_device_match(struct device *dev, void *res, void *data) { struct input_devres *devres = res; return devres->input == data; } static void devm_input_device_release(struct device *dev, void *res) { struct input_devres *devres = res; struct input_dev *input = devres->input; dev_dbg(dev, "%s: dropping reference to %s\n", __func__, dev_name(&input->dev)); input_put_device(input); } /** * devm_input_allocate_device - allocate managed input device * @dev: device owning the input device being created * * Returns prepared struct input_dev or %NULL. * * Managed input devices do not need to be explicitly unregistered or * freed as it will be done automatically when the owner device unbinds * from its driver (or binding fails). Once managed input device is * allocated, it is ready to be set up and registered in the same fashion * as a regular input device. There are no special * devm_input_device_[un]register() variants, regular ones work with both * managed and unmanaged devices, should you need them. In most cases, * however, managed input devices need not be explicitly unregistered or * freed. * * NOTE: the owner device is set up as parent of input device and users * should not override it. */ struct input_dev *devm_input_allocate_device(struct device *dev) { struct input_dev *input; struct input_devres *devres; devres = devres_alloc(devm_input_device_release, sizeof(*devres), GFP_KERNEL); if (!devres) return NULL; input = input_allocate_device(); if (!input) { devres_free(devres); return NULL; } input->dev.parent = dev; input->devres_managed = true; devres->input = input; devres_add(dev, devres); return input; } EXPORT_SYMBOL(devm_input_allocate_device); /** * input_free_device - free memory occupied by input_dev structure * @dev: input device to free * * This function should only be used if input_register_device() * was not called yet or if it failed. Once device was registered * use input_unregister_device() and memory will be freed once last * reference to the device is dropped. * * Device should be allocated by input_allocate_device(). * * NOTE: If there are references to the input device then memory * will not be freed until last reference is dropped.
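 *
 * For devm-managed devices (devm_input_allocate_device()) calling this
 * is normally not needed at all: the devres callback above drops the
 * reference automatically when the owner device unbinds.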
*/ void input_free_device(struct input_dev *dev) { if (dev) { if (dev->devres_managed) WARN_ON(devres_destroy(dev->dev.parent, devm_input_device_release, devm_input_device_match, dev)); input_put_device(dev); } } EXPORT_SYMBOL(input_free_device); /** * input_set_timestamp - set timestamp for input events * @dev: input device to set timestamp for * @timestamp: the time at which the event has occurred * in CLOCK_MONOTONIC * * This function is intended to provide to the input system a more * accurate time of when an event actually occurred. The driver should * call this function as soon as a timestamp is acquired ensuring * clock conversions in input_set_timestamp are done correctly. * * The system entering suspend state between timestamp acquisition and * calling input_set_timestamp can result in inaccurate conversions. */ void input_set_timestamp(struct input_dev *dev, ktime_t timestamp) { dev->timestamp[INPUT_CLK_MONO] = timestamp; dev->timestamp[INPUT_CLK_REAL] = ktime_mono_to_real(timestamp); dev->timestamp[INPUT_CLK_BOOT] = ktime_mono_to_any(timestamp, TK_OFFS_BOOT); } EXPORT_SYMBOL(input_set_timestamp); /** * input_get_timestamp - get timestamp for input events * @dev: input device to get timestamp from * * A valid timestamp is a timestamp of non-zero value. */ ktime_t *input_get_timestamp(struct input_dev *dev) { const ktime_t invalid_timestamp = ktime_set(0, 0); if (!ktime_compare(dev->timestamp[INPUT_CLK_MONO], invalid_timestamp)) input_set_timestamp(dev, ktime_get()); return dev->timestamp; } EXPORT_SYMBOL(input_get_timestamp); /** * input_set_capability - mark device as capable of a certain event * @dev: device that is capable of emitting or accepting event * @type: type of the event (EV_KEY, EV_REL, etc...) * @code: event code * * In addition to setting up corresponding bit in appropriate capability * bitmap the function also adjusts dev->evbit. */ void input_set_capability(struct input_dev *dev, unsigned int type, unsigned int code) { if (type < EV_CNT && input_max_code[type] && code > input_max_code[type]) { pr_err("%s: invalid code %u for type %u\n", __func__, code, type); dump_stack(); return; } switch (type) { case EV_KEY: __set_bit(code, dev->keybit); break; case EV_REL: __set_bit(code, dev->relbit); break; case EV_ABS: input_alloc_absinfo(dev); __set_bit(code, dev->absbit); break; case EV_MSC: __set_bit(code, dev->mscbit); break; case EV_SW: __set_bit(code, dev->swbit); break; case EV_LED: __set_bit(code, dev->ledbit); break; case EV_SND: __set_bit(code, dev->sndbit); break; case EV_FF: __set_bit(code, dev->ffbit); break; case EV_PWR: /* do nothing */ break; default: pr_err("%s: unknown type %u (code %u)\n", __func__, type, code); dump_stack(); return; } __set_bit(type, dev->evbit); } EXPORT_SYMBOL(input_set_capability); static unsigned int input_estimate_events_per_packet(struct input_dev *dev) { int mt_slots; int i; unsigned int events; if (dev->mt) { mt_slots = dev->mt->num_slots; } else if (test_bit(ABS_MT_TRACKING_ID, dev->absbit)) { mt_slots = dev->absinfo[ABS_MT_TRACKING_ID].maximum - dev->absinfo[ABS_MT_TRACKING_ID].minimum + 1; mt_slots = clamp(mt_slots, 2, 32); } else if (test_bit(ABS_MT_POSITION_X, dev->absbit)) { mt_slots = 2; } else { mt_slots = 0; } events = mt_slots + 1; /* count SYN_MT_REPORT and SYN_REPORT */ if (test_bit(EV_ABS, dev->evbit)) for_each_set_bit(i, dev->absbit, ABS_CNT) events += input_is_mt_axis(i) ? 
mt_slots : 1; if (test_bit(EV_REL, dev->evbit)) events += bitmap_weight(dev->relbit, REL_CNT); /* Make room for KEY and MSC events */ events += 7; return events; } #define INPUT_CLEANSE_BITMASK(dev, type, bits) \ do { \ if (!test_bit(EV_##type, dev->evbit)) \ memset(dev->bits##bit, 0, \ sizeof(dev->bits##bit)); \ } while (0) static void input_cleanse_bitmasks(struct input_dev *dev) { INPUT_CLEANSE_BITMASK(dev, KEY, key); INPUT_CLEANSE_BITMASK(dev, REL, rel); INPUT_CLEANSE_BITMASK(dev, ABS, abs); INPUT_CLEANSE_BITMASK(dev, MSC, msc); INPUT_CLEANSE_BITMASK(dev, LED, led); INPUT_CLEANSE_BITMASK(dev, SND, snd); INPUT_CLEANSE_BITMASK(dev, FF, ff); INPUT_CLEANSE_BITMASK(dev, SW, sw); } static void __input_unregister_device(struct input_dev *dev) { struct input_handle *handle, *next; input_disconnect_device(dev); scoped_guard(mutex, &input_mutex) { list_for_each_entry_safe(handle, next, &dev->h_list, d_node) handle->handler->disconnect(handle); WARN_ON(!list_empty(&dev->h_list)); timer_delete_sync(&dev->timer); list_del_init(&dev->node); input_wakeup_procfs_readers(); } device_del(&dev->dev); } static void devm_input_device_unregister(struct device *dev, void *res) { struct input_devres *devres = res; struct input_dev *input = devres->input; dev_dbg(dev, "%s: unregistering device %s\n", __func__, dev_name(&input->dev)); __input_unregister_device(input); } /* * Generate software autorepeat event. Note that we take * dev->event_lock here to avoid racing with input_event * which may cause keys to get "stuck". */ static void input_repeat_key(struct timer_list *t) { struct input_dev *dev = from_timer(dev, t, timer); guard(spinlock_irqsave)(&dev->event_lock); if (!dev->inhibited && test_bit(dev->repeat_key, dev->key) && is_event_supported(dev->repeat_key, dev->keybit, KEY_MAX)) { input_set_timestamp(dev, ktime_get()); input_handle_event(dev, EV_KEY, dev->repeat_key, 2); input_handle_event(dev, EV_SYN, SYN_REPORT, 1); if (dev->rep[REP_PERIOD]) mod_timer(&dev->timer, jiffies + msecs_to_jiffies(dev->rep[REP_PERIOD])); } } /** * input_enable_softrepeat - enable software autorepeat * @dev: input device * @delay: repeat delay * @period: repeat period * * Enable software autorepeat on the input device. */ void input_enable_softrepeat(struct input_dev *dev, int delay, int period) { dev->timer.function = input_repeat_key; dev->rep[REP_DELAY] = delay; dev->rep[REP_PERIOD] = period; } EXPORT_SYMBOL(input_enable_softrepeat); bool input_device_enabled(struct input_dev *dev) { lockdep_assert_held(&dev->mutex); return !dev->inhibited && dev->users > 0; } EXPORT_SYMBOL_GPL(input_device_enabled); static int input_device_tune_vals(struct input_dev *dev) { struct input_value *vals; unsigned int packet_size; unsigned int max_vals; packet_size = input_estimate_events_per_packet(dev); if (dev->hint_events_per_packet < packet_size) dev->hint_events_per_packet = packet_size; max_vals = dev->hint_events_per_packet + 2; if (dev->max_vals >= max_vals) return 0; vals = kcalloc(max_vals, sizeof(*vals), GFP_KERNEL); if (!vals) return -ENOMEM; scoped_guard(spinlock_irq, &dev->event_lock) { dev->max_vals = max_vals; swap(dev->vals, vals); } /* Because of swap() above, this frees the old vals memory */ kfree(vals); return 0; } /** * input_register_device - register device with input core * @dev: device to be registered * * This function registers device with input core. The device must be * allocated with input_allocate_device() and all its capabilities * set up before registering.
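 *
 * A minimal sketch of the usual sequence (hypothetical button device;
 * the names and capability are made up for the example):
 *
 *	struct input_dev *dev;
 *	int error;
 *
 *	dev = input_allocate_device();
 *	if (!dev)
 *		return -ENOMEM;
 *
 *	dev->name = "Example Button";
 *	input_set_capability(dev, EV_KEY, BTN_0);
 *
 *	error = input_register_device(dev);
 *	if (error) {
 *		input_free_device(dev);
 *		return error;
 *	}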
* If function fails the device must be freed with input_free_device(). * Once device has been successfully registered it can be unregistered * with input_unregister_device(); input_free_device() should not be * called in this case. * * Note that this function is also used to register managed input devices * (ones allocated with devm_input_allocate_device()). Such managed input * devices need not be explicitly unregistered or freed, their tear down * is controlled by the devres infrastructure. It is also worth noting * that tear down of managed input devices is internally a 2-step process: * registered managed input device is first unregistered, but stays in * memory and can still handle input_event() calls (although events will * not be delivered anywhere). The freeing of managed input device will * happen later, when devres stack is unwound to the point where device * allocation was made. */ int input_register_device(struct input_dev *dev) { struct input_devres *devres = NULL; struct input_handler *handler; const char *path; int error; if (test_bit(EV_ABS, dev->evbit) && !dev->absinfo) { dev_err(&dev->dev, "Absolute device without dev->absinfo, refusing to register\n"); return -EINVAL; } if (dev->devres_managed) { devres = devres_alloc(devm_input_device_unregister, sizeof(*devres), GFP_KERNEL); if (!devres) return -ENOMEM; devres->input = dev; } /* Every input device generates EV_SYN/SYN_REPORT events. */ __set_bit(EV_SYN, dev->evbit); /* KEY_RESERVED is not supposed to be transmitted to userspace. */ __clear_bit(KEY_RESERVED, dev->keybit); /* Make sure that bitmasks not mentioned in dev->evbit are clean. */ input_cleanse_bitmasks(dev); error = input_device_tune_vals(dev); if (error) goto err_devres_free; /* * If delay and period are pre-set by the driver, then autorepeating * is handled by the driver itself and we don't do it in input.c. */ if (!dev->rep[REP_DELAY] && !dev->rep[REP_PERIOD]) input_enable_softrepeat(dev, 250, 33); if (!dev->getkeycode) dev->getkeycode = input_default_getkeycode; if (!dev->setkeycode) dev->setkeycode = input_default_setkeycode; if (dev->poller) input_dev_poller_finalize(dev->poller); error = device_add(&dev->dev); if (error) goto err_devres_free; path = kobject_get_path(&dev->dev.kobj, GFP_KERNEL); pr_info("%s as %s\n", dev->name ? dev->name : "Unspecified device", path ? path : "N/A"); kfree(path); error = -EINTR; scoped_cond_guard(mutex_intr, goto err_device_del, &input_mutex) { list_add_tail(&dev->node, &input_dev_list); list_for_each_entry(handler, &input_handler_list, node) input_attach_handler(dev, handler); input_wakeup_procfs_readers(); } if (dev->devres_managed) { dev_dbg(dev->dev.parent, "%s: registering %s with devres.\n", __func__, dev_name(&dev->dev)); devres_add(dev->dev.parent, devres); } return 0; err_device_del: device_del(&dev->dev); err_devres_free: devres_free(devres); return error; } EXPORT_SYMBOL(input_register_device); /** * input_unregister_device - unregister previously registered device * @dev: device to be unregistered * * This function unregisters an input device. Once device is unregistered * the caller should not try to access it as it may get freed at any moment. */ void input_unregister_device(struct input_dev *dev) { if (dev->devres_managed) { WARN_ON(devres_destroy(dev->dev.parent, devm_input_device_unregister, devm_input_device_match, dev)); __input_unregister_device(dev); /* * We do not do input_put_device() here because it will be done * when 2nd devres fires up. 
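	 * (Editorial note: that second devres callback is
	 * devm_input_device_release(), installed at allocation time by
	 * devm_input_allocate_device(); it drops the final reference
	 * when the devres stack unwinds.)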
*/ } else { __input_unregister_device(dev); input_put_device(dev); } } EXPORT_SYMBOL(input_unregister_device); static int input_handler_check_methods(const struct input_handler *handler) { int count = 0; if (handler->filter) count++; if (handler->events) count++; if (handler->event) count++; if (count > 1) { pr_err("%s: only one event processing method can be defined (%s)\n", __func__, handler->name); return -EINVAL; } return 0; } /** * input_register_handler - register a new input handler * @handler: handler to be registered * * This function registers a new input handler (interface) for input * devices in the system and attaches it to all input devices that * are compatible with the handler. */ int input_register_handler(struct input_handler *handler) { struct input_dev *dev; int error; error = input_handler_check_methods(handler); if (error) return error; scoped_cond_guard(mutex_intr, return -EINTR, &input_mutex) { INIT_LIST_HEAD(&handler->h_list); list_add_tail(&handler->node, &input_handler_list); list_for_each_entry(dev, &input_dev_list, node) input_attach_handler(dev, handler); input_wakeup_procfs_readers(); } return 0; } EXPORT_SYMBOL(input_register_handler); /** * input_unregister_handler - unregisters an input handler * @handler: handler to be unregistered * * This function disconnects a handler from its input devices and * removes it from lists of known handlers. */ void input_unregister_handler(struct input_handler *handler) { struct input_handle *handle, *next; guard(mutex)(&input_mutex); list_for_each_entry_safe(handle, next, &handler->h_list, h_node) handler->disconnect(handle); WARN_ON(!list_empty(&handler->h_list)); list_del_init(&handler->node); input_wakeup_procfs_readers(); } EXPORT_SYMBOL(input_unregister_handler); /** * input_handler_for_each_handle - handle iterator * @handler: input handler to iterate * @data: data for the callback * @fn: function to be called for each handle * * Iterate over @handler's list of handles, and call @fn for each, passing * it @data and stop when @fn returns a non-zero value. The function is * using RCU to traverse the list and therefore may be used in atomic * contexts. The @fn callback is invoked from RCU critical section and * thus must not sleep. */ int input_handler_for_each_handle(struct input_handler *handler, void *data, int (*fn)(struct input_handle *, void *)) { struct input_handle *handle; int retval; guard(rcu)(); list_for_each_entry_rcu(handle, &handler->h_list, h_node) { retval = fn(handle, data); if (retval) return retval; } return 0; } EXPORT_SYMBOL(input_handler_for_each_handle); /* * An implementation of input_handle's handle_events() method that simply * invokes handler->event() method for each event one by one. */ static unsigned int input_handle_events_default(struct input_handle *handle, struct input_value *vals, unsigned int count) { struct input_handler *handler = handle->handler; struct input_value *v; for (v = vals; v != vals + count; v++) handler->event(handle, v->type, v->code, v->value); return count; } /* * An implementation of input_handle's handle_events() method that invokes * handler->filter() method for each event one by one and removes events * that were filtered out from the "vals" array.
*/ static unsigned int input_handle_events_filter(struct input_handle *handle, struct input_value *vals, unsigned int count) { struct input_handler *handler = handle->handler; struct input_value *end = vals; struct input_value *v; for (v = vals; v != vals + count; v++) { if (handler->filter(handle, v->type, v->code, v->value)) continue; if (end != v) *end = *v; end++; } return end - vals; } /* * An implementation of input_handle's handle_events() method that does nothing. */ static unsigned int input_handle_events_null(struct input_handle *handle, struct input_value *vals, unsigned int count) { return count; } /* * Sets up appropriate handle->event_handler based on the input_handler * associated with the handle. */ static void input_handle_setup_event_handler(struct input_handle *handle) { struct input_handler *handler = handle->handler; if (handler->filter) handle->handle_events = input_handle_events_filter; else if (handler->event) handle->handle_events = input_handle_events_default; else if (handler->events) handle->handle_events = handler->events; else handle->handle_events = input_handle_events_null; } /** * input_register_handle - register a new input handle * @handle: handle to register * * This function puts a new input handle onto device's * and handler's lists so that events can flow through * it once it is opened using input_open_device(). * * This function is supposed to be called from handler's * connect() method. */ int input_register_handle(struct input_handle *handle) { struct input_handler *handler = handle->handler; struct input_dev *dev = handle->dev; input_handle_setup_event_handler(handle); /* * We take dev->mutex here to prevent race with * input_release_device(). */ scoped_cond_guard(mutex_intr, return -EINTR, &dev->mutex) { /* * Filters go to the head of the list, normal handlers * to the tail. */ if (handler->filter) list_add_rcu(&handle->d_node, &dev->h_list); else list_add_tail_rcu(&handle->d_node, &dev->h_list); } /* * Since we are supposed to be called from ->connect() * which is mutually exclusive with ->disconnect() * we can't be racing with input_unregister_handle() * and so separate lock is not needed here. */ list_add_tail_rcu(&handle->h_node, &handler->h_list); if (handler->start) handler->start(handle); return 0; } EXPORT_SYMBOL(input_register_handle); /** * input_unregister_handle - unregister an input handle * @handle: handle to unregister * * This function removes input handle from device's * and handler's lists. * * This function is supposed to be called from handler's * disconnect() method. */ void input_unregister_handle(struct input_handle *handle) { struct input_dev *dev = handle->dev; list_del_rcu(&handle->h_node); /* * Take dev->mutex to prevent race with input_release_device(). */ scoped_guard(mutex, &dev->mutex) list_del_rcu(&handle->d_node); synchronize_rcu(); } EXPORT_SYMBOL(input_unregister_handle); /** * input_get_new_minor - allocates a new input minor number * @legacy_base: beginning of the legacy range to be searched * @legacy_num: size of legacy range * @allow_dynamic: whether we can also take ID from the dynamic range * * This function allocates a new device minor from the input major namespace. * Caller can request legacy minor by specifying @legacy_base and @legacy_num * parameters and whether ID can be allocated from dynamic range if there are * no free IDs in legacy range.
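 *
 * Example (editorial sketch of a handler's ->connect() path; the evdev
 * constants are used here purely as a representative legacy range):
 *
 *	int minor = input_get_new_minor(EVDEV_MINOR_BASE, EVDEV_MINORS, true);
 *
 *	if (minor < 0)
 *		return minor;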
*/ int input_get_new_minor(int legacy_base, unsigned int legacy_num, bool allow_dynamic) { /* * This function should be called from input handler's ->connect() * methods, which are serialized with input_mutex, so no additional * locking is needed here. */ if (legacy_base >= 0) { int minor = ida_alloc_range(&input_ida, legacy_base, legacy_base + legacy_num - 1, GFP_KERNEL); if (minor >= 0 || !allow_dynamic) return minor; } return ida_alloc_range(&input_ida, INPUT_FIRST_DYNAMIC_DEV, INPUT_MAX_CHAR_DEVICES - 1, GFP_KERNEL); } EXPORT_SYMBOL(input_get_new_minor); /** * input_free_minor - release previously allocated minor * @minor: minor to be released * * This function releases previously allocated input minor so that it can be * reused later. */ void input_free_minor(unsigned int minor) { ida_free(&input_ida, minor); } EXPORT_SYMBOL(input_free_minor); static int __init input_init(void) { int err; err = class_register(&input_class); if (err) { pr_err("unable to register input_dev class\n"); return err; } err = input_proc_init(); if (err) goto fail1; err = register_chrdev_region(MKDEV(INPUT_MAJOR, 0), INPUT_MAX_CHAR_DEVICES, "input"); if (err) { pr_err("unable to register char major %d", INPUT_MAJOR); goto fail2; } return 0; fail2: input_proc_exit(); fail1: class_unregister(&input_class); return err; } static void __exit input_exit(void) { input_proc_exit(); unregister_chrdev_region(MKDEV(INPUT_MAJOR, 0), INPUT_MAX_CHAR_DEVICES); class_unregister(&input_class); } subsys_initcall(input_init); module_exit(input_exit);
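/*
 * Editorial example: a hedged sketch of a minimal input handler tying the
 * registration API above together. All names (example_*) are invented;
 * a real handler lives in its own module and has proper error reporting.
 * The id_table entry follows the "match everything" pattern used by
 * simple logging handlers.
 *
 *	static void example_event(struct input_handle *handle,
 *				  unsigned int type, unsigned int code,
 *				  int value)
 *	{
 *		pr_debug("event: type %u code %u value %d\n",
 *			 type, code, value);
 *	}
 *
 *	static int example_connect(struct input_handler *handler,
 *				   struct input_dev *dev,
 *				   const struct input_device_id *id)
 *	{
 *		struct input_handle *handle;
 *		int error;
 *
 *		handle = kzalloc(sizeof(*handle), GFP_KERNEL);
 *		if (!handle)
 *			return -ENOMEM;
 *
 *		handle->dev = dev;
 *		handle->handler = handler;
 *		handle->name = "example";
 *
 *		error = input_register_handle(handle);
 *		if (error)
 *			goto err_free;
 *		error = input_open_device(handle);
 *		if (error)
 *			goto err_unregister;
 *		return 0;
 *
 *	err_unregister:
 *		input_unregister_handle(handle);
 *	err_free:
 *		kfree(handle);
 *		return error;
 *	}
 *
 *	static void example_disconnect(struct input_handle *handle)
 *	{
 *		input_close_device(handle);
 *		input_unregister_handle(handle);
 *		kfree(handle);
 *	}
 *
 *	static const struct input_device_id example_ids[] = {
 *		{ .driver_info = 1 },	// Matches all devices
 *		{ },			// Terminating zero entry
 *	};
 *
 *	static struct input_handler example_handler = {
 *		.event		= example_event,
 *		.connect	= example_connect,
 *		.disconnect	= example_disconnect,
 *		.name		= "example",
 *		.id_table	= example_ids,
 *	};
 *
 * A module would then call input_register_handler(&example_handler) from
 * its init path and input_unregister_handler() on exit.
 */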
/* SPDX-License-Identifier: GPL-2.0+ */ #ifndef _LINUX_MAPLE_TREE_H #define _LINUX_MAPLE_TREE_H /* * Maple Tree - An RCU-safe adaptive tree for storing ranges * Copyright (c) 2018-2022 Oracle * Authors: Liam R. Howlett <Liam.Howlett@Oracle.com> * Matthew Wilcox <willy@infradead.org> */ #include <linux/kernel.h> #include <linux/rcupdate.h> #include <linux/spinlock.h> /* #define CONFIG_MAPLE_RCU_DISABLED */ /* * Allocated nodes are mutable until they have been inserted into the tree, * at which time they cannot change their type until they have been removed * from the tree and an RCU grace period has passed. * * Removed nodes have their ->parent set to point to themselves. RCU readers * check ->parent before relying on the value that they loaded from the * slots array. This lets us reuse the slots array for the RCU head. * * Nodes in the tree point to their parent unless bit 0 is set. */ #if defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64) /* 64bit sizes */ #define MAPLE_NODE_SLOTS 31 /* 256 bytes including ->parent */ #define MAPLE_RANGE64_SLOTS 16 /* 256 bytes */ #define MAPLE_ARANGE64_SLOTS 10 /* 240 bytes */ #define MAPLE_ALLOC_SLOTS (MAPLE_NODE_SLOTS - 1) #else /* 32bit sizes */ #define MAPLE_NODE_SLOTS 63 /* 256 bytes including ->parent */ #define MAPLE_RANGE64_SLOTS 32 /* 256 bytes */ #define MAPLE_ARANGE64_SLOTS 21 /* 240 bytes */ #define MAPLE_ALLOC_SLOTS (MAPLE_NODE_SLOTS - 2) #endif /* defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64) */ #define MAPLE_NODE_MASK 255UL /* * The node->parent of the root node has bit 0 set and the rest of the pointer * is a pointer to the tree itself. No more bits are available in this pointer * (on m68k, the data structure may only be 2-byte aligned). * * Internal non-root nodes can only have maple_range_* nodes as parents. The * parent pointer is 256B aligned like all other tree nodes. When storing a 32 * or 64 bit values, the offset can fit into 4 bits. The 16 bit values need an * extra bit to store the offset. This extra bit comes from a reuse of the last * bit in the node type. This is possible by using bit 1 to indicate if bit 2 * is part of the type or the slot. * * Once the type is decided, the decision of an allocation range type or a * range type is done by examining the immutable tree flag for the * MT_FLAGS_ALLOC_RANGE flag. * * Node types: * 0x??1 = Root * 0x?00 = 16 bit nodes * 0x010 = 32 bit nodes * 0x110 = 64 bit nodes * * Slot size and location in the parent pointer: * type : slot location * 0x??1 : Root * 0x?00 : 16 bit values, type in 0-1, slot in 2-6 * 0x010 : 32 bit values, type in 0-2, slot in 3-6 * 0x110 : 64 bit values, type in 0-2, slot in 3-6 */ /* * This metadata is used to optimize the gap updating code and in reverse * searching for gaps or any other code that needs to find the end of the data. */ struct maple_metadata { unsigned char end; unsigned char gap; }; /* * Leaf nodes do not store pointers to nodes, they store user data. Users may * store almost any bit pattern. As noted above, the optimisation of storing an * entry at 0 in the root pointer cannot be done for data which have the bottom * two bits set to '10'. We also reserve values with the bottom two bits set to * '10' which are below 4096 (ie 2, 6, 10 .. 4094) for internal use. Some APIs * return errnos as a negative errno shifted right by two bits and the bottom * two bits set to '10', and while choosing to store these values in the array * is not an error, it may lead to confusion if you're testing for an error with * mas_is_err(). 
* * Non-leaf nodes store the type of the node pointed to (enum maple_type in bits * 3-6), bit 2 is reserved. That leaves bits 0-1 unused for now. * * In regular B-Tree terms, pivots are called keys. The term pivot is used to * indicate that the tree is specifying ranges. Pivots may appear in the * subtree with an entry attached to the value whereas keys are unique to a * specific position of a B-tree. Pivot values are inclusive of the slot with * the same index. */ struct maple_range_64 { struct maple_pnode *parent; unsigned long pivot[MAPLE_RANGE64_SLOTS - 1]; union { void __rcu *slot[MAPLE_RANGE64_SLOTS]; struct { void __rcu *pad[MAPLE_RANGE64_SLOTS - 1]; struct maple_metadata meta; }; }; }; /* * At tree creation time, the user can specify that they're willing to trade off * storing fewer entries in a tree in return for storing more information in * each node. * * The maple tree supports recording the largest range of NULL entries available * in this node, also called gaps. This optimises the tree for allocating a * range. */ struct maple_arange_64 { struct maple_pnode *parent; unsigned long pivot[MAPLE_ARANGE64_SLOTS - 1]; void __rcu *slot[MAPLE_ARANGE64_SLOTS]; unsigned long gap[MAPLE_ARANGE64_SLOTS]; struct maple_metadata meta; }; struct maple_alloc { unsigned long total; unsigned char node_count; unsigned int request_count; struct maple_alloc *slot[MAPLE_ALLOC_SLOTS]; }; struct maple_topiary { struct maple_pnode *parent; struct maple_enode *next; /* Overlaps the pivot */ }; enum maple_type { maple_dense, maple_leaf_64, maple_range_64, maple_arange_64, }; enum store_type { wr_invalid, wr_new_root, wr_store_root, wr_exact_fit, wr_spanning_store, wr_split_store, wr_rebalance, wr_append, wr_node_store, wr_slot_store, }; /** * DOC: Maple tree flags * * * MT_FLAGS_ALLOC_RANGE - Track gaps in this tree * * MT_FLAGS_USE_RCU - Operate in RCU mode * * MT_FLAGS_HEIGHT_OFFSET - The position of the tree height in the flags * * MT_FLAGS_HEIGHT_MASK - The mask for the maple tree height value * * MT_FLAGS_LOCK_MASK - How the mt_lock is used * * MT_FLAGS_LOCK_IRQ - Acquired irq-safe * * MT_FLAGS_LOCK_BH - Acquired bh-safe * * MT_FLAGS_LOCK_EXTERN - mt_lock is not used * * MAPLE_HEIGHT_MAX - The largest height that can be stored */ #define MT_FLAGS_ALLOC_RANGE 0x01 #define MT_FLAGS_USE_RCU 0x02 #define MT_FLAGS_HEIGHT_OFFSET 0x02 #define MT_FLAGS_HEIGHT_MASK 0x7C #define MT_FLAGS_LOCK_MASK 0x300 #define MT_FLAGS_LOCK_IRQ 0x100 #define MT_FLAGS_LOCK_BH 0x200 #define MT_FLAGS_LOCK_EXTERN 0x300 #define MT_FLAGS_ALLOC_WRAPPED 0x0800 #define MAPLE_HEIGHT_MAX 31 #define MAPLE_NODE_TYPE_MASK 0x0F #define MAPLE_NODE_TYPE_SHIFT 0x03 #define MAPLE_RESERVED_RANGE 4096 #ifdef CONFIG_LOCKDEP typedef struct lockdep_map *lockdep_map_p; #define mt_lock_is_held(mt) \ (!(mt)->ma_external_lock || lock_is_held((mt)->ma_external_lock)) #define mt_write_lock_is_held(mt) \ (!(mt)->ma_external_lock || \ lock_is_held_type((mt)->ma_external_lock, 0)) #define mt_set_external_lock(mt, lock) \ (mt)->ma_external_lock = &(lock)->dep_map #define mt_on_stack(mt) (mt).ma_external_lock = NULL #else typedef struct { /* nothing */ } lockdep_map_p; #define mt_lock_is_held(mt) 1 #define mt_write_lock_is_held(mt) 1 #define mt_set_external_lock(mt, lock) do { } while (0) #define mt_on_stack(mt) do { } while (0) #endif /* * If the tree contains a single entry at index 0, it is usually stored in * tree->ma_root.
To optimise for the page cache, an entry which ends in '00', * '01' or '11' is stored in the root, but an entry which ends in '10' will be * stored in a node. Bits 3-6 are used to store enum maple_type. * * The flags are used both to store some immutable information about this tree * (set at tree creation time) and dynamic information set under the spinlock. * * Another use of flags is to indicate global states of the tree. This is the * case with the MT_FLAGS_USE_RCU flag, which indicates the tree is currently in * RCU mode. This mode was added to allow the tree to reuse nodes instead of * re-allocating and RCU freeing nodes when there is a single user. */ struct maple_tree { union { spinlock_t ma_lock; lockdep_map_p ma_external_lock; }; unsigned int ma_flags; void __rcu *ma_root; }; /** * MTREE_INIT() - Initialize a maple tree * @name: The maple tree name * @__flags: The maple tree flags * */ #define MTREE_INIT(name, __flags) { \ .ma_lock = __SPIN_LOCK_UNLOCKED((name).ma_lock), \ .ma_flags = __flags, \ .ma_root = NULL, \ } /** * MTREE_INIT_EXT() - Initialize a maple tree with an external lock. * @name: The tree name * @__flags: The maple tree flags * @__lock: The external lock */ #ifdef CONFIG_LOCKDEP #define MTREE_INIT_EXT(name, __flags, __lock) { \ .ma_external_lock = &(__lock).dep_map, \ .ma_flags = (__flags), \ .ma_root = NULL, \ } #else #define MTREE_INIT_EXT(name, __flags, __lock) MTREE_INIT(name, __flags) #endif #define DEFINE_MTREE(name) \ struct maple_tree name = MTREE_INIT(name, 0) #define mtree_lock(mt) spin_lock((&(mt)->ma_lock)) #define mtree_lock_nested(mt, subclass) \ spin_lock_nested((&(mt)->ma_lock), subclass) #define mtree_unlock(mt) spin_unlock((&(mt)->ma_lock)) /* * The Maple Tree squeezes various bits in at various points which aren't * necessarily obvious. Usually, this is done by observing that pointers are * N-byte aligned and thus the bottom log_2(N) bits are available for use. We * don't use the high bits of pointers to store additional information because * we don't know what bits are unused on any given architecture. * * Nodes are 256 bytes in size and are also aligned to 256 bytes, giving us 8 * low bits for our own purposes. Nodes are currently of 4 types: * 1. Single pointer (Range is 0-0) * 2. Non-leaf Allocation Range nodes * 3. Non-leaf Range nodes * 4. Leaf Range nodes * * All nodes consist of a number of node slots, * pivots, and a parent pointer. */ struct maple_node { union { struct { struct maple_pnode *parent; void __rcu *slot[MAPLE_NODE_SLOTS]; }; struct { void *pad; struct rcu_head rcu; struct maple_enode *piv_parent; unsigned char parent_slot; enum maple_type type; unsigned char slot_len; unsigned int ma_flags; }; struct maple_range_64 mr64; struct maple_arange_64 ma64; struct maple_alloc alloc; }; }; /* * More complicated stores can cause two nodes to become one or three and * potentially alter the height of the tree. Either half of the tree may need * to be rebalanced against the other. The ma_topiary struct is used to track * which nodes have been 'cut' from the tree so that the change can be done * safely at a later date. This is done to support RCU.
*/ struct ma_topiary { struct maple_enode *head; struct maple_enode *tail; struct maple_tree *mtree; }; void *mtree_load(struct maple_tree *mt, unsigned long index); int mtree_insert(struct maple_tree *mt, unsigned long index, void *entry, gfp_t gfp); int mtree_insert_range(struct maple_tree *mt, unsigned long first, unsigned long last, void *entry, gfp_t gfp); int mtree_alloc_range(struct maple_tree *mt, unsigned long *startp, void *entry, unsigned long size, unsigned long min, unsigned long max, gfp_t gfp); int mtree_alloc_cyclic(struct maple_tree *mt, unsigned long *startp, void *entry, unsigned long range_lo, unsigned long range_hi, unsigned long *next, gfp_t gfp); int mtree_alloc_rrange(struct maple_tree *mt, unsigned long *startp, void *entry, unsigned long size, unsigned long min, unsigned long max, gfp_t gfp); int mtree_store_range(struct maple_tree *mt, unsigned long first, unsigned long last, void *entry, gfp_t gfp); int mtree_store(struct maple_tree *mt, unsigned long index, void *entry, gfp_t gfp); void *mtree_erase(struct maple_tree *mt, unsigned long index); int mtree_dup(struct maple_tree *mt, struct maple_tree *new, gfp_t gfp); int __mt_dup(struct maple_tree *mt, struct maple_tree *new, gfp_t gfp); void mtree_destroy(struct maple_tree *mt); void __mt_destroy(struct maple_tree *mt); /** * mtree_empty() - Determine if a tree has any present entries. * @mt: Maple Tree. * * Context: Any context. * Return: %true if the tree contains only NULL pointers. */ static inline bool mtree_empty(const struct maple_tree *mt) { return mt->ma_root == NULL; } /* Advanced API */ /* * Maple State Status * ma_active means the maple state is pointing to a node and offset and can * continue operating on the tree. * ma_start means we have not searched the tree. * ma_root means we have searched the tree and the entry we found lives in * the root of the tree (ie it has index 0, length 1 and is the only entry in * the tree). * ma_none means we have searched the tree and there is no node in the * tree for this entry. For example, we searched for index 1 in an empty * tree. Or we have a tree which points to a full leaf node and we * searched for an entry which is larger than can be contained in that * leaf node. * ma_pause means the data within the maple state may be stale, restart the * operation * ma_overflow means the search has reached the upper limit of the search * ma_underflow means the search has reached the lower limit of the search * ma_error means there was an error, check the node for the error number. */ enum maple_status { ma_active, ma_start, ma_root, ma_none, ma_pause, ma_overflow, ma_underflow, ma_error, }; /* * The maple state is defined in the struct ma_state and is used to keep track * of information during operations, and even between operations when using the * advanced API. * * If state->node has bit 0 set then it references a tree location which is not * a node (eg the root). If bit 1 is set, the rest of the bits are a negative * errno. Bit 2 (the 'unallocated slots' bit) is clear. Bits 3-6 indicate the * node type. * * state->alloc either has a requested number of nodes or an allocated node. If * state->alloc has a requested number of nodes, the first bit will be set (0x1) * and the remaining bits are the value. If state->alloc is a node, then the * node will be of type maple_alloc. maple_alloc has MAPLE_NODE_SLOTS - 1 for * storing more allocated nodes, a total number of nodes allocated, and the * node_count in this node. node_count is the number of allocated nodes in this * node.
The scaling beyond MAPLE_NODE_SLOTS - 1 is handled by storing further * nodes into state->alloc->slot[0]'s node. Nodes are taken from state->alloc * by removing a node from the state->alloc node until state->alloc->node_count * is 1, when state->alloc is returned and the state->alloc->slot[0] is promoted * to state->alloc. Nodes are pushed onto state->alloc by putting the current * state->alloc into the pushed node's slot[0]. * * The state also contains the implied min/max of the state->node, the depth of * this search, and the offset. The implied min/max are either from the parent * node or are 0-oo for the root node. The depth is incremented or decremented * every time a node is walked down or up. The offset is the slot/pivot of * interest in the node - either for reading or writing. * * When returning a value the maple state index and last respectively contain * the start and end of the range for the entry. Ranges are inclusive in the * Maple Tree. * * The status of the state is used to determine how the next action should treat * the state. For instance, if the status is ma_start then the next action * should start at the root of the tree and walk down. If the status is * ma_pause then the node may be stale data and should be discarded. If the * status is ma_overflow, then the last action hit the upper limit. * */ struct ma_state { struct maple_tree *tree; /* The tree we're operating in */ unsigned long index; /* The index we're operating on - range start */ unsigned long last; /* The last index we're operating on - range end */ struct maple_enode *node; /* The node containing this entry */ unsigned long min; /* The minimum index of this node - implied pivot min */ unsigned long max; /* The maximum index of this node - implied pivot max */ struct maple_alloc *alloc; /* Allocated nodes for this operation */ enum maple_status status; /* The status of the state (active, start, none, etc) */ unsigned char depth; /* depth of tree descent during write */ unsigned char offset; unsigned char mas_flags; unsigned char end; /* The end of the node */ enum store_type store_type; /* The type of store needed for this operation */ }; struct ma_wr_state { struct ma_state *mas; struct maple_node *node; /* Decoded mas->node */ unsigned long r_min; /* range min */ unsigned long r_max; /* range max */ enum maple_type type; /* mas->node type */ unsigned char offset_end; /* The offset where the write ends */ unsigned long *pivots; /* mas->node->pivots pointer */ unsigned long end_piv; /* The pivot at the offset end */ void __rcu **slots; /* mas->node->slots pointer */ void *entry; /* The entry to write */ void *content; /* The existing entry that is being overwritten */ }; #define mas_lock(mas) spin_lock(&((mas)->tree->ma_lock)) #define mas_lock_nested(mas, subclass) \ spin_lock_nested(&((mas)->tree->ma_lock), subclass) #define mas_unlock(mas) spin_unlock(&((mas)->tree->ma_lock)) /* * Special values for ma_state.node. * MA_ERROR represents an errno. After dropping the lock and attempting * to resolve the error, the walk would have to be restarted from the * top of the tree as the tree may have been modified. 
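 *
 * For example (editorial illustration only; internal state handling is up
 * to the implementation), a failed allocation might be recorded roughly as:
 *
 *	mas->node = MA_ERROR(-ENOMEM);
 *	mas->status = ma_error;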
*/ #define MA_ERROR(err) \ ((struct maple_enode *)(((unsigned long)err << 2) | 2UL)) #define MA_STATE(name, mt, first, end) \ struct ma_state name = { \ .tree = mt, \ .index = first, \ .last = end, \ .node = NULL, \ .status = ma_start, \ .min = 0, \ .max = ULONG_MAX, \ .alloc = NULL, \ .mas_flags = 0, \ .store_type = wr_invalid, \ } #define MA_WR_STATE(name, ma_state, wr_entry) \ struct ma_wr_state name = { \ .mas = ma_state, \ .content = NULL, \ .entry = wr_entry, \ } #define MA_TOPIARY(name, tree) \ struct ma_topiary name = { \ .head = NULL, \ .tail = NULL, \ .mtree = tree, \ } void *mas_walk(struct ma_state *mas); void *mas_store(struct ma_state *mas, void *entry); void *mas_erase(struct ma_state *mas); int mas_store_gfp(struct ma_state *mas, void *entry, gfp_t gfp); void mas_store_prealloc(struct ma_state *mas, void *entry); void *mas_find(struct ma_state *mas, unsigned long max); void *mas_find_range(struct ma_state *mas, unsigned long max); void *mas_find_rev(struct ma_state *mas, unsigned long min); void *mas_find_range_rev(struct ma_state *mas, unsigned long max); int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp); int mas_alloc_cyclic(struct ma_state *mas, unsigned long *startp, void *entry, unsigned long range_lo, unsigned long range_hi, unsigned long *next, gfp_t gfp); bool mas_nomem(struct ma_state *mas, gfp_t gfp); void mas_pause(struct ma_state *mas); void maple_tree_init(void); void mas_destroy(struct ma_state *mas); int mas_expected_entries(struct ma_state *mas, unsigned long nr_entries); void *mas_prev(struct ma_state *mas, unsigned long min); void *mas_prev_range(struct ma_state *mas, unsigned long max); void *mas_next(struct ma_state *mas, unsigned long max); void *mas_next_range(struct ma_state *mas, unsigned long max); int mas_empty_area(struct ma_state *mas, unsigned long min, unsigned long max, unsigned long size); /* * This finds an empty area from the highest address to the lowest. * AKA "Topdown" version, */ int mas_empty_area_rev(struct ma_state *mas, unsigned long min, unsigned long max, unsigned long size); static inline void mas_init(struct ma_state *mas, struct maple_tree *tree, unsigned long addr) { memset(mas, 0, sizeof(struct ma_state)); mas->tree = tree; mas->index = mas->last = addr; mas->max = ULONG_MAX; mas->status = ma_start; mas->node = NULL; } static inline bool mas_is_active(struct ma_state *mas) { return mas->status == ma_active; } static inline bool mas_is_err(struct ma_state *mas) { return mas->status == ma_error; } /** * mas_reset() - Reset a Maple Tree operation state. * @mas: Maple Tree operation state. * * Resets the error or walk state of the @mas so future walks of the * array will start from the root. Use this if you have dropped the * lock and want to reuse the ma_state. * * Context: Any context. */ static __always_inline void mas_reset(struct ma_state *mas) { mas->status = ma_start; mas->node = NULL; } /** * mas_for_each() - Iterate over a range of the maple tree. * @__mas: Maple Tree operation state (maple_state) * @__entry: Entry retrieved from the tree * @__max: maximum index to retrieve from the tree * * When returned, mas->index and mas->last will hold the entire range for the * entry. * * Note: may return the zero entry. */ #define mas_for_each(__mas, __entry, __max) \ while (((__entry) = mas_find((__mas), (__max))) != NULL) /** * mas_for_each_rev() - Iterate over a range of the maple tree in reverse order. 
* @__mas: Maple Tree operation state (maple_state) * @__entry: Entry retrieved from the tree * @__min: minimum index to retrieve from the tree * * When returned, mas->index and mas->last will hold the entire range for the * entry. * * Note: may return the zero entry. */ #define mas_for_each_rev(__mas, __entry, __min) \ while (((__entry) = mas_find_rev((__mas), (__min))) != NULL) #ifdef CONFIG_DEBUG_MAPLE_TREE enum mt_dump_format { mt_dump_dec, mt_dump_hex, }; extern atomic_t maple_tree_tests_run; extern atomic_t maple_tree_tests_passed; void mt_dump(const struct maple_tree *mt, enum mt_dump_format format); void mas_dump(const struct ma_state *mas); void mas_wr_dump(const struct ma_wr_state *wr_mas); void mt_validate(struct maple_tree *mt); void mt_cache_shrink(void); #define MT_BUG_ON(__tree, __x) do { \ atomic_inc(&maple_tree_tests_run); \ if (__x) { \ pr_info("BUG at %s:%d (%u)\n", \ __func__, __LINE__, __x); \ mt_dump(__tree, mt_dump_hex); \ pr_info("Pass: %u Run:%u\n", \ atomic_read(&maple_tree_tests_passed), \ atomic_read(&maple_tree_tests_run)); \ dump_stack(); \ } else { \ atomic_inc(&maple_tree_tests_passed); \ } \ } while (0) #define MAS_BUG_ON(__mas, __x) do { \ atomic_inc(&maple_tree_tests_run); \ if (__x) { \ pr_info("BUG at %s:%d (%u)\n", \ __func__, __LINE__, __x); \ mas_dump(__mas); \ mt_dump((__mas)->tree, mt_dump_hex); \ pr_info("Pass: %u Run:%u\n", \ atomic_read(&maple_tree_tests_passed), \ atomic_read(&maple_tree_tests_run)); \ dump_stack(); \ } else { \ atomic_inc(&maple_tree_tests_passed); \ } \ } while (0) #define MAS_WR_BUG_ON(__wrmas, __x) do { \ atomic_inc(&maple_tree_tests_run); \ if (__x) { \ pr_info("BUG at %s:%d (%u)\n", \ __func__, __LINE__, __x); \ mas_wr_dump(__wrmas); \ mas_dump((__wrmas)->mas); \ mt_dump((__wrmas)->mas->tree, mt_dump_hex); \ pr_info("Pass: %u Run:%u\n", \ atomic_read(&maple_tree_tests_passed), \ atomic_read(&maple_tree_tests_run)); \ dump_stack(); \ } else { \ atomic_inc(&maple_tree_tests_passed); \ } \ } while (0) #define MT_WARN_ON(__tree, __x) ({ \ int ret = !!(__x); \ atomic_inc(&maple_tree_tests_run); \ if (ret) { \ pr_info("WARN at %s:%d (%u)\n", \ __func__, __LINE__, __x); \ mt_dump(__tree, mt_dump_hex); \ pr_info("Pass: %u Run:%u\n", \ atomic_read(&maple_tree_tests_passed), \ atomic_read(&maple_tree_tests_run)); \ dump_stack(); \ } else { \ atomic_inc(&maple_tree_tests_passed); \ } \ unlikely(ret); \ }) #define MAS_WARN_ON(__mas, __x) ({ \ int ret = !!(__x); \ atomic_inc(&maple_tree_tests_run); \ if (ret) { \ pr_info("WARN at %s:%d (%u)\n", \ __func__, __LINE__, __x); \ mas_dump(__mas); \ mt_dump((__mas)->tree, mt_dump_hex); \ pr_info("Pass: %u Run:%u\n", \ atomic_read(&maple_tree_tests_passed), \ atomic_read(&maple_tree_tests_run)); \ dump_stack(); \ } else { \ atomic_inc(&maple_tree_tests_passed); \ } \ unlikely(ret); \ }) #define MAS_WR_WARN_ON(__wrmas, __x) ({ \ int ret = !!(__x); \ atomic_inc(&maple_tree_tests_run); \ if (ret) { \ pr_info("WARN at %s:%d (%u)\n", \ __func__, __LINE__, __x); \ mas_wr_dump(__wrmas); \ mas_dump((__wrmas)->mas); \ mt_dump((__wrmas)->mas->tree, mt_dump_hex); \ pr_info("Pass: %u Run:%u\n", \ atomic_read(&maple_tree_tests_passed), \ atomic_read(&maple_tree_tests_run)); \ dump_stack(); \ } else { \ atomic_inc(&maple_tree_tests_passed); \ } \ unlikely(ret); \ }) #else #define MT_BUG_ON(__tree, __x) BUG_ON(__x) #define MAS_BUG_ON(__mas, __x) BUG_ON(__x) #define MAS_WR_BUG_ON(__mas, __x) BUG_ON(__x) #define MT_WARN_ON(__tree, __x) WARN_ON(__x) #define MAS_WARN_ON(__mas, __x) WARN_ON(__x) #define 
MAS_WR_WARN_ON(__mas, __x) WARN_ON(__x) #endif /* CONFIG_DEBUG_MAPLE_TREE */ /** * __mas_set_range() - Set up Maple Tree operation state to a sub-range of the * current location. * @mas: Maple Tree operation state. * @start: New start of range in the Maple Tree. * @last: New end of range in the Maple Tree. * * set the internal maple state values to a sub-range. * Please use mas_set_range() if you do not know where you are in the tree. */ static inline void __mas_set_range(struct ma_state *mas, unsigned long start, unsigned long last) { /* Ensure the range starts within the current slot */ MAS_WARN_ON(mas, mas_is_active(mas) && (mas->index > start || mas->last < start)); mas->index = start; mas->last = last; } /** * mas_set_range() - Set up Maple Tree operation state for a different index. * @mas: Maple Tree operation state. * @start: New start of range in the Maple Tree. * @last: New end of range in the Maple Tree. * * Move the operation state to refer to a different range. This will * have the effect of starting a walk from the top; see mas_next() * to move to an adjacent index. */ static inline void mas_set_range(struct ma_state *mas, unsigned long start, unsigned long last) { mas_reset(mas); __mas_set_range(mas, start, last); } /** * mas_set() - Set up Maple Tree operation state for a different index. * @mas: Maple Tree operation state. * @index: New index into the Maple Tree. * * Move the operation state to refer to a different index. This will * have the effect of starting a walk from the top; see mas_next() * to move to an adjacent index. */ static inline void mas_set(struct ma_state *mas, unsigned long index) { mas_set_range(mas, index, index); } static inline bool mt_external_lock(const struct maple_tree *mt) { return (mt->ma_flags & MT_FLAGS_LOCK_MASK) == MT_FLAGS_LOCK_EXTERN; } /** * mt_init_flags() - Initialise an empty maple tree with flags. * @mt: Maple Tree * @flags: maple tree flags. * * If you need to initialise a Maple Tree with special flags (eg, an * allocation tree), use this function. * * Context: Any context. */ static inline void mt_init_flags(struct maple_tree *mt, unsigned int flags) { mt->ma_flags = flags; if (!mt_external_lock(mt)) spin_lock_init(&mt->ma_lock); rcu_assign_pointer(mt->ma_root, NULL); } /** * mt_init() - Initialise an empty maple tree. * @mt: Maple Tree * * An empty Maple Tree. * * Context: Any context. */ static inline void mt_init(struct maple_tree *mt) { mt_init_flags(mt, 0); } static inline bool mt_in_rcu(struct maple_tree *mt) { #ifdef CONFIG_MAPLE_RCU_DISABLED return false; #endif return mt->ma_flags & MT_FLAGS_USE_RCU; } /** * mt_clear_in_rcu() - Switch the tree to non-RCU mode. * @mt: The Maple Tree */ static inline void mt_clear_in_rcu(struct maple_tree *mt) { if (!mt_in_rcu(mt)) return; if (mt_external_lock(mt)) { WARN_ON(!mt_lock_is_held(mt)); mt->ma_flags &= ~MT_FLAGS_USE_RCU; } else { mtree_lock(mt); mt->ma_flags &= ~MT_FLAGS_USE_RCU; mtree_unlock(mt); } } /** * mt_set_in_rcu() - Switch the tree to RCU safe mode. 
* @mt: The Maple Tree */ static inline void mt_set_in_rcu(struct maple_tree *mt) { if (mt_in_rcu(mt)) return; if (mt_external_lock(mt)) { WARN_ON(!mt_lock_is_held(mt)); mt->ma_flags |= MT_FLAGS_USE_RCU; } else { mtree_lock(mt); mt->ma_flags |= MT_FLAGS_USE_RCU; mtree_unlock(mt); } } static inline unsigned int mt_height(const struct maple_tree *mt) { return (mt->ma_flags & MT_FLAGS_HEIGHT_MASK) >> MT_FLAGS_HEIGHT_OFFSET; } void *mt_find(struct maple_tree *mt, unsigned long *index, unsigned long max); void *mt_find_after(struct maple_tree *mt, unsigned long *index, unsigned long max); void *mt_prev(struct maple_tree *mt, unsigned long index, unsigned long min); void *mt_next(struct maple_tree *mt, unsigned long index, unsigned long max); /** * mt_for_each - Iterate over each entry starting at index until max. * @__tree: The Maple Tree * @__entry: The current entry * @__index: The index to start the search from. Subsequently used as iterator. * @__max: The maximum limit for @index * * This iterator skips all entries which resolve to a NULL pointer, * e.g. entries which have been reserved with XA_ZERO_ENTRY. */ #define mt_for_each(__tree, __entry, __index, __max) \ for (__entry = mt_find(__tree, &(__index), __max); \ __entry; __entry = mt_find_after(__tree, &(__index), __max)) #endif /*_LINUX_MAPLE_TREE_H */
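/*
 * Editorial example: a hedged sketch of the simple ("mtree_") API above.
 * The names example_mt and example_data are invented; assumes process
 * context where GFP_KERNEL allocations may sleep.
 *
 *	static DEFINE_MTREE(example_mt);
 *	static int example_data;
 *
 *	static int example(void)
 *	{
 *		void *entry;
 *		unsigned long index = 0;
 *		int ret;
 *
 *		// Store one pointer over the whole index range [10, 19].
 *		ret = mtree_store_range(&example_mt, 10, 19, &example_data,
 *					GFP_KERNEL);
 *		if (ret)
 *			return ret;
 *
 *		// Any index inside the range finds the entry.
 *		entry = mtree_load(&example_mt, 15);
 *
 *		// Walk every present entry; index tracks each entry's start.
 *		mt_for_each(&example_mt, entry, index, ULONG_MAX)
 *			pr_info("entry at [%lu]\n", index);
 *
 *		mtree_destroy(&example_mt);
 *		return 0;
 *	}
 */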
/* SPDX-License-Identifier: GPL-2.0 */ #ifndef _ASM_X86_DESC_H #define _ASM_X86_DESC_H #include <asm/desc_defs.h> #include <asm/ldt.h> #include <asm/mmu.h> #include <asm/fixmap.h> #include <asm/irq_vectors.h> #include <asm/cpu_entry_area.h> #include <linux/debug_locks.h> #include <linux/smp.h> #include <linux/percpu.h> static inline void fill_ldt(struct desc_struct *desc, const struct user_desc *info) { desc->limit0 = info->limit & 0x0ffff; desc->base0 = (info->base_addr & 0x0000ffff); desc->base1 = (info->base_addr & 0x00ff0000) >> 16; desc->type = (info->read_exec_only ^ 1) << 1; desc->type |= info->contents << 2; /* Set the ACCESS bit so it can be mapped RO */ desc->type |= 1; desc->s = 1; desc->dpl = 0x3; desc->p = info->seg_not_present ^ 1; desc->limit1 = (info->limit & 0xf0000) >> 16; desc->avl = info->useable; desc->d = info->seg_32bit; desc->g = info->limit_in_pages; desc->base2 = (info->base_addr & 0xff000000) >> 24; /* * Don't allow setting of the lm bit. It would confuse * user_64bit_mode and would get overridden by sysret anyway.
*/ desc->l = 0; } struct gdt_page { struct desc_struct gdt[GDT_ENTRIES]; } __attribute__((aligned(PAGE_SIZE))); DECLARE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page); /* Provide the original GDT */ static inline struct desc_struct *get_cpu_gdt_rw(unsigned int cpu) { return per_cpu(gdt_page, cpu).gdt; } /* Provide the current original GDT */ static inline struct desc_struct *get_current_gdt_rw(void) { return this_cpu_ptr(&gdt_page)->gdt; } /* Provide the fixmap address of the remapped GDT */ static inline struct desc_struct *get_cpu_gdt_ro(int cpu) { return (struct desc_struct *)&get_cpu_entry_area(cpu)->gdt; } /* Provide the current read-only GDT */ static inline struct desc_struct *get_current_gdt_ro(void) { return get_cpu_gdt_ro(smp_processor_id()); } /* Provide the physical address of the GDT page. */ static inline phys_addr_t get_cpu_gdt_paddr(unsigned int cpu) { return per_cpu_ptr_to_phys(get_cpu_gdt_rw(cpu)); } static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long func, unsigned dpl, unsigned ist, unsigned seg) { gate->offset_low = (u16) func; gate->bits.p = 1; gate->bits.dpl = dpl; gate->bits.zero = 0; gate->bits.type = type; gate->offset_middle = (u16) (func >> 16); #ifdef CONFIG_X86_64 gate->segment = __KERNEL_CS; gate->bits.ist = ist; gate->reserved = 0; gate->offset_high = (u32) (func >> 32); #else gate->segment = seg; gate->bits.ist = 0; #endif } static inline int desc_empty(const void *ptr) { const u32 *desc = ptr; return !(desc[0] | desc[1]); } #ifdef CONFIG_PARAVIRT_XXL #include <asm/paravirt.h> #else #define load_TR_desc() native_load_tr_desc() #define load_gdt(dtr) native_load_gdt(dtr) #define load_idt(dtr) native_load_idt(dtr) #define load_tr(tr) asm volatile("ltr %0"::"m" (tr)) #define load_ldt(ldt) asm volatile("lldt %0"::"m" (ldt)) #define store_gdt(dtr) native_store_gdt(dtr) #define store_tr(tr) (tr = native_store_tr()) #define load_TLS(t, cpu) native_load_tls(t, cpu) #define set_ldt native_set_ldt #define write_ldt_entry(dt, entry, desc) native_write_ldt_entry(dt, entry, desc) #define write_gdt_entry(dt, entry, desc, type) native_write_gdt_entry(dt, entry, desc, type) #define write_idt_entry(dt, entry, g) native_write_idt_entry(dt, entry, g) static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries) { } static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries) { } #endif /* CONFIG_PARAVIRT_XXL */ #define store_ldt(ldt) asm("sldt %0" : "=m"(ldt)) static inline void native_write_idt_entry(gate_desc *idt, int entry, const gate_desc *gate) { memcpy(&idt[entry], gate, sizeof(*gate)); } static inline void native_write_ldt_entry(struct desc_struct *ldt, int entry, const void *desc) { memcpy(&ldt[entry], desc, 8); } static inline void native_write_gdt_entry(struct desc_struct *gdt, int entry, const void *desc, int type) { unsigned int size; switch (type) { case DESC_TSS: size = sizeof(tss_desc); break; case DESC_LDT: size = sizeof(ldt_desc); break; default: size = sizeof(*gdt); break; } memcpy(&gdt[entry], desc, size); } static inline void set_tssldt_descriptor(void *d, unsigned long addr, unsigned type, unsigned size) { struct ldttss_desc *desc = d; memset(desc, 0, sizeof(*desc)); desc->limit0 = (u16) size; desc->base0 = (u16) addr; desc->base1 = (addr >> 16) & 0xFF; desc->type = type; desc->p = 1; desc->limit1 = (size >> 16) & 0xF; desc->base2 = (addr >> 24) & 0xFF; #ifdef CONFIG_X86_64 desc->base3 = (u32) (addr >> 32); #endif } static inline void __set_tss_desc(unsigned cpu, unsigned int entry, struct 
x86_hw_tss *addr) { struct desc_struct *d = get_cpu_gdt_rw(cpu); tss_desc tss; set_tssldt_descriptor(&tss, (unsigned long)addr, DESC_TSS, __KERNEL_TSS_LIMIT); write_gdt_entry(d, entry, &tss, DESC_TSS); } #define set_tss_desc(cpu, addr) __set_tss_desc(cpu, GDT_ENTRY_TSS, addr) static inline void native_set_ldt(const void *addr, unsigned int entries) { if (likely(entries == 0)) asm volatile("lldt %w0"::"q" (0)); else { unsigned cpu = smp_processor_id(); ldt_desc ldt; set_tssldt_descriptor(&ldt, (unsigned long)addr, DESC_LDT, entries * LDT_ENTRY_SIZE - 1); write_gdt_entry(get_cpu_gdt_rw(cpu), GDT_ENTRY_LDT, &ldt, DESC_LDT); asm volatile("lldt %w0"::"q" (GDT_ENTRY_LDT*8)); } } static inline void native_load_gdt(const struct desc_ptr *dtr) { asm volatile("lgdt %0"::"m" (*dtr)); } static __always_inline void native_load_idt(const struct desc_ptr *dtr) { asm volatile("lidt %0"::"m" (*dtr)); } static inline void native_store_gdt(struct desc_ptr *dtr) { asm volatile("sgdt %0":"=m" (*dtr)); } static inline void store_idt(struct desc_ptr *dtr) { asm volatile("sidt %0":"=m" (*dtr)); } static inline void native_gdt_invalidate(void) { const struct desc_ptr invalid_gdt = { .address = 0, .size = 0 }; native_load_gdt(&invalid_gdt); } static inline void native_idt_invalidate(void) { const struct desc_ptr invalid_idt = { .address = 0, .size = 0 }; native_load_idt(&invalid_idt); } /* * The LTR instruction marks the TSS GDT entry as busy. On 64-bit, the GDT is * a read-only remapping. To prevent a page fault, the GDT is switched to the * original writeable version when needed. */ #ifdef CONFIG_X86_64 static inline void native_load_tr_desc(void) { struct desc_ptr gdt; int cpu = raw_smp_processor_id(); bool restore = 0; struct desc_struct *fixmap_gdt; native_store_gdt(&gdt); fixmap_gdt = get_cpu_gdt_ro(cpu); /* * If the current GDT is the read-only fixmap, swap to the original * writeable version. Swap back at the end. */ if (gdt.address == (unsigned long)fixmap_gdt) { load_direct_gdt(cpu); restore = 1; } asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8)); if (restore) load_fixmap_gdt(cpu); } #else static inline void native_load_tr_desc(void) { asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8)); } #endif static inline unsigned long native_store_tr(void) { unsigned long tr; asm volatile("str %0":"=r" (tr)); return tr; } static inline void native_load_tls(struct thread_struct *t, unsigned int cpu) { struct desc_struct *gdt = get_cpu_gdt_rw(cpu); unsigned int i; for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++) gdt[GDT_ENTRY_TLS_MIN + i] = t->tls_array[i]; } DECLARE_PER_CPU(bool, __tss_limit_invalid); static inline void force_reload_TR(void) { struct desc_struct *d = get_current_gdt_rw(); tss_desc tss; memcpy(&tss, &d[GDT_ENTRY_TSS], sizeof(tss_desc)); /* * LTR requires an available TSS, and the TSS is currently * busy. Make it be available so that LTR will work. */ tss.type = DESC_TSS; write_gdt_entry(d, GDT_ENTRY_TSS, &tss, DESC_TSS); load_TR_desc(); this_cpu_write(__tss_limit_invalid, false); } /* * Call this if you need the TSS limit to be correct, which should be the case * if and only if you have TIF_IO_BITMAP set or you're switching to a task * with TIF_IO_BITMAP set. */ static inline void refresh_tss_limit(void) { DEBUG_LOCKS_WARN_ON(preemptible()); if (unlikely(this_cpu_read(__tss_limit_invalid))) force_reload_TR(); } /* * If you do something evil that corrupts the cached TSS limit (I'm looking * at you, VMX exits), call this function. 
* * The optimization here is that the TSS limit only matters for Linux if the * IO bitmap is in use. If the TSS limit gets forced to its minimum value, * everything works except that IO bitmap will be ignored and all CPL 3 IO * instructions will #GP, which is exactly what we want for normal tasks. */ static inline void invalidate_tss_limit(void) { DEBUG_LOCKS_WARN_ON(preemptible()); if (unlikely(test_thread_flag(TIF_IO_BITMAP))) force_reload_TR(); else this_cpu_write(__tss_limit_invalid, true); } /* This intentionally ignores lm, since 32-bit apps don't have that field. */ #define LDT_empty(info) \ ((info)->base_addr == 0 && \ (info)->limit == 0 && \ (info)->contents == 0 && \ (info)->read_exec_only == 1 && \ (info)->seg_32bit == 0 && \ (info)->limit_in_pages == 0 && \ (info)->seg_not_present == 1 && \ (info)->useable == 0) /* Lots of programs expect an all-zero user_desc to mean "no segment at all". */ static inline bool LDT_zero(const struct user_desc *info) { return (info->base_addr == 0 && info->limit == 0 && info->contents == 0 && info->read_exec_only == 0 && info->seg_32bit == 0 && info->limit_in_pages == 0 && info->seg_not_present == 0 && info->useable == 0); } static inline void clear_LDT(void) { set_ldt(NULL, 0); } static inline unsigned long get_desc_base(const struct desc_struct *desc) { return (unsigned)(desc->base0 | ((desc->base1) << 16) | ((desc->base2) << 24)); } static inline void set_desc_base(struct desc_struct *desc, unsigned long base) { desc->base0 = base & 0xffff; desc->base1 = (base >> 16) & 0xff; desc->base2 = (base >> 24) & 0xff; } static inline unsigned long get_desc_limit(const struct desc_struct *desc) { return desc->limit0 | (desc->limit1 << 16); } static inline void set_desc_limit(struct desc_struct *desc, unsigned long limit) { desc->limit0 = limit & 0xffff; desc->limit1 = (limit >> 16) & 0xf; } static inline void init_idt_data(struct idt_data *data, unsigned int n, const void *addr) { BUG_ON(n > 0xFF); memset(data, 0, sizeof(*data)); data->vector = n; data->addr = addr; data->segment = __KERNEL_CS; data->bits.type = GATE_INTERRUPT; data->bits.p = 1; } static inline void idt_init_desc(gate_desc *gate, const struct idt_data *d) { unsigned long addr = (unsigned long) d->addr; gate->offset_low = (u16) addr; gate->segment = (u16) d->segment; gate->bits = d->bits; gate->offset_middle = (u16) (addr >> 16); #ifdef CONFIG_X86_64 gate->offset_high = (u32) (addr >> 32); gate->reserved = 0; #endif } extern unsigned long system_vectors[]; extern void load_current_idt(void); extern void idt_setup_early_handler(void); extern void idt_setup_early_traps(void); extern void idt_setup_traps(void); extern void idt_setup_apic_and_irq_gates(void); extern bool idt_is_f00f_address(unsigned long address); #ifdef CONFIG_X86_64 extern void idt_setup_early_pf(void); #else static inline void idt_setup_early_pf(void) { } #endif extern void idt_invalidate(void); #endif /* _ASM_X86_DESC_H */
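/*
 * Editorial example: a hedged sketch of building one interrupt gate with
 * init_idt_data()/idt_init_desc() above. The vector (0x40) and the
 * example_handler symbol are invented for illustration, and idt_table
 * stands in for the system IDT array; real gates are installed by the
 * idt_setup_*() code declared above.
 *
 *	struct idt_data data;
 *	gate_desc gate;
 *
 *	init_idt_data(&data, 0x40, example_handler);
 *	idt_init_desc(&gate, &data);
 *	write_idt_entry(idt_table, 0x40, &gate);
 */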
// SPDX-License-Identifier: GPL-2.0-or-later /* linux/net/ipv4/arp.c * * Copyright (C) 1994 by Florian La Roche * * This module implements
the Address Resolution Protocol ARP (RFC 826), * which is used to convert IP addresses (or in the future maybe other * high-level addresses) into a low-level hardware address (like an Ethernet * address). * * Fixes: * Alan Cox : Removed the Ethernet assumptions in * Florian's code * Alan Cox : Fixed some small errors in the ARP * logic * Alan Cox : Allow >4K in /proc * Alan Cox : Make ARP add its own protocol entry * Ross Martin : Rewrote arp_rcv() and arp_get_info() * Stephen Henson : Add AX25 support to arp_get_info() * Alan Cox : Drop data when a device is downed. * Alan Cox : Use init_timer(). * Alan Cox : Double lock fixes. * Martin Seine : Move the arphdr structure * to if_arp.h for compatibility * with BSD based programs. * Andrew Tridgell : Added ARP netmask code and * re-arranged proxy handling. * Alan Cox : Changed to use notifiers. * Niibe Yutaka : Reply for this device or proxies only. * Alan Cox : Don't proxy across hardware types! * Jonathan Naylor : Added support for NET/ROM. * Mike Shaver : RFC1122 checks. * Jonathan Naylor : Only look up the hardware address for * the correct hardware type. * Germano Caronni : Assorted subtle races. * Craig Schlenter : Don't modify permanent entry * during arp_rcv. * Russ Nelson : Tidied up a few bits. * Alexey Kuznetsov: Major changes to caching and behaviour, * e.g. intelligent arp probing and * generation * of host down events. * Alan Cox : Missing unlock in device events. * Eckes : ARP ioctl control errors. * Alexey Kuznetsov: Arp free fix. * Manuel Rodriguez: Gratuitous ARP. * Jonathan Layes : Added arpd support through kerneld * message queue (960314) * Mike Shaver : /proc/sys/net/ipv4/arp_* support * Mike McLagan : Routing by source * Stuart Cheshire : Metricom and grat arp fixes * *** FOR 2.1 clean this up *** * Lawrence V. Stefani: (08/12/96) Added FDDI support. * Alan Cox : Took the AP1000 nasty FDDI hack and * folded into the mainstream FDDI code. * Ack spit, Linus how did you allow that * one in... * Jes Sorensen : Make FDDI work again in 2.1.x and * clean up the APFDDI & gen. FDDI bits. * Alexey Kuznetsov: new arp state machine; * now it is in net/core/neighbour.c. * Krzysztof Halasa: Added Frame Relay ARP support. * Arnaldo C. Melo : convert /proc/net/arp to seq_file * Shmulik Hen: Split arp_send to arp_create and * arp_xmit so intermediate drivers like * bonding can change the skb before * sending (e.g. insert 8021q tag). * Harald Welte : convert to make use of jenkins hash * Jesper D. Brouer: Proxy ARP PVLAN RFC 3069 support.
*/ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/module.h> #include <linux/types.h> #include <linux/string.h> #include <linux/kernel.h> #include <linux/capability.h> #include <linux/socket.h> #include <linux/sockios.h> #include <linux/errno.h> #include <linux/in.h> #include <linux/mm.h> #include <linux/inet.h> #include <linux/inetdevice.h> #include <linux/netdevice.h> #include <linux/etherdevice.h> #include <linux/fddidevice.h> #include <linux/if_arp.h> #include <linux/skbuff.h> #include <linux/proc_fs.h> #include <linux/seq_file.h> #include <linux/stat.h> #include <linux/init.h> #include <linux/net.h> #include <linux/rcupdate.h> #include <linux/slab.h> #ifdef CONFIG_SYSCTL #include <linux/sysctl.h> #endif #include <net/net_namespace.h> #include <net/ip.h> #include <net/icmp.h> #include <net/route.h> #include <net/protocol.h> #include <net/tcp.h> #include <net/sock.h> #include <net/arp.h> #include <net/ax25.h> #include <net/netrom.h> #include <net/dst_metadata.h> #include <net/ip_tunnels.h> #include <linux/uaccess.h> #include <linux/netfilter_arp.h> /* * Interface to generic neighbour cache. */ static u32 arp_hash(const void *pkey, const struct net_device *dev, __u32 *hash_rnd); static bool arp_key_eq(const struct neighbour *n, const void *pkey); static int arp_constructor(struct neighbour *neigh); static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb); static void arp_error_report(struct neighbour *neigh, struct sk_buff *skb); static void parp_redo(struct sk_buff *skb); static int arp_is_multicast(const void *pkey); static const struct neigh_ops arp_generic_ops = { .family = AF_INET, .solicit = arp_solicit, .error_report = arp_error_report, .output = neigh_resolve_output, .connected_output = neigh_connected_output, }; static const struct neigh_ops arp_hh_ops = { .family = AF_INET, .solicit = arp_solicit, .error_report = arp_error_report, .output = neigh_resolve_output, .connected_output = neigh_resolve_output, }; static const struct neigh_ops arp_direct_ops = { .family = AF_INET, .output = neigh_direct_output, .connected_output = neigh_direct_output, }; struct neigh_table arp_tbl = { .family = AF_INET, .key_len = 4, .protocol = cpu_to_be16(ETH_P_IP), .hash = arp_hash, .key_eq = arp_key_eq, .constructor = arp_constructor, .proxy_redo = parp_redo, .is_multicast = arp_is_multicast, .id = "arp_cache", .parms = { .tbl = &arp_tbl, .reachable_time = 30 * HZ, .data = { [NEIGH_VAR_MCAST_PROBES] = 3, [NEIGH_VAR_UCAST_PROBES] = 3, [NEIGH_VAR_RETRANS_TIME] = 1 * HZ, [NEIGH_VAR_BASE_REACHABLE_TIME] = 30 * HZ, [NEIGH_VAR_DELAY_PROBE_TIME] = 5 * HZ, [NEIGH_VAR_INTERVAL_PROBE_TIME_MS] = 5 * HZ, [NEIGH_VAR_GC_STALETIME] = 60 * HZ, [NEIGH_VAR_QUEUE_LEN_BYTES] = SK_WMEM_MAX, [NEIGH_VAR_PROXY_QLEN] = 64, [NEIGH_VAR_ANYCAST_DELAY] = 1 * HZ, [NEIGH_VAR_PROXY_DELAY] = (8 * HZ) / 10, [NEIGH_VAR_LOCKTIME] = 1 * HZ, }, }, .gc_interval = 30 * HZ, .gc_thresh1 = 128, .gc_thresh2 = 512, .gc_thresh3 = 1024, }; EXPORT_SYMBOL(arp_tbl); int arp_mc_map(__be32 addr, u8 *haddr, struct net_device *dev, int dir) { switch (dev->type) { case ARPHRD_ETHER: case ARPHRD_FDDI: case ARPHRD_IEEE802: ip_eth_mc_map(addr, haddr); return 0; case ARPHRD_INFINIBAND: ip_ib_mc_map(addr, dev->broadcast, haddr); return 0; case ARPHRD_IPGRE: ip_ipgre_mc_map(addr, dev->broadcast, haddr); return 0; default: if (dir) { memcpy(haddr, dev->broadcast, dev->addr_len); return 0; } } return -EINVAL; } static u32 arp_hash(const void *pkey, const struct net_device *dev, __u32 *hash_rnd) { return arp_hashfn(pkey, dev, hash_rnd); 
} static bool arp_key_eq(const struct neighbour *neigh, const void *pkey) { return neigh_key_eq32(neigh, pkey); } static int arp_constructor(struct neighbour *neigh) { __be32 addr; struct net_device *dev = neigh->dev; struct in_device *in_dev; struct neigh_parms *parms; u32 inaddr_any = INADDR_ANY; if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT)) memcpy(neigh->primary_key, &inaddr_any, arp_tbl.key_len); addr = *(__be32 *)neigh->primary_key; rcu_read_lock(); in_dev = __in_dev_get_rcu(dev); if (!in_dev) { rcu_read_unlock(); return -EINVAL; } neigh->type = inet_addr_type_dev_table(dev_net(dev), dev, addr); parms = in_dev->arp_parms; __neigh_parms_put(neigh->parms); neigh->parms = neigh_parms_clone(parms); rcu_read_unlock(); if (!dev->header_ops) { neigh->nud_state = NUD_NOARP; neigh->ops = &arp_direct_ops; neigh->output = neigh_direct_output; } else { /* Good devices (verified by reading their documentation, but only Ethernet is tested) ARPHRD_ETHER: (ethernet, apfddi) ARPHRD_FDDI: (fddi) ARPHRD_IEEE802: (tr) ARPHRD_METRICOM: (strip) ARPHRD_ARCNET: etc. etc. etc. ARPHRD_IPDDP will also work, if its author repairs it. I did not, because this driver does not work even in the old paradigm. */ if (neigh->type == RTN_MULTICAST) { neigh->nud_state = NUD_NOARP; arp_mc_map(addr, neigh->ha, dev, 1); } else if (dev->flags & (IFF_NOARP | IFF_LOOPBACK)) { neigh->nud_state = NUD_NOARP; memcpy(neigh->ha, dev->dev_addr, dev->addr_len); } else if (neigh->type == RTN_BROADCAST || (dev->flags & IFF_POINTOPOINT)) { neigh->nud_state = NUD_NOARP; memcpy(neigh->ha, dev->broadcast, dev->addr_len); } if (dev->header_ops->cache) neigh->ops = &arp_hh_ops; else neigh->ops = &arp_generic_ops; if (neigh->nud_state & NUD_VALID) neigh->output = neigh->ops->connected_output; else neigh->output = neigh->ops->output; } return 0; } static void arp_error_report(struct neighbour *neigh, struct sk_buff *skb) { dst_link_failure(skb); kfree_skb_reason(skb, SKB_DROP_REASON_NEIGH_FAILED); } /* Create and send an arp packet. */ static void arp_send_dst(int type, int ptype, __be32 dest_ip, struct net_device *dev, __be32 src_ip, const unsigned char *dest_hw, const unsigned char *src_hw, const unsigned char *target_hw, struct dst_entry *dst) { struct sk_buff *skb; /* No arp on this interface.
*/ if (dev->flags & IFF_NOARP) return; skb = arp_create(type, ptype, dest_ip, dev, src_ip, dest_hw, src_hw, target_hw); if (!skb) return; skb_dst_set(skb, dst_clone(dst)); arp_xmit(skb); } void arp_send(int type, int ptype, __be32 dest_ip, struct net_device *dev, __be32 src_ip, const unsigned char *dest_hw, const unsigned char *src_hw, const unsigned char *target_hw) { arp_send_dst(type, ptype, dest_ip, dev, src_ip, dest_hw, src_hw, target_hw, NULL); } EXPORT_SYMBOL(arp_send); static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb) { __be32 saddr = 0; u8 dst_ha[MAX_ADDR_LEN], *dst_hw = NULL; struct net_device *dev = neigh->dev; __be32 target = *(__be32 *)neigh->primary_key; int probes = atomic_read(&neigh->probes); struct in_device *in_dev; struct dst_entry *dst = NULL; rcu_read_lock(); in_dev = __in_dev_get_rcu(dev); if (!in_dev) { rcu_read_unlock(); return; } switch (IN_DEV_ARP_ANNOUNCE(in_dev)) { default: case 0: /* By default announce any local IP */ if (skb && inet_addr_type_dev_table(dev_net(dev), dev, ip_hdr(skb)->saddr) == RTN_LOCAL) saddr = ip_hdr(skb)->saddr; break; case 1: /* Restrict announcements of saddr in same subnet */ if (!skb) break; saddr = ip_hdr(skb)->saddr; if (inet_addr_type_dev_table(dev_net(dev), dev, saddr) == RTN_LOCAL) { /* saddr should be known to target */ if (inet_addr_onlink(in_dev, target, saddr)) break; } saddr = 0; break; case 2: /* Avoid secondary IPs, get a primary/preferred one */ break; } rcu_read_unlock(); if (!saddr) saddr = inet_select_addr(dev, target, RT_SCOPE_LINK); probes -= NEIGH_VAR(neigh->parms, UCAST_PROBES); if (probes < 0) { if (!(READ_ONCE(neigh->nud_state) & NUD_VALID)) pr_debug("trying to ucast probe in NUD_INVALID\n"); neigh_ha_snapshot(dst_ha, neigh, dev); dst_hw = dst_ha; } else { probes -= NEIGH_VAR(neigh->parms, APP_PROBES); if (probes < 0) { neigh_app_ns(neigh); return; } } if (skb && !(dev->priv_flags & IFF_XMIT_DST_RELEASE)) dst = skb_dst(skb); arp_send_dst(ARPOP_REQUEST, ETH_P_ARP, target, dev, saddr, dst_hw, dev->dev_addr, NULL, dst); } static int arp_ignore(struct in_device *in_dev, __be32 sip, __be32 tip) { struct net *net = dev_net(in_dev->dev); int scope; switch (IN_DEV_ARP_IGNORE(in_dev)) { case 0: /* Reply, the tip is already validated */ return 0; case 1: /* Reply only if tip is configured on the incoming interface */ sip = 0; scope = RT_SCOPE_HOST; break; case 2: /* * Reply only if tip is configured on the incoming interface * and is in same subnet as sip */ scope = RT_SCOPE_HOST; break; case 3: /* Do not reply for scope host addresses */ sip = 0; scope = RT_SCOPE_LINK; in_dev = NULL; break; case 4: /* Reserved */ case 5: case 6: case 7: return 0; case 8: /* Do not reply */ return 1; default: return 0; } return !inet_confirm_addr(net, in_dev, sip, tip, scope); } static int arp_accept(struct in_device *in_dev, __be32 sip) { struct net *net = dev_net(in_dev->dev); int scope = RT_SCOPE_LINK; switch (IN_DEV_ARP_ACCEPT(in_dev)) { case 0: /* Don't create new entries from garp */ return 0; case 1: /* Create new entries from garp */ return 1; case 2: /* Create a neighbor in the arp table only if sip * is in the same subnet as an address configured * on the interface that received the garp message */ return !!inet_confirm_addr(net, in_dev, sip, 0, scope); default: return 0; } } static int arp_filter(__be32 sip, __be32 tip, struct net_device *dev) { struct rtable *rt; int flag = 0; /*unsigned long now; */ struct net *net = dev_net(dev); rt = ip_route_output(net, sip, tip, 0, l3mdev_master_ifindex_rcu(dev), 
RT_SCOPE_UNIVERSE); if (IS_ERR(rt)) return 1; if (rt->dst.dev != dev) { __NET_INC_STATS(net, LINUX_MIB_ARPFILTER); flag = 1; } ip_rt_put(rt); return flag; } /* * Check if we can use proxy ARP for this path */ static inline int arp_fwd_proxy(struct in_device *in_dev, struct net_device *dev, struct rtable *rt) { struct in_device *out_dev; int imi, omi = -1; if (rt->dst.dev == dev) return 0; if (!IN_DEV_PROXY_ARP(in_dev)) return 0; imi = IN_DEV_MEDIUM_ID(in_dev); if (imi == 0) return 1; if (imi == -1) return 0; /* place to check for proxy_arp for routes */ out_dev = __in_dev_get_rcu(rt->dst.dev); if (out_dev) omi = IN_DEV_MEDIUM_ID(out_dev); return omi != imi && omi != -1; } /* * Check for RFC3069 proxy arp private VLAN (allows replies back out the same dev) * * RFC3069 supports proxy arp replies back to the same interface. This * is done to support (ethernet) switch features where the individual * ports are not allowed to communicate with each * other, BUT they are allowed to talk to the upstream router. As * described in RFC 3069, it is possible to allow these hosts to * communicate through the upstream router by proxy_arp'ing. * * RFC 3069: "VLAN Aggregation for Efficient IP Address Allocation" * * This technology is known by different names: * In RFC 3069 it is called VLAN Aggregation. * Cisco and Allied Telesyn call it Private VLAN. * Hewlett-Packard call it Source-Port filtering or port-isolation. * Ericsson call it MAC-Forced Forwarding (RFC Draft). * */ static inline int arp_fwd_pvlan(struct in_device *in_dev, struct net_device *dev, struct rtable *rt, __be32 sip, __be32 tip) { /* Private VLAN is only concerned about the same ethernet segment */ if (rt->dst.dev != dev) return 0; /* Don't reply to self probes (often sent by Windows boxes) */ if (sip == tip) return 0; if (IN_DEV_PROXY_ARP_PVLAN(in_dev)) return 1; else return 0; } /* * Interface to link layer: send routine and receive handler. */ /* * Create an arp packet. If dest_hw is not set, we create a broadcast * message. */ struct sk_buff *arp_create(int type, int ptype, __be32 dest_ip, struct net_device *dev, __be32 src_ip, const unsigned char *dest_hw, const unsigned char *src_hw, const unsigned char *target_hw) { struct sk_buff *skb; struct arphdr *arp; unsigned char *arp_ptr; int hlen = LL_RESERVED_SPACE(dev); int tlen = dev->needed_tailroom; /* * Allocate a buffer */ skb = alloc_skb(arp_hdr_len(dev) + hlen + tlen, GFP_ATOMIC); if (!skb) return NULL; skb_reserve(skb, hlen); skb_reset_network_header(skb); arp = skb_put(skb, arp_hdr_len(dev)); skb->dev = dev; skb->protocol = htons(ETH_P_ARP); if (!src_hw) src_hw = dev->dev_addr; if (!dest_hw) dest_hw = dev->broadcast; /* * Fill the device header for the ARP frame */ if (dev_hard_header(skb, dev, ptype, dest_hw, src_hw, skb->len) < 0) goto out; /* * Fill out the arp protocol part. * * The arp hardware type should match the device type, except for FDDI, * which (according to RFC 1390) should always equal 1 (Ethernet). */ /* * Exceptions everywhere. AX.25 uses the AX.25 PID value not the * DIX code for the protocol. Make these device structure fields.
*/ switch (dev->type) { default: arp->ar_hrd = htons(dev->type); arp->ar_pro = htons(ETH_P_IP); break; #if IS_ENABLED(CONFIG_AX25) case ARPHRD_AX25: arp->ar_hrd = htons(ARPHRD_AX25); arp->ar_pro = htons(AX25_P_IP); break; #if IS_ENABLED(CONFIG_NETROM) case ARPHRD_NETROM: arp->ar_hrd = htons(ARPHRD_NETROM); arp->ar_pro = htons(AX25_P_IP); break; #endif #endif #if IS_ENABLED(CONFIG_FDDI) case ARPHRD_FDDI: arp->ar_hrd = htons(ARPHRD_ETHER); arp->ar_pro = htons(ETH_P_IP); break; #endif } arp->ar_hln = dev->addr_len; arp->ar_pln = 4; arp->ar_op = htons(type); arp_ptr = (unsigned char *)(arp + 1); memcpy(arp_ptr, src_hw, dev->addr_len); arp_ptr += dev->addr_len; memcpy(arp_ptr, &src_ip, 4); arp_ptr += 4; switch (dev->type) { #if IS_ENABLED(CONFIG_FIREWIRE_NET) case ARPHRD_IEEE1394: break; #endif default: if (target_hw) memcpy(arp_ptr, target_hw, dev->addr_len); else memset(arp_ptr, 0, dev->addr_len); arp_ptr += dev->addr_len; } memcpy(arp_ptr, &dest_ip, 4); return skb; out: kfree_skb(skb); return NULL; } EXPORT_SYMBOL(arp_create); static int arp_xmit_finish(struct net *net, struct sock *sk, struct sk_buff *skb) { return dev_queue_xmit(skb); } /* * Send an arp packet. */ void arp_xmit(struct sk_buff *skb) { rcu_read_lock(); /* Send it off, maybe filter it using firewalling first. */ NF_HOOK(NFPROTO_ARP, NF_ARP_OUT, dev_net_rcu(skb->dev), NULL, skb, NULL, skb->dev, arp_xmit_finish); rcu_read_unlock(); } EXPORT_SYMBOL(arp_xmit); static bool arp_is_garp(struct net *net, struct net_device *dev, int *addr_type, __be16 ar_op, __be32 sip, __be32 tip, unsigned char *sha, unsigned char *tha) { bool is_garp = tip == sip; /* Gratuitous ARP _replies_ also require target hwaddr to be * the same as source. */ if (is_garp && ar_op == htons(ARPOP_REPLY)) is_garp = /* IPv4 over IEEE 1394 doesn't provide target * hardware address field in its ARP payload. */ tha && !memcmp(tha, sha, dev->addr_len); if (is_garp) { *addr_type = inet_addr_type_dev_table(net, dev, sip); if (*addr_type != RTN_UNICAST) is_garp = false; } return is_garp; } /* * Process an arp request. */ static int arp_process(struct net *net, struct sock *sk, struct sk_buff *skb) { struct net_device *dev = skb->dev; struct in_device *in_dev = __in_dev_get_rcu(dev); struct arphdr *arp; unsigned char *arp_ptr; struct rtable *rt; unsigned char *sha; unsigned char *tha = NULL; __be32 sip, tip; u16 dev_type = dev->type; int addr_type; struct neighbour *n; struct dst_entry *reply_dst = NULL; bool is_garp = false; /* arp_rcv below verifies the ARP header and verifies the device * is ARP'able. */ if (!in_dev) goto out_free_skb; arp = arp_hdr(skb); switch (dev_type) { default: if (arp->ar_pro != htons(ETH_P_IP) || htons(dev_type) != arp->ar_hrd) goto out_free_skb; break; case ARPHRD_ETHER: case ARPHRD_FDDI: case ARPHRD_IEEE802: /* * ETHERNET, and Fibre Channel (which are IEEE 802 * devices, according to RFC 2625) devices will accept ARP * hardware types of either 1 (Ethernet) or 6 (IEEE 802.2). 
* This is also the case for FDDI: RFC 1390 says that * FDDI devices should accept an ARP hardware type of 1 (Ethernet); * however, to be more robust, we'll accept either 1 (Ethernet) * or 6 (IEEE 802.2). */ if ((arp->ar_hrd != htons(ARPHRD_ETHER) && arp->ar_hrd != htons(ARPHRD_IEEE802)) || arp->ar_pro != htons(ETH_P_IP)) goto out_free_skb; break; case ARPHRD_AX25: if (arp->ar_pro != htons(AX25_P_IP) || arp->ar_hrd != htons(ARPHRD_AX25)) goto out_free_skb; break; case ARPHRD_NETROM: if (arp->ar_pro != htons(AX25_P_IP) || arp->ar_hrd != htons(ARPHRD_NETROM)) goto out_free_skb; break; } /* Understand only these message types */ if (arp->ar_op != htons(ARPOP_REPLY) && arp->ar_op != htons(ARPOP_REQUEST)) goto out_free_skb; /* * Extract fields */ arp_ptr = (unsigned char *)(arp + 1); sha = arp_ptr; arp_ptr += dev->addr_len; memcpy(&sip, arp_ptr, 4); arp_ptr += 4; switch (dev_type) { #if IS_ENABLED(CONFIG_FIREWIRE_NET) case ARPHRD_IEEE1394: break; #endif default: tha = arp_ptr; arp_ptr += dev->addr_len; } memcpy(&tip, arp_ptr, 4); /* * Check for bad requests for 127.x.x.x and requests for multicast * addresses. If this is one such, drop it. */ if (ipv4_is_multicast(tip) || (!IN_DEV_ROUTE_LOCALNET(in_dev) && ipv4_is_loopback(tip))) goto out_free_skb; /* * For some 802.11 wireless deployments (and possibly other networks), * there will be an ARP proxy, and gratuitous ARP frames are attacks * and thus should not be accepted. */ if (sip == tip && IN_DEV_ORCONF(in_dev, DROP_GRATUITOUS_ARP)) goto out_free_skb; /* * Special case: We must set Frame Relay source Q.922 address */ if (dev_type == ARPHRD_DLCI) sha = dev->broadcast; /* * Process entry. The idea here is we want to send a reply if it is a * request for us or if it is a request for someone else that we hold * a proxy for. We want to add an entry to our cache if it is a reply * to us or if it is a request for our address. * (The assumption for this last is that if someone is requesting our * address, they are probably intending to talk to us, so it saves time * if we cache their address. Their address is also probably not in * our cache, since ours is not in their cache.) * * Putting this another way, we only care about replies if they are to * us, in which case we add them to the cache. For requests, we care * about those for us and those for our proxies. We reply to both, * and in the case of requests for us we add the requester to the arp * cache.
*/ if (arp->ar_op == htons(ARPOP_REQUEST) && skb_metadata_dst(skb)) reply_dst = (struct dst_entry *) iptunnel_metadata_reply(skb_metadata_dst(skb), GFP_ATOMIC); /* Special case: IPv4 duplicate address detection packet (RFC2131) */ if (sip == 0) { if (arp->ar_op == htons(ARPOP_REQUEST) && inet_addr_type_dev_table(net, dev, tip) == RTN_LOCAL && !arp_ignore(in_dev, sip, tip)) arp_send_dst(ARPOP_REPLY, ETH_P_ARP, sip, dev, tip, sha, dev->dev_addr, sha, reply_dst); goto out_consume_skb; } if (arp->ar_op == htons(ARPOP_REQUEST) && ip_route_input_noref(skb, tip, sip, 0, dev) == 0) { rt = skb_rtable(skb); addr_type = rt->rt_type; if (addr_type == RTN_LOCAL) { int dont_send; dont_send = arp_ignore(in_dev, sip, tip); if (!dont_send && IN_DEV_ARPFILTER(in_dev)) dont_send = arp_filter(sip, tip, dev); if (!dont_send) { n = neigh_event_ns(&arp_tbl, sha, &sip, dev); if (n) { arp_send_dst(ARPOP_REPLY, ETH_P_ARP, sip, dev, tip, sha, dev->dev_addr, sha, reply_dst); neigh_release(n); } } goto out_consume_skb; } else if (IN_DEV_FORWARD(in_dev)) { if (addr_type == RTN_UNICAST && (arp_fwd_proxy(in_dev, dev, rt) || arp_fwd_pvlan(in_dev, dev, rt, sip, tip) || (rt->dst.dev != dev && pneigh_lookup(&arp_tbl, net, &tip, dev, 0)))) { n = neigh_event_ns(&arp_tbl, sha, &sip, dev); if (n) neigh_release(n); if (NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED || skb->pkt_type == PACKET_HOST || NEIGH_VAR(in_dev->arp_parms, PROXY_DELAY) == 0) { arp_send_dst(ARPOP_REPLY, ETH_P_ARP, sip, dev, tip, sha, dev->dev_addr, sha, reply_dst); } else { pneigh_enqueue(&arp_tbl, in_dev->arp_parms, skb); goto out_free_dst; } goto out_consume_skb; } } } /* Update our ARP tables */ n = __neigh_lookup(&arp_tbl, &sip, dev, 0); addr_type = -1; if (n || arp_accept(in_dev, sip)) { is_garp = arp_is_garp(net, dev, &addr_type, arp->ar_op, sip, tip, sha, tha); } if (arp_accept(in_dev, sip)) { /* Unsolicited ARP is not accepted by default. It is possible that this option should be enabled for some devices (strip is a candidate) */ if (!n && (is_garp || (arp->ar_op == htons(ARPOP_REPLY) && (addr_type == RTN_UNICAST || (addr_type < 0 && /* postpone calculation to as late as possible */ inet_addr_type_dev_table(net, dev, sip) == RTN_UNICAST))))) n = __neigh_lookup(&arp_tbl, &sip, dev, 1); } if (n) { int state = NUD_REACHABLE; int override; /* If several different ARP replies follow back-to-back, use the FIRST one. This is possible if several proxy agents are active. Taking the first reply prevents ARP thrashing and chooses the fastest router. */ override = time_after(jiffies, n->updated + NEIGH_VAR(n->parms, LOCKTIME)) || is_garp; /* Broadcast replies and request packets do not assert neighbour reachability. */ if (arp->ar_op != htons(ARPOP_REPLY) || skb->pkt_type != PACKET_HOST) state = NUD_STALE; neigh_update(n, sha, state, override ? NEIGH_UPDATE_F_OVERRIDE : 0, 0); neigh_release(n); } out_consume_skb: consume_skb(skb); out_free_dst: dst_release(reply_dst); return NET_RX_SUCCESS; out_free_skb: kfree_skb(skb); return NET_RX_DROP; } static void parp_redo(struct sk_buff *skb) { arp_process(dev_net(skb->dev), NULL, skb); } static int arp_is_multicast(const void *pkey) { return ipv4_is_multicast(*((__be32 *)pkey)); } /* * Receive an arp request from the device layer.
*/ static int arp_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev) { const struct arphdr *arp; /* do not tweak dropwatch on an ARP we will ignore */ if (dev->flags & IFF_NOARP || skb->pkt_type == PACKET_OTHERHOST || skb->pkt_type == PACKET_LOOPBACK) goto consumeskb; skb = skb_share_check(skb, GFP_ATOMIC); if (!skb) goto out_of_mem; /* ARP header, plus 2 device addresses, plus 2 IP addresses. */ if (!pskb_may_pull(skb, arp_hdr_len(dev))) goto freeskb; arp = arp_hdr(skb); if (arp->ar_hln != dev->addr_len || arp->ar_pln != 4) goto freeskb; memset(NEIGH_CB(skb), 0, sizeof(struct neighbour_cb)); return NF_HOOK(NFPROTO_ARP, NF_ARP_IN, dev_net(dev), NULL, skb, dev, NULL, arp_process); consumeskb: consume_skb(skb); return NET_RX_SUCCESS; freeskb: kfree_skb(skb); out_of_mem: return NET_RX_DROP; } /* * User level interface (ioctl) */ static struct net_device *arp_req_dev_by_name(struct net *net, struct arpreq *r, bool getarp) { struct net_device *dev; if (getarp) dev = dev_get_by_name_rcu(net, r->arp_dev); else dev = __dev_get_by_name(net, r->arp_dev); if (!dev) return ERR_PTR(-ENODEV); /* Mmmm... It is wrong... ARPHRD_NETROM == 0 */ if (!r->arp_ha.sa_family) r->arp_ha.sa_family = dev->type; if ((r->arp_flags & ATF_COM) && r->arp_ha.sa_family != dev->type) return ERR_PTR(-EINVAL); return dev; } static struct net_device *arp_req_dev(struct net *net, struct arpreq *r) { struct net_device *dev; struct rtable *rt; __be32 ip; if (r->arp_dev[0]) return arp_req_dev_by_name(net, r, false); if (r->arp_flags & ATF_PUBL) return NULL; ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr; rt = ip_route_output(net, ip, 0, 0, 0, RT_SCOPE_LINK); if (IS_ERR(rt)) return ERR_CAST(rt); dev = rt->dst.dev; ip_rt_put(rt); if (!dev) return ERR_PTR(-EINVAL); return dev; } /* * Set (create) an ARP cache entry. */ static int arp_req_set_proxy(struct net *net, struct net_device *dev, int on) { if (!dev) { IPV4_DEVCONF_ALL(net, PROXY_ARP) = on; return 0; } if (__in_dev_get_rtnl_net(dev)) { IN_DEV_CONF_SET(__in_dev_get_rtnl_net(dev), PROXY_ARP, on); return 0; } return -ENXIO; } static int arp_req_set_public(struct net *net, struct arpreq *r, struct net_device *dev) { __be32 mask = ((struct sockaddr_in *)&r->arp_netmask)->sin_addr.s_addr; if (!dev && (r->arp_flags & ATF_COM)) { dev = dev_getbyhwaddr(net, r->arp_ha.sa_family, r->arp_ha.sa_data); if (!dev) return -ENODEV; } if (mask) { __be32 ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr; if (!pneigh_lookup(&arp_tbl, net, &ip, dev, 1)) return -ENOBUFS; return 0; } return arp_req_set_proxy(net, dev, 1); } static int arp_req_set(struct net *net, struct arpreq *r) { struct neighbour *neigh; struct net_device *dev; __be32 ip; int err; dev = arp_req_dev(net, r); if (IS_ERR(dev)) return PTR_ERR(dev); if (r->arp_flags & ATF_PUBL) return arp_req_set_public(net, r, dev); switch (dev->type) { #if IS_ENABLED(CONFIG_FDDI) case ARPHRD_FDDI: /* * According to RFC 1390, FDDI devices should accept ARP * hardware types of 1 (Ethernet). However, to be more * robust, we'll accept hardware types of either 1 (Ethernet) * or 6 (IEEE 802.2). 
*/ if (r->arp_ha.sa_family != ARPHRD_FDDI && r->arp_ha.sa_family != ARPHRD_ETHER && r->arp_ha.sa_family != ARPHRD_IEEE802) return -EINVAL; break; #endif default: if (r->arp_ha.sa_family != dev->type) return -EINVAL; break; } ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr; neigh = __neigh_lookup_errno(&arp_tbl, &ip, dev); err = PTR_ERR(neigh); if (!IS_ERR(neigh)) { unsigned int state = NUD_STALE; if (r->arp_flags & ATF_PERM) { r->arp_flags |= ATF_COM; state = NUD_PERMANENT; } err = neigh_update(neigh, (r->arp_flags & ATF_COM) ? r->arp_ha.sa_data : NULL, state, NEIGH_UPDATE_F_OVERRIDE | NEIGH_UPDATE_F_ADMIN, 0); neigh_release(neigh); } return err; } static unsigned int arp_state_to_flags(struct neighbour *neigh) { if (neigh->nud_state&NUD_PERMANENT) return ATF_PERM | ATF_COM; else if (neigh->nud_state&NUD_VALID) return ATF_COM; else return 0; } /* * Get an ARP cache entry. */ static int arp_req_get(struct net *net, struct arpreq *r) { __be32 ip = ((struct sockaddr_in *) &r->arp_pa)->sin_addr.s_addr; struct neighbour *neigh; struct net_device *dev; if (!r->arp_dev[0]) return -ENODEV; dev = arp_req_dev_by_name(net, r, true); if (IS_ERR(dev)) return PTR_ERR(dev); neigh = neigh_lookup(&arp_tbl, &ip, dev); if (!neigh) return -ENXIO; if (READ_ONCE(neigh->nud_state) & NUD_NOARP) { neigh_release(neigh); return -ENXIO; } read_lock_bh(&neigh->lock); memcpy(r->arp_ha.sa_data, neigh->ha, min(dev->addr_len, sizeof(r->arp_ha.sa_data_min))); r->arp_flags = arp_state_to_flags(neigh); read_unlock_bh(&neigh->lock); neigh_release(neigh); r->arp_ha.sa_family = dev->type; netdev_copy_name(dev, r->arp_dev); return 0; } int arp_invalidate(struct net_device *dev, __be32 ip, bool force) { struct neighbour *neigh = neigh_lookup(&arp_tbl, &ip, dev); int err = -ENXIO; struct neigh_table *tbl = &arp_tbl; if (neigh) { if ((READ_ONCE(neigh->nud_state) & NUD_VALID) && !force) { neigh_release(neigh); return 0; } if (READ_ONCE(neigh->nud_state) & ~NUD_NOARP) err = neigh_update(neigh, NULL, NUD_FAILED, NEIGH_UPDATE_F_OVERRIDE| NEIGH_UPDATE_F_ADMIN, 0); write_lock_bh(&tbl->lock); neigh_release(neigh); neigh_remove_one(neigh); write_unlock_bh(&tbl->lock); } return err; } static int arp_req_delete_public(struct net *net, struct arpreq *r, struct net_device *dev) { __be32 mask = ((struct sockaddr_in *)&r->arp_netmask)->sin_addr.s_addr; if (mask) { __be32 ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr; return pneigh_delete(&arp_tbl, net, &ip, dev); } return arp_req_set_proxy(net, dev, 0); } static int arp_req_delete(struct net *net, struct arpreq *r) { struct net_device *dev; __be32 ip; dev = arp_req_dev(net, r); if (IS_ERR(dev)) return PTR_ERR(dev); if (r->arp_flags & ATF_PUBL) return arp_req_delete_public(net, r, dev); ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr; return arp_invalidate(dev, ip, true); } /* * Handle an ARP layer I/O control request. 
*/ int arp_ioctl(struct net *net, unsigned int cmd, void __user *arg) { struct arpreq r; __be32 *netmask; int err; switch (cmd) { case SIOCDARP: case SIOCSARP: if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) return -EPERM; fallthrough; case SIOCGARP: err = copy_from_user(&r, arg, sizeof(struct arpreq)); if (err) return -EFAULT; break; default: return -EINVAL; } if (r.arp_pa.sa_family != AF_INET) return -EPFNOSUPPORT; if (!(r.arp_flags & ATF_PUBL) && (r.arp_flags & (ATF_NETMASK | ATF_DONTPUB))) return -EINVAL; netmask = &((struct sockaddr_in *)&r.arp_netmask)->sin_addr.s_addr; if (!(r.arp_flags & ATF_NETMASK)) *netmask = htonl(0xFFFFFFFFUL); else if (*netmask && *netmask != htonl(0xFFFFFFFFUL)) return -EINVAL; switch (cmd) { case SIOCDARP: rtnl_net_lock(net); err = arp_req_delete(net, &r); rtnl_net_unlock(net); break; case SIOCSARP: rtnl_net_lock(net); err = arp_req_set(net, &r); rtnl_net_unlock(net); break; case SIOCGARP: rcu_read_lock(); err = arp_req_get(net, &r); rcu_read_unlock(); if (!err && copy_to_user(arg, &r, sizeof(r))) err = -EFAULT; break; } return err; } static int arp_netdev_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr); struct netdev_notifier_change_info *change_info; struct in_device *in_dev; bool evict_nocarrier; switch (event) { case NETDEV_CHANGEADDR: neigh_changeaddr(&arp_tbl, dev); rt_cache_flush(dev_net(dev)); break; case NETDEV_CHANGE: change_info = ptr; if (change_info->flags_changed & IFF_NOARP) neigh_changeaddr(&arp_tbl, dev); in_dev = __in_dev_get_rtnl(dev); if (!in_dev) evict_nocarrier = true; else evict_nocarrier = IN_DEV_ARP_EVICT_NOCARRIER(in_dev); if (evict_nocarrier && !netif_carrier_ok(dev)) neigh_carrier_down(&arp_tbl, dev); break; default: break; } return NOTIFY_DONE; } static struct notifier_block arp_netdev_notifier = { .notifier_call = arp_netdev_event, }; /* Note that this is not on a notifier chain. It must be called after the route cache has been flushed. */ void arp_ifdown(struct net_device *dev) { neigh_ifdown(&arp_tbl, dev); } /* * Called once on startup. */ static struct packet_type arp_packet_type __read_mostly = { .type = cpu_to_be16(ETH_P_ARP), .func = arp_rcv, }; #ifdef CONFIG_PROC_FS #if IS_ENABLED(CONFIG_AX25) /* * ax25 -> ASCII conversion */ static void ax2asc2(ax25_address *a, char *buf) { char c, *s; int n; for (n = 0, s = buf; n < 6; n++) { c = (a->ax25_call[n] >> 1) & 0x7F; if (c != ' ') *s++ = c; } *s++ = '-'; n = (a->ax25_call[6] >> 1) & 0x0F; if (n > 9) { *s++ = '1'; n -= 10; } *s++ = n + '0'; *s++ = '\0'; if (*buf == '\0' || *buf == '-') { buf[0] = '*'; buf[1] = '\0'; } } #endif /* CONFIG_AX25 */ #define HBUFFERLEN 30 static void arp_format_neigh_entry(struct seq_file *seq, struct neighbour *n) { char hbuffer[HBUFFERLEN]; int k, j; char tbuf[16]; struct net_device *dev = n->dev; int hatype = dev->type; read_lock(&n->lock); /* Convert hardware address to XX:XX:XX:XX ... form.
*/ #if IS_ENABLED(CONFIG_AX25) if (hatype == ARPHRD_AX25 || hatype == ARPHRD_NETROM) ax2asc2((ax25_address *)n->ha, hbuffer); else { #endif for (k = 0, j = 0; k < HBUFFERLEN - 3 && j < dev->addr_len; j++) { hbuffer[k++] = hex_asc_hi(n->ha[j]); hbuffer[k++] = hex_asc_lo(n->ha[j]); hbuffer[k++] = ':'; } if (k != 0) --k; hbuffer[k] = 0; #if IS_ENABLED(CONFIG_AX25) } #endif sprintf(tbuf, "%pI4", n->primary_key); seq_printf(seq, "%-16s 0x%-10x0x%-10x%-17s * %s\n", tbuf, hatype, arp_state_to_flags(n), hbuffer, dev->name); read_unlock(&n->lock); } static void arp_format_pneigh_entry(struct seq_file *seq, struct pneigh_entry *n) { struct net_device *dev = n->dev; int hatype = dev ? dev->type : 0; char tbuf[16]; sprintf(tbuf, "%pI4", n->key); seq_printf(seq, "%-16s 0x%-10x0x%-10x%s * %s\n", tbuf, hatype, ATF_PUBL | ATF_PERM, "00:00:00:00:00:00", dev ? dev->name : "*"); } static int arp_seq_show(struct seq_file *seq, void *v) { if (v == SEQ_START_TOKEN) { seq_puts(seq, "IP address HW type Flags " "HW address Mask Device\n"); } else { struct neigh_seq_state *state = seq->private; if (state->flags & NEIGH_SEQ_IS_PNEIGH) arp_format_pneigh_entry(seq, v); else arp_format_neigh_entry(seq, v); } return 0; } static void *arp_seq_start(struct seq_file *seq, loff_t *pos) { /* Don't want to confuse "arp -a" w/ magic entries, * so we tell the generic iterator to skip NUD_NOARP. */ return neigh_seq_start(seq, pos, &arp_tbl, NEIGH_SEQ_SKIP_NOARP); } static const struct seq_operations arp_seq_ops = { .start = arp_seq_start, .next = neigh_seq_next, .stop = neigh_seq_stop, .show = arp_seq_show, }; #endif /* CONFIG_PROC_FS */ static int __net_init arp_net_init(struct net *net) { if (!proc_create_net("arp", 0444, net->proc_net, &arp_seq_ops, sizeof(struct neigh_seq_state))) return -ENOMEM; return 0; } static void __net_exit arp_net_exit(struct net *net) { remove_proc_entry("arp", net->proc_net); } static struct pernet_operations arp_net_ops = { .init = arp_net_init, .exit = arp_net_exit, }; void __init arp_init(void) { neigh_table_init(NEIGH_ARP_TABLE, &arp_tbl); dev_add_pack(&arp_packet_type); register_pernet_subsys(&arp_net_ops); #ifdef CONFIG_SYSCTL neigh_sysctl_register(NULL, &arp_tbl.parms, NULL); #endif register_netdevice_notifier(&arp_netdev_notifier); }
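/*
 * Illustrative sketch (not part of arp.c above): the wire layout that
 * arp_create() emits for the common Ethernet/IPv4 case -- an arphdr
 * followed by sender hw address, sender IP, target hw address, target
 * IP, in that order.  mock_arp_ether and mock_arp_request() are
 * hypothetical stand-alone names; only the layout mirrors the code.
 */
#include <arpa/inet.h>	/* htons() */
#include <stdint.h>
#include <string.h>

struct mock_arp_ether {
	uint16_t ar_hrd;	/* 1 = Ethernet (ARPHRD_ETHER) */
	uint16_t ar_pro;	/* 0x0800 = IPv4 (ETH_P_IP) */
	uint8_t  ar_hln;	/* 6 = Ethernet address length */
	uint8_t  ar_pln;	/* 4 = IPv4 address length */
	uint16_t ar_op;		/* 1 = request, 2 = reply */
	uint8_t  ar_sha[6];	/* sender hardware address */
	uint8_t  ar_sip[4];	/* sender IP address */
	uint8_t  ar_tha[6];	/* target hardware address */
	uint8_t  ar_tip[4];	/* target IP address */
} __attribute__((packed));

/* Build a broadcast request the way arp_create() does when dest_hw is
 * NULL: the target hardware address is zero-filled and the frame is
 * addressed to dev->broadcast. */
static void mock_arp_request(struct mock_arp_ether *p,
			     const uint8_t src_hw[6],
			     const uint8_t src_ip[4],
			     const uint8_t tgt_ip[4])
{
	p->ar_hrd = htons(1);
	p->ar_pro = htons(0x0800);
	p->ar_hln = 6;
	p->ar_pln = 4;
	p->ar_op  = htons(1);
	memcpy(p->ar_sha, src_hw, 6);
	memcpy(p->ar_sip, src_ip, 4);
	memset(p->ar_tha, 0, 6);	/* unknown: this is what we ask for */
	memcpy(p->ar_tip, tgt_ip, 4);
}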
/* SPDX-License-Identifier: GPL-2.0 */ #ifndef _ASM_X86_COMPAT_H #define _ASM_X86_COMPAT_H /* * Architecture specific compatibility types */ #include <linux/types.h> #include <linux/sched.h> #include <linux/sched/task_stack.h> #include <asm/processor.h> #include <asm/user32.h> #include <asm/unistd.h> #define compat_mode_t compat_mode_t typedef u16 compat_mode_t; #define __compat_uid_t __compat_uid_t typedef u16 __compat_uid_t; typedef u16 __compat_gid_t; #define compat_dev_t compat_dev_t typedef u16 compat_dev_t; #define compat_ipc_pid_t compat_ipc_pid_t typedef u16 compat_ipc_pid_t; #define compat_statfs compat_statfs #include <asm-generic/compat.h> #define COMPAT_UTS_MACHINE "i686\0\0" typedef u16 compat_nlink_t; struct compat_stat { u32 st_dev; compat_ino_t st_ino; compat_mode_t st_mode; compat_nlink_t st_nlink; __compat_uid_t st_uid; __compat_gid_t st_gid; u32 st_rdev; u32 st_size; u32 st_blksize; u32 st_blocks; u32 st_atime; u32 st_atime_nsec; u32 st_mtime; u32 st_mtime_nsec; u32 st_ctime; u32 st_ctime_nsec; u32 __unused4; u32 __unused5; }; /* * IA32 uses 4 byte alignment for 64 bit quantities, so we need to pack the * compat flock64 structure. */ #define __ARCH_NEED_COMPAT_FLOCK64_PACKED struct compat_statfs { int f_type; int f_bsize; int f_blocks; int f_bfree; int f_bavail; int f_files; int f_ffree; compat_fsid_t f_fsid; int f_namelen; /* SunOS ignores this field. */ int f_frsize; int f_flags; int f_spare[4]; }; #ifdef CONFIG_X86_X32_ABI #define COMPAT_USE_64BIT_TIME \ (!!(task_pt_regs(current)->orig_ax & __X32_SYSCALL_BIT)) #endif static inline bool in_x32_syscall(void) { #ifdef CONFIG_X86_X32_ABI if (task_pt_regs(current)->orig_ax & __X32_SYSCALL_BIT) return true; #endif return false; } static inline bool in_32bit_syscall(void) { return in_ia32_syscall() || in_x32_syscall(); } #ifdef CONFIG_COMPAT static inline bool in_compat_syscall(void) { return in_32bit_syscall(); } #define in_compat_syscall in_compat_syscall /* override the generic impl */ #define compat_need_64bit_alignment_fixup in_ia32_syscall #endif struct compat_siginfo; #ifdef CONFIG_X86_X32_ABI int copy_siginfo_to_user32(struct compat_siginfo __user *to, const kernel_siginfo_t *from); #define copy_siginfo_to_user32 copy_siginfo_to_user32 #endif /* CONFIG_X86_X32_ABI */ #endif /* _ASM_X86_COMPAT_H */
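/*
 * Illustrative sketch (not part of the header above): why the comment
 * on __ARCH_NEED_COMPAT_FLOCK64_PACKED matters.  IA32 aligns 64-bit
 * quantities to 4 bytes while x86-64 aligns them to 8, so the same
 * struct lays out differently unless packing forces the 32-bit layout.
 * 'natural' and 'packed4' are hypothetical stand-alone types.
 */
#include <stdint.h>
#include <stdio.h>

struct natural {	/* native x86-64 layout: 4 bytes of padding after 'kind' */
	int32_t  kind;
	uint64_t start;
};

struct packed4 {	/* forced IA32-style layout: no padding */
	int32_t  kind;
	uint64_t start;
} __attribute__((packed, aligned(4)));

int main(void)
{
	/* Prints 16 vs 12 on x86-64: reading a 32-bit task's struct
	 * through the 'natural' definition would misplace 'start'. */
	printf("natural=%zu packed4=%zu\n",
	       sizeof(struct natural), sizeof(struct packed4));
	return 0;
}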
/* SPDX-License-Identifier: GPL-2.0-only */ /* * v4l2-tpg.h - Test Pattern Generator * * Copyright 2014 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
*/ #ifndef _V4L2_TPG_H_ #define _V4L2_TPG_H_ #include <linux/types.h> #include <linux/errno.h> #include <linux/random.h> #include <linux/slab.h> #include <linux/vmalloc.h> #include <linux/videodev2.h> struct tpg_rbg_color8 { unsigned char r, g, b; }; struct tpg_rbg_color16 { __u16 r, g, b; }; enum tpg_color { TPG_COLOR_CSC_WHITE, TPG_COLOR_CSC_YELLOW, TPG_COLOR_CSC_CYAN, TPG_COLOR_CSC_GREEN, TPG_COLOR_CSC_MAGENTA, TPG_COLOR_CSC_RED, TPG_COLOR_CSC_BLUE, TPG_COLOR_CSC_BLACK, TPG_COLOR_75_YELLOW, TPG_COLOR_75_CYAN, TPG_COLOR_75_GREEN, TPG_COLOR_75_MAGENTA, TPG_COLOR_75_RED, TPG_COLOR_75_BLUE, TPG_COLOR_100_WHITE, TPG_COLOR_100_YELLOW, TPG_COLOR_100_CYAN, TPG_COLOR_100_GREEN, TPG_COLOR_100_MAGENTA, TPG_COLOR_100_RED, TPG_COLOR_100_BLUE, TPG_COLOR_100_BLACK, TPG_COLOR_TEXTFG, TPG_COLOR_TEXTBG, TPG_COLOR_RANDOM, TPG_COLOR_RAMP, TPG_COLOR_MAX = TPG_COLOR_RAMP + 256 }; extern const struct tpg_rbg_color8 tpg_colors[TPG_COLOR_MAX]; extern const unsigned short tpg_rec709_to_linear[255 * 16 + 1]; extern const unsigned short tpg_linear_to_rec709[255 * 16 + 1]; extern const struct tpg_rbg_color16 tpg_csc_colors[V4L2_COLORSPACE_DCI_P3 + 1] [V4L2_XFER_FUNC_SMPTE2084 + 1] [TPG_COLOR_CSC_BLACK + 1]; enum tpg_pattern { TPG_PAT_75_COLORBAR, TPG_PAT_100_COLORBAR, TPG_PAT_CSC_COLORBAR, TPG_PAT_100_HCOLORBAR, TPG_PAT_100_COLORSQUARES, TPG_PAT_BLACK, TPG_PAT_WHITE, TPG_PAT_RED, TPG_PAT_GREEN, TPG_PAT_BLUE, TPG_PAT_CHECKERS_16X16, TPG_PAT_CHECKERS_2X2, TPG_PAT_CHECKERS_1X1, TPG_PAT_COLOR_CHECKERS_2X2, TPG_PAT_COLOR_CHECKERS_1X1, TPG_PAT_ALTERNATING_HLINES, TPG_PAT_ALTERNATING_VLINES, TPG_PAT_CROSS_1_PIXEL, TPG_PAT_CROSS_2_PIXELS, TPG_PAT_CROSS_10_PIXELS, TPG_PAT_GRAY_RAMP, /* Must be the last pattern */ TPG_PAT_NOISE, }; extern const char * const tpg_pattern_strings[]; enum tpg_quality { TPG_QUAL_COLOR, TPG_QUAL_GRAY, TPG_QUAL_NOISE }; enum tpg_video_aspect { TPG_VIDEO_ASPECT_IMAGE, TPG_VIDEO_ASPECT_4X3, TPG_VIDEO_ASPECT_14X9_CENTRE, TPG_VIDEO_ASPECT_16X9_CENTRE, TPG_VIDEO_ASPECT_16X9_ANAMORPHIC, }; enum tpg_pixel_aspect { TPG_PIXEL_ASPECT_SQUARE, TPG_PIXEL_ASPECT_NTSC, TPG_PIXEL_ASPECT_PAL, }; enum tpg_move_mode { TPG_MOVE_NEG_FAST, TPG_MOVE_NEG, TPG_MOVE_NEG_SLOW, TPG_MOVE_NONE, TPG_MOVE_POS_SLOW, TPG_MOVE_POS, TPG_MOVE_POS_FAST, }; enum tgp_color_enc { TGP_COLOR_ENC_RGB, TGP_COLOR_ENC_YCBCR, TGP_COLOR_ENC_HSV, TGP_COLOR_ENC_LUMA, }; extern const char * const tpg_aspect_strings[]; #define TPG_MAX_PLANES 3 #define TPG_MAX_PAT_LINES 8 struct tpg_data { /* Source frame size */ unsigned src_width, src_height; /* Buffer height */ unsigned buf_height; /* Scaled output frame size */ unsigned scaled_width; u32 field; bool field_alternate; /* crop coordinates are frame-based */ struct v4l2_rect crop; /* compose coordinates are format-based */ struct v4l2_rect compose; /* border and square coordinates are frame-based */ struct v4l2_rect border; struct v4l2_rect square; /* Color-related fields */ enum tpg_quality qual; unsigned qual_offset; u8 alpha_component; bool alpha_red_only; u8 brightness; u8 contrast; u8 saturation; s16 hue; u32 fourcc; enum tgp_color_enc color_enc; u32 colorspace; u32 xfer_func; u32 ycbcr_enc; u32 hsv_enc; /* * Stores the actual transfer function, i.e. will never be * V4L2_XFER_FUNC_DEFAULT. */ u32 real_xfer_func; /* * Stores the actual Y'CbCr encoding, i.e. will never be * V4L2_YCBCR_ENC_DEFAULT. */ u32 real_hsv_enc; u32 real_ycbcr_enc; u32 quantization; /* * Stores the actual quantization, i.e. will never be * V4L2_QUANTIZATION_DEFAULT. 
*/ u32 real_quantization; enum tpg_video_aspect vid_aspect; enum tpg_pixel_aspect pix_aspect; unsigned rgb_range; unsigned real_rgb_range; unsigned buffers; unsigned planes; bool interleaved; u8 vdownsampling[TPG_MAX_PLANES]; u8 hdownsampling[TPG_MAX_PLANES]; /* * horizontal positions must be ANDed with this value to enforce * correct boundaries for packed YUYV values. */ unsigned hmask[TPG_MAX_PLANES]; /* Used to store the colors in native format, either RGB or YUV */ u8 colors[TPG_COLOR_MAX][3]; u8 textfg[TPG_MAX_PLANES][8], textbg[TPG_MAX_PLANES][8]; /* size in bytes for two pixels in each plane */ unsigned twopixelsize[TPG_MAX_PLANES]; unsigned bytesperline[TPG_MAX_PLANES]; /* Configuration */ enum tpg_pattern pattern; bool hflip; bool vflip; unsigned perc_fill; bool perc_fill_blank; bool show_border; bool show_square; bool insert_sav; bool insert_eav; bool insert_hdmi_video_guard_band; /* Test pattern movement */ enum tpg_move_mode mv_hor_mode; int mv_hor_count; int mv_hor_step; enum tpg_move_mode mv_vert_mode; int mv_vert_count; int mv_vert_step; bool recalc_colors; bool recalc_lines; bool recalc_square_border; /* Used to store TPG_MAX_PAT_LINES lines, each with up to two planes */ unsigned max_line_width; u8 *lines[TPG_MAX_PAT_LINES][TPG_MAX_PLANES]; u8 *downsampled_lines[TPG_MAX_PAT_LINES][TPG_MAX_PLANES]; u8 *random_line[TPG_MAX_PLANES]; u8 *contrast_line[TPG_MAX_PLANES]; u8 *black_line[TPG_MAX_PLANES]; }; void tpg_init(struct tpg_data *tpg, unsigned w, unsigned h); int tpg_alloc(struct tpg_data *tpg, unsigned max_w); void tpg_free(struct tpg_data *tpg); void tpg_reset_source(struct tpg_data *tpg, unsigned width, unsigned height, u32 field); void tpg_log_status(struct tpg_data *tpg); void tpg_set_font(const u8 *f); void tpg_gen_text(const struct tpg_data *tpg, u8 *basep[TPG_MAX_PLANES][2], int y, int x, const char *text); void tpg_calc_text_basep(struct tpg_data *tpg, u8 *basep[TPG_MAX_PLANES][2], unsigned p, u8 *vbuf); unsigned tpg_g_interleaved_plane(const struct tpg_data *tpg, unsigned buf_line); void tpg_fill_plane_buffer(struct tpg_data *tpg, v4l2_std_id std, unsigned p, u8 *vbuf); void tpg_fillbuffer(struct tpg_data *tpg, v4l2_std_id std, unsigned p, u8 *vbuf); bool tpg_s_fourcc(struct tpg_data *tpg, u32 fourcc); void tpg_s_crop_compose(struct tpg_data *tpg, const struct v4l2_rect *crop, const struct v4l2_rect *compose); const char *tpg_g_color_order(const struct tpg_data *tpg); static inline void tpg_s_pattern(struct tpg_data *tpg, enum tpg_pattern pattern) { if (tpg->pattern == pattern) return; tpg->pattern = pattern; tpg->recalc_colors = true; } static inline void tpg_s_quality(struct tpg_data *tpg, enum tpg_quality qual, unsigned qual_offset) { if (tpg->qual == qual && tpg->qual_offset == qual_offset) return; tpg->qual = qual; tpg->qual_offset = qual_offset; tpg->recalc_colors = true; } static inline enum tpg_quality tpg_g_quality(const struct tpg_data *tpg) { return tpg->qual; } static inline void tpg_s_alpha_component(struct tpg_data *tpg, u8 alpha_component) { if (tpg->alpha_component == alpha_component) return; tpg->alpha_component = alpha_component; tpg->recalc_colors = true; } static inline void tpg_s_alpha_mode(struct tpg_data *tpg, bool red_only) { if (tpg->alpha_red_only == red_only) return; tpg->alpha_red_only = red_only; tpg->recalc_colors = true; } static inline void tpg_s_brightness(struct tpg_data *tpg, u8 brightness) { if (tpg->brightness == brightness) return; tpg->brightness = brightness; tpg->recalc_colors = true; } static inline void 
tpg_s_contrast(struct tpg_data *tpg, u8 contrast) { if (tpg->contrast == contrast) return; tpg->contrast = contrast; tpg->recalc_colors = true; } static inline void tpg_s_saturation(struct tpg_data *tpg, u8 saturation) { if (tpg->saturation == saturation) return; tpg->saturation = saturation; tpg->recalc_colors = true; } static inline void tpg_s_hue(struct tpg_data *tpg, s16 hue) { hue = clamp_t(s16, hue, -128, 128); if (tpg->hue == hue) return; tpg->hue = hue; tpg->recalc_colors = true; } static inline void tpg_s_rgb_range(struct tpg_data *tpg, unsigned rgb_range) { if (tpg->rgb_range == rgb_range) return; tpg->rgb_range = rgb_range; tpg->recalc_colors = true; } static inline void tpg_s_real_rgb_range(struct tpg_data *tpg, unsigned rgb_range) { if (tpg->real_rgb_range == rgb_range) return; tpg->real_rgb_range = rgb_range; tpg->recalc_colors = true; } static inline void tpg_s_colorspace(struct tpg_data *tpg, u32 colorspace) { if (tpg->colorspace == colorspace) return; tpg->colorspace = colorspace; tpg->recalc_colors = true; } static inline u32 tpg_g_colorspace(const struct tpg_data *tpg) { return tpg->colorspace; } static inline void tpg_s_ycbcr_enc(struct tpg_data *tpg, u32 ycbcr_enc) { if (tpg->ycbcr_enc == ycbcr_enc) return; tpg->ycbcr_enc = ycbcr_enc; tpg->recalc_colors = true; } static inline u32 tpg_g_ycbcr_enc(const struct tpg_data *tpg) { return tpg->ycbcr_enc; } static inline void tpg_s_hsv_enc(struct tpg_data *tpg, u32 hsv_enc) { if (tpg->hsv_enc == hsv_enc) return; tpg->hsv_enc = hsv_enc; tpg->recalc_colors = true; } static inline u32 tpg_g_hsv_enc(const struct tpg_data *tpg) { return tpg->hsv_enc; } static inline void tpg_s_xfer_func(struct tpg_data *tpg, u32 xfer_func) { if (tpg->xfer_func == xfer_func) return; tpg->xfer_func = xfer_func; tpg->recalc_colors = true; } static inline u32 tpg_g_xfer_func(const struct tpg_data *tpg) { return tpg->xfer_func; } static inline void tpg_s_quantization(struct tpg_data *tpg, u32 quantization) { if (tpg->quantization == quantization) return; tpg->quantization = quantization; tpg->recalc_colors = true; } static inline u32 tpg_g_quantization(const struct tpg_data *tpg) { return tpg->quantization; } static inline unsigned tpg_g_buffers(const struct tpg_data *tpg) { return tpg->buffers; } static inline unsigned tpg_g_planes(const struct tpg_data *tpg) { return tpg->interleaved ? 
1 : tpg->planes; } static inline bool tpg_g_interleaved(const struct tpg_data *tpg) { return tpg->interleaved; } static inline unsigned tpg_g_twopixelsize(const struct tpg_data *tpg, unsigned plane) { return tpg->twopixelsize[plane]; } static inline unsigned tpg_hdiv(const struct tpg_data *tpg, unsigned plane, unsigned x) { return ((x / tpg->hdownsampling[plane]) & tpg->hmask[plane]) * tpg->twopixelsize[plane] / 2; } static inline unsigned tpg_hscale(const struct tpg_data *tpg, unsigned x) { return (x * tpg->scaled_width) / tpg->src_width; } static inline unsigned tpg_hscale_div(const struct tpg_data *tpg, unsigned plane, unsigned x) { return tpg_hdiv(tpg, plane, tpg_hscale(tpg, x)); } static inline unsigned tpg_g_bytesperline(const struct tpg_data *tpg, unsigned plane) { return tpg->bytesperline[plane]; } static inline void tpg_s_bytesperline(struct tpg_data *tpg, unsigned plane, unsigned bpl) { unsigned p; if (tpg->buffers > 1) { tpg->bytesperline[plane] = bpl; return; } for (p = 0; p < tpg_g_planes(tpg); p++) { unsigned plane_w = bpl * tpg->twopixelsize[p] / tpg->twopixelsize[0]; tpg->bytesperline[p] = plane_w / tpg->hdownsampling[p]; } if (tpg_g_interleaved(tpg)) tpg->bytesperline[1] = tpg->bytesperline[0]; } static inline unsigned tpg_g_line_width(const struct tpg_data *tpg, unsigned plane) { unsigned w = 0; unsigned p; if (tpg->buffers > 1) return tpg_g_bytesperline(tpg, plane); for (p = 0; p < tpg_g_planes(tpg); p++) { unsigned plane_w = tpg_g_bytesperline(tpg, p); w += plane_w / tpg->vdownsampling[p]; } return w; } static inline unsigned tpg_calc_line_width(const struct tpg_data *tpg, unsigned plane, unsigned bpl) { unsigned w = 0; unsigned p; if (tpg->buffers > 1) return bpl; for (p = 0; p < tpg_g_planes(tpg); p++) { unsigned plane_w = bpl * tpg->twopixelsize[p] / tpg->twopixelsize[0]; plane_w /= tpg->hdownsampling[p]; w += plane_w / tpg->vdownsampling[p]; } return w; } static inline unsigned tpg_calc_plane_size(const struct tpg_data *tpg, unsigned plane) { if (plane >= tpg_g_planes(tpg)) return 0; return tpg_g_bytesperline(tpg, plane) * tpg->buf_height / tpg->vdownsampling[plane]; } static inline void tpg_s_buf_height(struct tpg_data *tpg, unsigned h) { tpg->buf_height = h; } static inline void tpg_s_field(struct tpg_data *tpg, unsigned field, bool alternate) { tpg->field = field; tpg->field_alternate = alternate; } static inline void tpg_s_perc_fill(struct tpg_data *tpg, unsigned perc_fill) { tpg->perc_fill = perc_fill; } static inline unsigned tpg_g_perc_fill(const struct tpg_data *tpg) { return tpg->perc_fill; } static inline void tpg_s_perc_fill_blank(struct tpg_data *tpg, bool perc_fill_blank) { tpg->perc_fill_blank = perc_fill_blank; } static inline void tpg_s_video_aspect(struct tpg_data *tpg, enum tpg_video_aspect vid_aspect) { if (tpg->vid_aspect == vid_aspect) return; tpg->vid_aspect = vid_aspect; tpg->recalc_square_border = true; } static inline enum tpg_video_aspect tpg_g_video_aspect(const struct tpg_data *tpg) { return tpg->vid_aspect; } static inline void tpg_s_pixel_aspect(struct tpg_data *tpg, enum tpg_pixel_aspect pix_aspect) { if (tpg->pix_aspect == pix_aspect) return; tpg->pix_aspect = pix_aspect; tpg->recalc_square_border = true; } static inline void tpg_s_show_border(struct tpg_data *tpg, bool show_border) { tpg->show_border = show_border; } static inline void tpg_s_show_square(struct tpg_data *tpg, bool show_square) { tpg->show_square = show_square; } static inline void tpg_s_insert_sav(struct tpg_data *tpg, bool insert_sav) { tpg->insert_sav = insert_sav; 
} static inline void tpg_s_insert_eav(struct tpg_data *tpg, bool insert_eav) { tpg->insert_eav = insert_eav; } /* * This inserts 4 pixels of the RGB color 0xab55ab at the left hand side of the * image. This is only done for 3 or 4 byte RGB pixel formats. This pixel value * equals the Video Guard Band value as defined by HDMI (see section 5.2.2.1 * in the HDMI 1.3 Specification) that precedes the first actual pixel. If the * HDMI receiver doesn't handle this correctly, then it might keep skipping * these Video Guard Band patterns and end up with a shorter video line. So this * is a nice pattern to test with. */ static inline void tpg_s_insert_hdmi_video_guard_band(struct tpg_data *tpg, bool insert_hdmi_video_guard_band) { tpg->insert_hdmi_video_guard_band = insert_hdmi_video_guard_band; } void tpg_update_mv_step(struct tpg_data *tpg); static inline void tpg_s_mv_hor_mode(struct tpg_data *tpg, enum tpg_move_mode mv_hor_mode) { tpg->mv_hor_mode = mv_hor_mode; tpg_update_mv_step(tpg); } static inline void tpg_s_mv_vert_mode(struct tpg_data *tpg, enum tpg_move_mode mv_vert_mode) { tpg->mv_vert_mode = mv_vert_mode; tpg_update_mv_step(tpg); } static inline void tpg_init_mv_count(struct tpg_data *tpg) { tpg->mv_hor_count = tpg->mv_vert_count = 0; } static inline void tpg_update_mv_count(struct tpg_data *tpg, bool frame_is_field) { tpg->mv_hor_count += tpg->mv_hor_step * (frame_is_field ? 1 : 2); tpg->mv_vert_count += tpg->mv_vert_step * (frame_is_field ? 1 : 2); } static inline void tpg_s_hflip(struct tpg_data *tpg, bool hflip) { if (tpg->hflip == hflip) return; tpg->hflip = hflip; tpg_update_mv_step(tpg); tpg->recalc_lines = true; } static inline bool tpg_g_hflip(const struct tpg_data *tpg) { return tpg->hflip; } static inline void tpg_s_vflip(struct tpg_data *tpg, bool vflip) { tpg->vflip = vflip; } static inline bool tpg_g_vflip(const struct tpg_data *tpg) { return tpg->vflip; } static inline bool tpg_pattern_is_static(const struct tpg_data *tpg) { return tpg->pattern != TPG_PAT_NOISE && tpg->mv_hor_mode == TPG_MOVE_NONE && tpg->mv_vert_mode == TPG_MOVE_NONE; } #endif
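/*
 * A worked example of the scaling helpers above (all values are
 * illustrative, not taken from the header): with src_width = 640 and
 * scaled_width = 1280, tpg_hscale(tpg, 320) returns
 * (320 * 1280) / 640 = 640. For a plane with hdownsampling = 2,
 * twopixelsize = 4 and an all-ones hmask, tpg_hdiv(tpg, plane, 640)
 * then returns ((640 / 2) & hmask) * 4 / 2 = 640, i.e. the byte offset
 * of that pixel pair within the plane's line.
 */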
// SPDX-License-Identifier: BSD-3-Clause /* * linux/net/sunrpc/auth_gss/auth_gss.c * * RPCSEC_GSS client authentication. * * Copyright (c) 2000 The Regents of the University of Michigan. * All rights reserved.
* * Dug Song <dugsong@monkey.org> * Andy Adamson <andros@umich.edu> */ #include <linux/module.h> #include <linux/init.h> #include <linux/types.h> #include <linux/slab.h> #include <linux/sched.h> #include <linux/pagemap.h> #include <linux/sunrpc/clnt.h> #include <linux/sunrpc/auth.h> #include <linux/sunrpc/auth_gss.h> #include <linux/sunrpc/gss_krb5.h> #include <linux/sunrpc/svcauth_gss.h> #include <linux/sunrpc/gss_err.h> #include <linux/workqueue.h> #include <linux/sunrpc/rpc_pipe_fs.h> #include <linux/sunrpc/gss_api.h> #include <linux/uaccess.h> #include <linux/hashtable.h> #include "auth_gss_internal.h" #include "../netns.h" #include <trace/events/rpcgss.h> static const struct rpc_authops authgss_ops; static const struct rpc_credops gss_credops; static const struct rpc_credops gss_nullops; #define GSS_RETRY_EXPIRED 5 static unsigned int gss_expired_cred_retry_delay = GSS_RETRY_EXPIRED; #define GSS_KEY_EXPIRE_TIMEO 240 static unsigned int gss_key_expire_timeo = GSS_KEY_EXPIRE_TIMEO; #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) # define RPCDBG_FACILITY RPCDBG_AUTH #endif /* * This compile-time check verifies that we will not exceed the * slack space allotted by the client and server auth_gss code * before they call gss_wrap(). */ #define GSS_KRB5_MAX_SLACK_NEEDED \ (GSS_KRB5_TOK_HDR_LEN /* gss token header */ \ + GSS_KRB5_MAX_CKSUM_LEN /* gss token checksum */ \ + GSS_KRB5_MAX_BLOCKSIZE /* confounder */ \ + GSS_KRB5_MAX_BLOCKSIZE /* possible padding */ \ + GSS_KRB5_TOK_HDR_LEN /* encrypted hdr in v2 token */ \ + GSS_KRB5_MAX_CKSUM_LEN /* encryption hmac */ \ + XDR_UNIT * 2 /* RPC verifier */ \ + GSS_KRB5_TOK_HDR_LEN \ + GSS_KRB5_MAX_CKSUM_LEN) #define GSS_CRED_SLACK (RPC_MAX_AUTH_SIZE * 2) /* length of a krb5 verifier (48), plus data added before arguments when * using integrity (two 4-byte integers): */ #define GSS_VERF_SLACK 100 static DEFINE_HASHTABLE(gss_auth_hash_table, 4); static DEFINE_SPINLOCK(gss_auth_hash_lock); struct gss_pipe { struct rpc_pipe_dir_object pdo; struct rpc_pipe *pipe; struct rpc_clnt *clnt; const char *name; struct kref kref; }; struct gss_auth { struct kref kref; struct hlist_node hash; struct rpc_auth rpc_auth; struct gss_api_mech *mech; enum rpc_gss_svc service; struct rpc_clnt *client; struct net *net; netns_tracker ns_tracker; /* * There are two upcall pipes; dentry[1], named "gssd", is used * for the new text-based upcall; dentry[0] is named after the * mechanism (for example, "krb5") and exists for * backwards-compatibility with older gssd's. */ struct gss_pipe *gss_pipe[2]; const char *target_name; }; /* pipe_version >= 0 if and only if someone has a pipe open. */ static DEFINE_SPINLOCK(pipe_version_lock); static struct rpc_wait_queue pipe_version_rpc_waitqueue; static DECLARE_WAIT_QUEUE_HEAD(pipe_version_waitqueue); static void gss_put_auth(struct gss_auth *gss_auth); static void gss_free_ctx(struct gss_cl_ctx *); static const struct rpc_pipe_ops gss_upcall_ops_v0; static const struct rpc_pipe_ops gss_upcall_ops_v1; static inline struct gss_cl_ctx * gss_get_ctx(struct gss_cl_ctx *ctx) { refcount_inc(&ctx->count); return ctx; } static inline void gss_put_ctx(struct gss_cl_ctx *ctx) { if (refcount_dec_and_test(&ctx->count)) gss_free_ctx(ctx); } /* gss_cred_set_ctx: * called by gss_upcall_callback and gss_create_upcall in order * to set the gss context. The actual exchange of an old context * and a new one is protected by the pipe->lock. 
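 *
 * Note the flag ordering in the function below: RPCAUTH_CRED_UPTODATE
 * is set before RPCAUTH_CRED_NEW is cleared, with a memory barrier in
 * between, so a concurrent reader never observes a credential with
 * neither flag set.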
*/ static void gss_cred_set_ctx(struct rpc_cred *cred, struct gss_cl_ctx *ctx) { struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); if (!test_bit(RPCAUTH_CRED_NEW, &cred->cr_flags)) return; gss_get_ctx(ctx); rcu_assign_pointer(gss_cred->gc_ctx, ctx); set_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); smp_mb__before_atomic(); clear_bit(RPCAUTH_CRED_NEW, &cred->cr_flags); } static struct gss_cl_ctx * gss_cred_get_ctx(struct rpc_cred *cred) { struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_cl_ctx *ctx = NULL; rcu_read_lock(); ctx = rcu_dereference(gss_cred->gc_ctx); if (ctx) gss_get_ctx(ctx); rcu_read_unlock(); return ctx; } static struct gss_cl_ctx * gss_alloc_context(void) { struct gss_cl_ctx *ctx; ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); if (ctx != NULL) { ctx->gc_proc = RPC_GSS_PROC_DATA; ctx->gc_seq = 1; /* NetApp 6.4R1 doesn't accept seq. no. 0 */ spin_lock_init(&ctx->gc_seq_lock); refcount_set(&ctx->count,1); } return ctx; } #define GSSD_MIN_TIMEOUT (60 * 60) static const void * gss_fill_context(const void *p, const void *end, struct gss_cl_ctx *ctx, struct gss_api_mech *gm) { const void *q; unsigned int seclen; unsigned int timeout; unsigned long now = jiffies; u32 window_size; int ret; /* First unsigned int gives the remaining lifetime in seconds of the * credential - e.g. the remaining TGT lifetime for Kerberos or * the -t value passed to GSSD. */ p = simple_get_bytes(p, end, &timeout, sizeof(timeout)); if (IS_ERR(p)) goto err; if (timeout == 0) timeout = GSSD_MIN_TIMEOUT; ctx->gc_expiry = now + ((unsigned long)timeout * HZ); /* Sequence number window. Determines the maximum number of * simultaneous requests */ p = simple_get_bytes(p, end, &window_size, sizeof(window_size)); if (IS_ERR(p)) goto err; ctx->gc_win = window_size; /* gssd signals an error by passing ctx->gc_win = 0: */ if (ctx->gc_win == 0) { /* * in which case, p points to an error code. Anything other * than -EKEYEXPIRED gets converted to -EACCES. */ p = simple_get_bytes(p, end, &ret, sizeof(ret)); if (!IS_ERR(p)) p = (ret == -EKEYEXPIRED) ? ERR_PTR(-EKEYEXPIRED) : ERR_PTR(-EACCES); goto err; } /* copy the opaque wire context */ p = simple_get_netobj(p, end, &ctx->gc_wire_ctx); if (IS_ERR(p)) goto err; /* import the opaque security context */ p = simple_get_bytes(p, end, &seclen, sizeof(seclen)); if (IS_ERR(p)) goto err; q = (const void *)((const char *)p + seclen); if (unlikely(q > end || q < p)) { p = ERR_PTR(-EFAULT); goto err; } ret = gss_import_sec_context(p, seclen, gm, &ctx->gc_gss_ctx, NULL, GFP_KERNEL); if (ret < 0) { trace_rpcgss_import_ctx(ret); p = ERR_PTR(ret); goto err; } /* is there any trailing data? */ if (q == end) { p = q; goto done; } /* pull in acceptor name (if there is one) */ p = simple_get_netobj(q, end, &ctx->gc_acceptor); if (IS_ERR(p)) goto err; done: trace_rpcgss_context(window_size, ctx->gc_expiry, now, timeout, ctx->gc_acceptor.len, ctx->gc_acceptor.data); err: return p; } /* XXX: Need some documentation about why UPCALL_BUF_LEN is so small. * Is user space expecting no more than UPCALL_BUF_LEN bytes? * Note that there are now _two_ NI_MAXHOST sized data items * being passed in this string. 
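 *
 * For reference, the text-based (v1) message that gss_encode_v1_msg()
 * builds in this buffer looks roughly like (all values illustrative):
 *
 *	mech=krb5 uid=1000 target=nfs@server service=nfs srchost=client enctypes=18,17
 *
 * and the two NI_MAXHOST-sized items are presumably the target= and
 * srchost= values.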
*/ #define UPCALL_BUF_LEN 256 struct gss_upcall_msg { refcount_t count; kuid_t uid; const char *service_name; struct rpc_pipe_msg msg; struct list_head list; struct gss_auth *auth; struct rpc_pipe *pipe; struct rpc_wait_queue rpc_waitqueue; wait_queue_head_t waitqueue; struct gss_cl_ctx *ctx; char databuf[UPCALL_BUF_LEN]; }; static int get_pipe_version(struct net *net) { struct sunrpc_net *sn = net_generic(net, sunrpc_net_id); int ret; spin_lock(&pipe_version_lock); if (sn->pipe_version >= 0) { atomic_inc(&sn->pipe_users); ret = sn->pipe_version; } else ret = -EAGAIN; spin_unlock(&pipe_version_lock); return ret; } static void put_pipe_version(struct net *net) { struct sunrpc_net *sn = net_generic(net, sunrpc_net_id); if (atomic_dec_and_lock(&sn->pipe_users, &pipe_version_lock)) { sn->pipe_version = -1; spin_unlock(&pipe_version_lock); } } static void gss_release_msg(struct gss_upcall_msg *gss_msg) { struct net *net = gss_msg->auth->net; if (!refcount_dec_and_test(&gss_msg->count)) return; put_pipe_version(net); BUG_ON(!list_empty(&gss_msg->list)); if (gss_msg->ctx != NULL) gss_put_ctx(gss_msg->ctx); rpc_destroy_wait_queue(&gss_msg->rpc_waitqueue); gss_put_auth(gss_msg->auth); kfree_const(gss_msg->service_name); kfree(gss_msg); } static struct gss_upcall_msg * __gss_find_upcall(struct rpc_pipe *pipe, kuid_t uid, const struct gss_auth *auth) { struct gss_upcall_msg *pos; list_for_each_entry(pos, &pipe->in_downcall, list) { if (!uid_eq(pos->uid, uid)) continue; if (pos->auth->service != auth->service) continue; refcount_inc(&pos->count); return pos; } return NULL; } /* Try to add an upcall to the pipefs queue. * If an upcall owned by our uid already exists, then we return a reference * to that upcall instead of adding the new upcall. */ static inline struct gss_upcall_msg * gss_add_msg(struct gss_upcall_msg *gss_msg) { struct rpc_pipe *pipe = gss_msg->pipe; struct gss_upcall_msg *old; spin_lock(&pipe->lock); old = __gss_find_upcall(pipe, gss_msg->uid, gss_msg->auth); if (old == NULL) { refcount_inc(&gss_msg->count); list_add(&gss_msg->list, &pipe->in_downcall); } else gss_msg = old; spin_unlock(&pipe->lock); return gss_msg; } static void __gss_unhash_msg(struct gss_upcall_msg *gss_msg) { list_del_init(&gss_msg->list); rpc_wake_up_status(&gss_msg->rpc_waitqueue, gss_msg->msg.errno); wake_up_all(&gss_msg->waitqueue); refcount_dec(&gss_msg->count); } static void gss_unhash_msg(struct gss_upcall_msg *gss_msg) { struct rpc_pipe *pipe = gss_msg->pipe; if (list_empty(&gss_msg->list)) return; spin_lock(&pipe->lock); if (!list_empty(&gss_msg->list)) __gss_unhash_msg(gss_msg); spin_unlock(&pipe->lock); } static void gss_handle_downcall_result(struct gss_cred *gss_cred, struct gss_upcall_msg *gss_msg) { switch (gss_msg->msg.errno) { case 0: if (gss_msg->ctx == NULL) break; clear_bit(RPCAUTH_CRED_NEGATIVE, &gss_cred->gc_base.cr_flags); gss_cred_set_ctx(&gss_cred->gc_base, gss_msg->ctx); break; case -EKEYEXPIRED: set_bit(RPCAUTH_CRED_NEGATIVE, &gss_cred->gc_base.cr_flags); } gss_cred->gc_upcall_timestamp = jiffies; gss_cred->gc_upcall = NULL; rpc_wake_up_status(&gss_msg->rpc_waitqueue, gss_msg->msg.errno); } static void gss_upcall_callback(struct rpc_task *task) { struct gss_cred *gss_cred = container_of(task->tk_rqstp->rq_cred, struct gss_cred, gc_base); struct gss_upcall_msg *gss_msg = gss_cred->gc_upcall; struct rpc_pipe *pipe = gss_msg->pipe; spin_lock(&pipe->lock); gss_handle_downcall_result(gss_cred, gss_msg); spin_unlock(&pipe->lock); task->tk_status = gss_msg->msg.errno; gss_release_msg(gss_msg); 
} static void gss_encode_v0_msg(struct gss_upcall_msg *gss_msg, const struct cred *cred) { struct user_namespace *userns = cred->user_ns; uid_t uid = from_kuid_munged(userns, gss_msg->uid); memcpy(gss_msg->databuf, &uid, sizeof(uid)); gss_msg->msg.data = gss_msg->databuf; gss_msg->msg.len = sizeof(uid); BUILD_BUG_ON(sizeof(uid) > sizeof(gss_msg->databuf)); } static ssize_t gss_v0_upcall(struct file *file, struct rpc_pipe_msg *msg, char __user *buf, size_t buflen) { struct gss_upcall_msg *gss_msg = container_of(msg, struct gss_upcall_msg, msg); if (msg->copied == 0) gss_encode_v0_msg(gss_msg, file->f_cred); return rpc_pipe_generic_upcall(file, msg, buf, buflen); } static int gss_encode_v1_msg(struct gss_upcall_msg *gss_msg, const char *service_name, const char *target_name, const struct cred *cred) { struct user_namespace *userns = cred->user_ns; struct gss_api_mech *mech = gss_msg->auth->mech; char *p = gss_msg->databuf; size_t buflen = sizeof(gss_msg->databuf); int len; len = scnprintf(p, buflen, "mech=%s uid=%d", mech->gm_name, from_kuid_munged(userns, gss_msg->uid)); buflen -= len; p += len; gss_msg->msg.len = len; /* * target= is a full service principal that names the remote * identity that we are authenticating to. */ if (target_name) { len = scnprintf(p, buflen, " target=%s", target_name); buflen -= len; p += len; gss_msg->msg.len += len; } /* * gssd uses service= and srchost= to select a matching key from * the system's keytab to use as the source principal. * * service= is the service name part of the source principal, * or "*" (meaning choose any). * * srchost= is the hostname part of the source principal. When * not provided, gssd uses the local hostname. */ if (service_name) { char *c = strchr(service_name, '@'); if (!c) len = scnprintf(p, buflen, " service=%s", service_name); else len = scnprintf(p, buflen, " service=%.*s srchost=%s", (int)(c - service_name), service_name, c + 1); buflen -= len; p += len; gss_msg->msg.len += len; } if (mech->gm_upcall_enctypes) { len = scnprintf(p, buflen, " enctypes=%s", mech->gm_upcall_enctypes); buflen -= len; p += len; gss_msg->msg.len += len; } trace_rpcgss_upcall_msg(gss_msg->databuf); len = scnprintf(p, buflen, "\n"); if (len == 0) goto out_overflow; gss_msg->msg.len += len; gss_msg->msg.data = gss_msg->databuf; return 0; out_overflow: WARN_ON_ONCE(1); return -ENOMEM; } static ssize_t gss_v1_upcall(struct file *file, struct rpc_pipe_msg *msg, char __user *buf, size_t buflen) { struct gss_upcall_msg *gss_msg = container_of(msg, struct gss_upcall_msg, msg); int err; if (msg->copied == 0) { err = gss_encode_v1_msg(gss_msg, gss_msg->service_name, gss_msg->auth->target_name, file->f_cred); if (err) return err; } return rpc_pipe_generic_upcall(file, msg, buf, buflen); } static struct gss_upcall_msg * gss_alloc_msg(struct gss_auth *gss_auth, kuid_t uid, const char *service_name) { struct gss_upcall_msg *gss_msg; int vers; int err = -ENOMEM; gss_msg = kzalloc(sizeof(*gss_msg), GFP_KERNEL); if (gss_msg == NULL) goto err; vers = get_pipe_version(gss_auth->net); err = vers; if (err < 0) goto err_free_msg; gss_msg->pipe = gss_auth->gss_pipe[vers]->pipe; INIT_LIST_HEAD(&gss_msg->list); rpc_init_wait_queue(&gss_msg->rpc_waitqueue, "RPCSEC_GSS upcall waitq"); init_waitqueue_head(&gss_msg->waitqueue); refcount_set(&gss_msg->count, 1); gss_msg->uid = uid; gss_msg->auth = gss_auth; kref_get(&gss_auth->kref); if (service_name) { gss_msg->service_name = kstrdup_const(service_name, GFP_KERNEL); if (!gss_msg->service_name) { err = -ENOMEM; goto 
err_put_pipe_version; } } return gss_msg; err_put_pipe_version: put_pipe_version(gss_auth->net); err_free_msg: kfree(gss_msg); err: return ERR_PTR(err); } static struct gss_upcall_msg * gss_setup_upcall(struct gss_auth *gss_auth, struct rpc_cred *cred) { struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_upcall_msg *gss_new, *gss_msg; kuid_t uid = cred->cr_cred->fsuid; gss_new = gss_alloc_msg(gss_auth, uid, gss_cred->gc_principal); if (IS_ERR(gss_new)) return gss_new; gss_msg = gss_add_msg(gss_new); if (gss_msg == gss_new) { int res; refcount_inc(&gss_msg->count); res = rpc_queue_upcall(gss_new->pipe, &gss_new->msg); if (res) { gss_unhash_msg(gss_new); refcount_dec(&gss_msg->count); gss_release_msg(gss_new); gss_msg = ERR_PTR(res); } } else gss_release_msg(gss_new); return gss_msg; } static void warn_gssd(void) { dprintk("AUTH_GSS upcall failed. Please check user daemon is running.\n"); } static inline int gss_refresh_upcall(struct rpc_task *task) { struct rpc_cred *cred = task->tk_rqstp->rq_cred; struct gss_auth *gss_auth = container_of(cred->cr_auth, struct gss_auth, rpc_auth); struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_upcall_msg *gss_msg; struct rpc_pipe *pipe; int err = 0; gss_msg = gss_setup_upcall(gss_auth, cred); if (PTR_ERR(gss_msg) == -EAGAIN) { /* XXX: warning on the first, under the assumption we * shouldn't normally hit this case on a refresh. */ warn_gssd(); rpc_sleep_on_timeout(&pipe_version_rpc_waitqueue, task, NULL, jiffies + (15 * HZ)); err = -EAGAIN; goto out; } if (IS_ERR(gss_msg)) { err = PTR_ERR(gss_msg); goto out; } pipe = gss_msg->pipe; spin_lock(&pipe->lock); if (gss_cred->gc_upcall != NULL) rpc_sleep_on(&gss_cred->gc_upcall->rpc_waitqueue, task, NULL); else if (gss_msg->ctx == NULL && gss_msg->msg.errno >= 0) { gss_cred->gc_upcall = gss_msg; /* gss_upcall_callback will release the reference to gss_upcall_msg */ refcount_inc(&gss_msg->count); rpc_sleep_on(&gss_msg->rpc_waitqueue, task, gss_upcall_callback); } else { gss_handle_downcall_result(gss_cred, gss_msg); err = gss_msg->msg.errno; } spin_unlock(&pipe->lock); gss_release_msg(gss_msg); out: trace_rpcgss_upcall_result(from_kuid(&init_user_ns, cred->cr_cred->fsuid), err); return err; } static inline int gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred) { struct net *net = gss_auth->net; struct sunrpc_net *sn = net_generic(net, sunrpc_net_id); struct rpc_pipe *pipe; struct rpc_cred *cred = &gss_cred->gc_base; struct gss_upcall_msg *gss_msg; DEFINE_WAIT(wait); int err; retry: err = 0; /* if gssd is down, just skip upcalling altogether */ if (!gssd_running(net)) { warn_gssd(); err = -EACCES; goto out; } gss_msg = gss_setup_upcall(gss_auth, cred); if (PTR_ERR(gss_msg) == -EAGAIN) { err = wait_event_interruptible_timeout(pipe_version_waitqueue, sn->pipe_version >= 0, 15 * HZ); if (sn->pipe_version < 0) { warn_gssd(); err = -EACCES; } if (err < 0) goto out; goto retry; } if (IS_ERR(gss_msg)) { err = PTR_ERR(gss_msg); goto out; } pipe = gss_msg->pipe; for (;;) { prepare_to_wait(&gss_msg->waitqueue, &wait, TASK_KILLABLE); spin_lock(&pipe->lock); if (gss_msg->ctx != NULL || gss_msg->msg.errno < 0) { break; } spin_unlock(&pipe->lock); if (fatal_signal_pending(current)) { err = -ERESTARTSYS; goto out_intr; } schedule(); } if (gss_msg->ctx) { trace_rpcgss_ctx_init(gss_cred); gss_cred_set_ctx(cred, gss_msg->ctx); } else { err = gss_msg->msg.errno; } spin_unlock(&pipe->lock); out_intr: finish_wait(&gss_msg->waitqueue, &wait); 
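/* Drop the reference returned by gss_setup_upcall() */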
gss_release_msg(gss_msg); out: trace_rpcgss_upcall_result(from_kuid(&init_user_ns, cred->cr_cred->fsuid), err); return err; } static struct gss_upcall_msg * gss_find_downcall(struct rpc_pipe *pipe, kuid_t uid) { struct gss_upcall_msg *pos; list_for_each_entry(pos, &pipe->in_downcall, list) { if (!uid_eq(pos->uid, uid)) continue; if (!rpc_msg_is_inflight(&pos->msg)) continue; refcount_inc(&pos->count); return pos; } return NULL; } #define MSG_BUF_MAXSIZE 1024 static ssize_t gss_pipe_downcall(struct file *filp, const char __user *src, size_t mlen) { const void *p, *end; void *buf; struct gss_upcall_msg *gss_msg; struct rpc_pipe *pipe = RPC_I(file_inode(filp))->pipe; struct gss_cl_ctx *ctx; uid_t id; kuid_t uid; ssize_t err = -EFBIG; if (mlen > MSG_BUF_MAXSIZE) goto out; err = -ENOMEM; buf = kmalloc(mlen, GFP_KERNEL); if (!buf) goto out; err = -EFAULT; if (copy_from_user(buf, src, mlen)) goto err; end = (const void *)((char *)buf + mlen); p = simple_get_bytes(buf, end, &id, sizeof(id)); if (IS_ERR(p)) { err = PTR_ERR(p); goto err; } uid = make_kuid(current_user_ns(), id); if (!uid_valid(uid)) { err = -EINVAL; goto err; } err = -ENOMEM; ctx = gss_alloc_context(); if (ctx == NULL) goto err; err = -ENOENT; /* Find a matching upcall */ spin_lock(&pipe->lock); gss_msg = gss_find_downcall(pipe, uid); if (gss_msg == NULL) { spin_unlock(&pipe->lock); goto err_put_ctx; } list_del_init(&gss_msg->list); spin_unlock(&pipe->lock); p = gss_fill_context(p, end, ctx, gss_msg->auth->mech); if (IS_ERR(p)) { err = PTR_ERR(p); switch (err) { case -EACCES: case -EKEYEXPIRED: gss_msg->msg.errno = err; err = mlen; break; case -EFAULT: case -ENOMEM: case -EINVAL: case -ENOSYS: gss_msg->msg.errno = -EAGAIN; break; default: printk(KERN_CRIT "%s: bad return from " "gss_fill_context: %zd\n", __func__, err); gss_msg->msg.errno = -EIO; } goto err_release_msg; } gss_msg->ctx = gss_get_ctx(ctx); err = mlen; err_release_msg: spin_lock(&pipe->lock); __gss_unhash_msg(gss_msg); spin_unlock(&pipe->lock); gss_release_msg(gss_msg); err_put_ctx: gss_put_ctx(ctx); err: kfree(buf); out: return err; } static int gss_pipe_open(struct inode *inode, int new_version) { struct net *net = inode->i_sb->s_fs_info; struct sunrpc_net *sn = net_generic(net, sunrpc_net_id); int ret = 0; spin_lock(&pipe_version_lock); if (sn->pipe_version < 0) { /* First open of any gss pipe determines the version: */ sn->pipe_version = new_version; rpc_wake_up(&pipe_version_rpc_waitqueue); wake_up(&pipe_version_waitqueue); } else if (sn->pipe_version != new_version) { /* Trying to open a pipe of a different version */ ret = -EBUSY; goto out; } atomic_inc(&sn->pipe_users); out: spin_unlock(&pipe_version_lock); return ret; } static int gss_pipe_open_v0(struct inode *inode) { return gss_pipe_open(inode, 0); } static int gss_pipe_open_v1(struct inode *inode) { return gss_pipe_open(inode, 1); } static void gss_pipe_release(struct inode *inode) { struct net *net = inode->i_sb->s_fs_info; struct rpc_pipe *pipe = RPC_I(inode)->pipe; struct gss_upcall_msg *gss_msg; restart: spin_lock(&pipe->lock); list_for_each_entry(gss_msg, &pipe->in_downcall, list) { if (!list_empty(&gss_msg->msg.list)) continue; gss_msg->msg.errno = -EPIPE; refcount_inc(&gss_msg->count); __gss_unhash_msg(gss_msg); spin_unlock(&pipe->lock); gss_release_msg(gss_msg); goto restart; } spin_unlock(&pipe->lock); put_pipe_version(net); } static void gss_pipe_destroy_msg(struct rpc_pipe_msg *msg) { struct gss_upcall_msg *gss_msg = container_of(msg, struct gss_upcall_msg, msg); if (msg->errno < 0) { 
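/*
 * Take a temporary reference so the message survives being unhashed
 * (which drops the queue's reference) before we release it below.
 */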
refcount_inc(&gss_msg->count); gss_unhash_msg(gss_msg); if (msg->errno == -ETIMEDOUT) warn_gssd(); gss_release_msg(gss_msg); } gss_release_msg(gss_msg); } static void gss_pipe_dentry_destroy(struct dentry *dir, struct rpc_pipe_dir_object *pdo) { struct gss_pipe *gss_pipe = pdo->pdo_data; struct rpc_pipe *pipe = gss_pipe->pipe; if (pipe->dentry != NULL) { rpc_unlink(pipe->dentry); pipe->dentry = NULL; } } static int gss_pipe_dentry_create(struct dentry *dir, struct rpc_pipe_dir_object *pdo) { struct gss_pipe *p = pdo->pdo_data; struct dentry *dentry; dentry = rpc_mkpipe_dentry(dir, p->name, p->clnt, p->pipe); if (IS_ERR(dentry)) return PTR_ERR(dentry); p->pipe->dentry = dentry; return 0; } static const struct rpc_pipe_dir_object_ops gss_pipe_dir_object_ops = { .create = gss_pipe_dentry_create, .destroy = gss_pipe_dentry_destroy, }; static struct gss_pipe *gss_pipe_alloc(struct rpc_clnt *clnt, const char *name, const struct rpc_pipe_ops *upcall_ops) { struct gss_pipe *p; int err = -ENOMEM; p = kmalloc(sizeof(*p), GFP_KERNEL); if (p == NULL) goto err; p->pipe = rpc_mkpipe_data(upcall_ops, RPC_PIPE_WAIT_FOR_OPEN); if (IS_ERR(p->pipe)) { err = PTR_ERR(p->pipe); goto err_free_gss_pipe; } p->name = name; p->clnt = clnt; kref_init(&p->kref); rpc_init_pipe_dir_object(&p->pdo, &gss_pipe_dir_object_ops, p); return p; err_free_gss_pipe: kfree(p); err: return ERR_PTR(err); } struct gss_alloc_pdo { struct rpc_clnt *clnt; const char *name; const struct rpc_pipe_ops *upcall_ops; }; static int gss_pipe_match_pdo(struct rpc_pipe_dir_object *pdo, void *data) { struct gss_pipe *gss_pipe; struct gss_alloc_pdo *args = data; if (pdo->pdo_ops != &gss_pipe_dir_object_ops) return 0; gss_pipe = container_of(pdo, struct gss_pipe, pdo); if (strcmp(gss_pipe->name, args->name) != 0) return 0; if (!kref_get_unless_zero(&gss_pipe->kref)) return 0; return 1; } static struct rpc_pipe_dir_object *gss_pipe_alloc_pdo(void *data) { struct gss_pipe *gss_pipe; struct gss_alloc_pdo *args = data; gss_pipe = gss_pipe_alloc(args->clnt, args->name, args->upcall_ops); if (!IS_ERR(gss_pipe)) return &gss_pipe->pdo; return NULL; } static struct gss_pipe *gss_pipe_get(struct rpc_clnt *clnt, const char *name, const struct rpc_pipe_ops *upcall_ops) { struct net *net = rpc_net_ns(clnt); struct rpc_pipe_dir_object *pdo; struct gss_alloc_pdo args = { .clnt = clnt, .name = name, .upcall_ops = upcall_ops, }; pdo = rpc_find_or_alloc_pipe_dir_object(net, &clnt->cl_pipedir_objects, gss_pipe_match_pdo, gss_pipe_alloc_pdo, &args); if (pdo != NULL) return container_of(pdo, struct gss_pipe, pdo); return ERR_PTR(-ENOMEM); } static void __gss_pipe_free(struct gss_pipe *p) { struct rpc_clnt *clnt = p->clnt; struct net *net = rpc_net_ns(clnt); rpc_remove_pipe_dir_object(net, &clnt->cl_pipedir_objects, &p->pdo); rpc_destroy_pipe_data(p->pipe); kfree(p); } static void __gss_pipe_release(struct kref *kref) { struct gss_pipe *p = container_of(kref, struct gss_pipe, kref); __gss_pipe_free(p); } static void gss_pipe_free(struct gss_pipe *p) { if (p != NULL) kref_put(&p->kref, __gss_pipe_release); } /* * NOTE: we have the opportunity to use different * parameters based on the input flavor (which must be a pseudoflavor) */ static struct gss_auth * gss_create_new(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt) { rpc_authflavor_t flavor = args->pseudoflavor; struct gss_auth *gss_auth; struct gss_pipe *gss_pipe; struct rpc_auth * auth; int err = -ENOMEM; /* XXX? 
*/ if (!try_module_get(THIS_MODULE)) return ERR_PTR(err); if (!(gss_auth = kmalloc(sizeof(*gss_auth), GFP_KERNEL))) goto out_dec; INIT_HLIST_NODE(&gss_auth->hash); gss_auth->target_name = NULL; if (args->target_name) { gss_auth->target_name = kstrdup(args->target_name, GFP_KERNEL); if (gss_auth->target_name == NULL) goto err_free; } gss_auth->client = clnt; gss_auth->net = get_net_track(rpc_net_ns(clnt), &gss_auth->ns_tracker, GFP_KERNEL); err = -EINVAL; gss_auth->mech = gss_mech_get_by_pseudoflavor(flavor); if (!gss_auth->mech) goto err_put_net; gss_auth->service = gss_pseudoflavor_to_service(gss_auth->mech, flavor); if (gss_auth->service == 0) goto err_put_mech; if (!gssd_running(gss_auth->net)) goto err_put_mech; auth = &gss_auth->rpc_auth; auth->au_cslack = GSS_CRED_SLACK >> 2; BUILD_BUG_ON(GSS_KRB5_MAX_SLACK_NEEDED > RPC_MAX_AUTH_SIZE); auth->au_rslack = GSS_KRB5_MAX_SLACK_NEEDED >> 2; auth->au_verfsize = GSS_VERF_SLACK >> 2; auth->au_ralign = GSS_VERF_SLACK >> 2; __set_bit(RPCAUTH_AUTH_UPDATE_SLACK, &auth->au_flags); auth->au_ops = &authgss_ops; auth->au_flavor = flavor; if (gss_pseudoflavor_to_datatouch(gss_auth->mech, flavor)) __set_bit(RPCAUTH_AUTH_DATATOUCH, &auth->au_flags); refcount_set(&auth->au_count, 1); kref_init(&gss_auth->kref); err = rpcauth_init_credcache(auth); if (err) goto err_put_mech; /* * Note: if we created the old pipe first, then someone who * examined the directory at the right moment might conclude * that we supported only the old pipe. So we instead create * the new pipe first. */ gss_pipe = gss_pipe_get(clnt, "gssd", &gss_upcall_ops_v1); if (IS_ERR(gss_pipe)) { err = PTR_ERR(gss_pipe); goto err_destroy_credcache; } gss_auth->gss_pipe[1] = gss_pipe; gss_pipe = gss_pipe_get(clnt, gss_auth->mech->gm_name, &gss_upcall_ops_v0); if (IS_ERR(gss_pipe)) { err = PTR_ERR(gss_pipe); goto err_destroy_pipe_1; } gss_auth->gss_pipe[0] = gss_pipe; return gss_auth; err_destroy_pipe_1: gss_pipe_free(gss_auth->gss_pipe[1]); err_destroy_credcache: rpcauth_destroy_credcache(auth); err_put_mech: gss_mech_put(gss_auth->mech); err_put_net: put_net_track(gss_auth->net, &gss_auth->ns_tracker); err_free: kfree(gss_auth->target_name); kfree(gss_auth); out_dec: module_put(THIS_MODULE); trace_rpcgss_createauth(flavor, err); return ERR_PTR(err); } static void gss_free(struct gss_auth *gss_auth) { gss_pipe_free(gss_auth->gss_pipe[0]); gss_pipe_free(gss_auth->gss_pipe[1]); gss_mech_put(gss_auth->mech); put_net_track(gss_auth->net, &gss_auth->ns_tracker); kfree(gss_auth->target_name); kfree(gss_auth); module_put(THIS_MODULE); } static void gss_free_callback(struct kref *kref) { struct gss_auth *gss_auth = container_of(kref, struct gss_auth, kref); gss_free(gss_auth); } static void gss_put_auth(struct gss_auth *gss_auth) { kref_put(&gss_auth->kref, gss_free_callback); } static void gss_destroy(struct rpc_auth *auth) { struct gss_auth *gss_auth = container_of(auth, struct gss_auth, rpc_auth); if (hash_hashed(&gss_auth->hash)) { spin_lock(&gss_auth_hash_lock); hash_del(&gss_auth->hash); spin_unlock(&gss_auth_hash_lock); } gss_pipe_free(gss_auth->gss_pipe[0]); gss_auth->gss_pipe[0] = NULL; gss_pipe_free(gss_auth->gss_pipe[1]); gss_auth->gss_pipe[1] = NULL; rpcauth_destroy_credcache(auth); gss_put_auth(gss_auth); } /* * Auths may be shared between rpc clients that were cloned from a * common client with the same xprt, if they also share the flavor and * target_name. 
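 * (For example, clients created with rpc_clone_client() keep their
 * parent's transport and can therefore share its gss_auth.)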
* * The auth is looked up from the oldest parent sharing the same * cl_xprt, and the auth itself references only that common parent * (which is guaranteed to last as long as any of its descendants). */ static struct gss_auth * gss_auth_find_or_add_hashed(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt, struct gss_auth *new) { struct gss_auth *gss_auth; unsigned long hashval = (unsigned long)clnt; spin_lock(&gss_auth_hash_lock); hash_for_each_possible(gss_auth_hash_table, gss_auth, hash, hashval) { if (gss_auth->client != clnt) continue; if (gss_auth->rpc_auth.au_flavor != args->pseudoflavor) continue; if (gss_auth->target_name != args->target_name) { if (gss_auth->target_name == NULL) continue; if (args->target_name == NULL) continue; if (strcmp(gss_auth->target_name, args->target_name)) continue; } if (!refcount_inc_not_zero(&gss_auth->rpc_auth.au_count)) continue; goto out; } if (new) hash_add(gss_auth_hash_table, &new->hash, hashval); gss_auth = new; out: spin_unlock(&gss_auth_hash_lock); return gss_auth; } static struct gss_auth * gss_create_hashed(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt) { struct gss_auth *gss_auth; struct gss_auth *new; gss_auth = gss_auth_find_or_add_hashed(args, clnt, NULL); if (gss_auth != NULL) goto out; new = gss_create_new(args, clnt); if (IS_ERR(new)) return new; gss_auth = gss_auth_find_or_add_hashed(args, clnt, new); if (gss_auth != new) gss_destroy(&new->rpc_auth); out: return gss_auth; } static struct rpc_auth * gss_create(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt) { struct gss_auth *gss_auth; struct rpc_xprt_switch *xps = rcu_access_pointer(clnt->cl_xpi.xpi_xpswitch); while (clnt != clnt->cl_parent) { struct rpc_clnt *parent = clnt->cl_parent; /* Find the original parent for this transport */ if (rcu_access_pointer(parent->cl_xpi.xpi_xpswitch) != xps) break; clnt = parent; } gss_auth = gss_create_hashed(args, clnt); if (IS_ERR(gss_auth)) return ERR_CAST(gss_auth); return &gss_auth->rpc_auth; } static struct gss_cred * gss_dup_cred(struct gss_auth *gss_auth, struct gss_cred *gss_cred) { struct gss_cred *new; /* Make a copy of the cred so that we can reference count it */ new = kzalloc(sizeof(*gss_cred), GFP_KERNEL); if (new) { struct auth_cred acred = { .cred = gss_cred->gc_base.cr_cred, }; struct gss_cl_ctx *ctx = rcu_dereference_protected(gss_cred->gc_ctx, 1); rpcauth_init_cred(&new->gc_base, &acred, &gss_auth->rpc_auth, &gss_nullops); new->gc_base.cr_flags = 1UL << RPCAUTH_CRED_UPTODATE; new->gc_service = gss_cred->gc_service; new->gc_principal = gss_cred->gc_principal; kref_get(&gss_auth->kref); rcu_assign_pointer(new->gc_ctx, ctx); gss_get_ctx(ctx); } return new; } /* * gss_send_destroy_context will cause the RPCSEC_GSS to send a NULL RPC call * to the server with the GSS control procedure field set to * RPC_GSS_PROC_DESTROY. This should normally cause the server to release * all RPCSEC_GSS state associated with that context. 
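 * (Context destruction is the RPCSEC_GSS control procedure described
 * in RFC 2203, Section 5.4.)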
*/ static void gss_send_destroy_context(struct rpc_cred *cred) { struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_auth *gss_auth = container_of(cred->cr_auth, struct gss_auth, rpc_auth); struct gss_cl_ctx *ctx = rcu_dereference_protected(gss_cred->gc_ctx, 1); struct gss_cred *new; struct rpc_task *task; new = gss_dup_cred(gss_auth, gss_cred); if (new) { ctx->gc_proc = RPC_GSS_PROC_DESTROY; trace_rpcgss_ctx_destroy(gss_cred); task = rpc_call_null(gss_auth->client, &new->gc_base, RPC_TASK_ASYNC); if (!IS_ERR(task)) rpc_put_task(task); put_rpccred(&new->gc_base); } } /* gss_destroy_cred (and gss_free_ctx) are used to clean up after failure * to create a new cred or context, so they check that things have been * allocated before freeing them. */ static void gss_do_free_ctx(struct gss_cl_ctx *ctx) { gss_delete_sec_context(&ctx->gc_gss_ctx); kfree(ctx->gc_wire_ctx.data); kfree(ctx->gc_acceptor.data); kfree(ctx); } static void gss_free_ctx_callback(struct rcu_head *head) { struct gss_cl_ctx *ctx = container_of(head, struct gss_cl_ctx, gc_rcu); gss_do_free_ctx(ctx); } static void gss_free_ctx(struct gss_cl_ctx *ctx) { call_rcu(&ctx->gc_rcu, gss_free_ctx_callback); } static void gss_free_cred(struct gss_cred *gss_cred) { kfree(gss_cred); } static void gss_free_cred_callback(struct rcu_head *head) { struct gss_cred *gss_cred = container_of(head, struct gss_cred, gc_base.cr_rcu); gss_free_cred(gss_cred); } static void gss_destroy_nullcred(struct rpc_cred *cred) { struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_auth *gss_auth = container_of(cred->cr_auth, struct gss_auth, rpc_auth); struct gss_cl_ctx *ctx = rcu_dereference_protected(gss_cred->gc_ctx, 1); RCU_INIT_POINTER(gss_cred->gc_ctx, NULL); put_cred(cred->cr_cred); call_rcu(&cred->cr_rcu, gss_free_cred_callback); if (ctx) gss_put_ctx(ctx); gss_put_auth(gss_auth); } static void gss_destroy_cred(struct rpc_cred *cred) { if (test_and_clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) != 0) gss_send_destroy_context(cred); gss_destroy_nullcred(cred); } static int gss_hash_cred(struct auth_cred *acred, unsigned int hashbits) { return hash_64(from_kuid(&init_user_ns, acred->cred->fsuid), hashbits); } /* * Lookup RPCSEC_GSS cred for the current process */ static struct rpc_cred *gss_lookup_cred(struct rpc_auth *auth, struct auth_cred *acred, int flags) { return rpcauth_lookup_credcache(auth, acred, flags, rpc_task_gfp_mask()); } static struct rpc_cred * gss_create_cred(struct rpc_auth *auth, struct auth_cred *acred, int flags, gfp_t gfp) { struct gss_auth *gss_auth = container_of(auth, struct gss_auth, rpc_auth); struct gss_cred *cred = NULL; int err = -ENOMEM; if (!(cred = kzalloc(sizeof(*cred), gfp))) goto out_err; rpcauth_init_cred(&cred->gc_base, acred, auth, &gss_credops); /* * Note: in order to force a call to call_refresh(), we deliberately * fail to flag the credential as RPCAUTH_CRED_UPTODATE. 
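 * Leaving only RPCAUTH_CRED_NEW set means the first use of this
 * credential goes through gss_refresh() and triggers an upcall to gssd.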
*/ cred->gc_base.cr_flags = 1UL << RPCAUTH_CRED_NEW; cred->gc_service = gss_auth->service; cred->gc_principal = acred->principal; kref_get(&gss_auth->kref); return &cred->gc_base; out_err: return ERR_PTR(err); } static int gss_cred_init(struct rpc_auth *auth, struct rpc_cred *cred) { struct gss_auth *gss_auth = container_of(auth, struct gss_auth, rpc_auth); struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); int err; do { err = gss_create_upcall(gss_auth, gss_cred); } while (err == -EAGAIN); return err; } static char * gss_stringify_acceptor(struct rpc_cred *cred) { char *string = NULL; struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_cl_ctx *ctx; unsigned int len; struct xdr_netobj *acceptor; rcu_read_lock(); ctx = rcu_dereference(gss_cred->gc_ctx); if (!ctx) goto out; len = ctx->gc_acceptor.len; rcu_read_unlock(); /* no point if there's no string */ if (!len) return NULL; realloc: string = kmalloc(len + 1, GFP_KERNEL); if (!string) return NULL; rcu_read_lock(); ctx = rcu_dereference(gss_cred->gc_ctx); /* did the ctx disappear or was it replaced by one with no acceptor? */ if (!ctx || !ctx->gc_acceptor.len) { kfree(string); string = NULL; goto out; } acceptor = &ctx->gc_acceptor; /* * Did we find a new acceptor that's longer than the original? Allocate * a longer buffer and try again. */ if (len < acceptor->len) { len = acceptor->len; rcu_read_unlock(); kfree(string); goto realloc; } memcpy(string, acceptor->data, acceptor->len); string[acceptor->len] = '\0'; out: rcu_read_unlock(); return string; } /* * Returns -EACCES if GSS context is NULL or will expire within the * timeout (gss_key_expire_timeo, in seconds) */ static int gss_key_timeout(struct rpc_cred *rc) { struct gss_cred *gss_cred = container_of(rc, struct gss_cred, gc_base); struct gss_cl_ctx *ctx; unsigned long timeout = jiffies + (gss_key_expire_timeo * HZ); int ret = 0; rcu_read_lock(); ctx = rcu_dereference(gss_cred->gc_ctx); if (!ctx || time_after(timeout, ctx->gc_expiry)) ret = -EACCES; rcu_read_unlock(); return ret; } static int gss_match(struct auth_cred *acred, struct rpc_cred *rc, int flags) { struct gss_cred *gss_cred = container_of(rc, struct gss_cred, gc_base); struct gss_cl_ctx *ctx; int ret; if (test_bit(RPCAUTH_CRED_NEW, &rc->cr_flags)) goto out; /* Don't match with creds that have expired. */ rcu_read_lock(); ctx = rcu_dereference(gss_cred->gc_ctx); if (!ctx || time_after(jiffies, ctx->gc_expiry)) { rcu_read_unlock(); return 0; } rcu_read_unlock(); if (!test_bit(RPCAUTH_CRED_UPTODATE, &rc->cr_flags)) return 0; out: if (acred->principal != NULL) { if (gss_cred->gc_principal == NULL) return 0; ret = strcmp(acred->principal, gss_cred->gc_principal) == 0; } else { if (gss_cred->gc_principal != NULL) return 0; ret = uid_eq(rc->cr_cred->fsuid, acred->cred->fsuid); } return ret; } /* * Marshal credentials. * * The expensive part is computing the verifier. We can't cache a * pre-computed version of the verifier because the seqno, which * is different every time, is included in the MIC.
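 *
 * A sketch of what gss_marshal() below emits, in XDR words (see
 * RFC 2203):
 *
 *	RPC_AUTH_GSS | cred length | RPC_GSS_VERSION | gss_proc |
 *	seq_num | service | handle<>
 *
 * followed by a verifier whose body is the MIC computed over the RPC
 * header, from the xid through the end of the credential.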
*/ static int gss_marshal(struct rpc_task *task, struct xdr_stream *xdr) { struct rpc_rqst *req = task->tk_rqstp; struct rpc_cred *cred = req->rq_cred; struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_cl_ctx *ctx = gss_cred_get_ctx(cred); __be32 *p, *cred_len; u32 maj_stat = 0; struct xdr_netobj mic; struct kvec iov; struct xdr_buf verf_buf; int status; /* Credential */ p = xdr_reserve_space(xdr, 7 * sizeof(*p) + ctx->gc_wire_ctx.len); if (!p) goto marshal_failed; *p++ = rpc_auth_gss; cred_len = p++; spin_lock(&ctx->gc_seq_lock); req->rq_seqno = (ctx->gc_seq < MAXSEQ) ? ctx->gc_seq++ : MAXSEQ; spin_unlock(&ctx->gc_seq_lock); if (req->rq_seqno == MAXSEQ) goto expired; trace_rpcgss_seqno(task); *p++ = cpu_to_be32(RPC_GSS_VERSION); *p++ = cpu_to_be32(ctx->gc_proc); *p++ = cpu_to_be32(req->rq_seqno); *p++ = cpu_to_be32(gss_cred->gc_service); p = xdr_encode_netobj(p, &ctx->gc_wire_ctx); *cred_len = cpu_to_be32((p - (cred_len + 1)) << 2); /* Verifier */ /* We compute the checksum for the verifier over the xdr-encoded bytes * starting with the xid and ending at the end of the credential: */ iov.iov_base = req->rq_snd_buf.head[0].iov_base; iov.iov_len = (u8 *)p - (u8 *)iov.iov_base; xdr_buf_from_iov(&iov, &verf_buf); p = xdr_reserve_space(xdr, sizeof(*p)); if (!p) goto marshal_failed; *p++ = rpc_auth_gss; mic.data = (u8 *)(p + 1); maj_stat = gss_get_mic(ctx->gc_gss_ctx, &verf_buf, &mic); if (maj_stat == GSS_S_CONTEXT_EXPIRED) goto expired; else if (maj_stat != 0) goto bad_mic; if (xdr_stream_encode_opaque_inline(xdr, (void **)&p, mic.len) < 0) goto marshal_failed; status = 0; out: gss_put_ctx(ctx); return status; expired: clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); status = -EKEYEXPIRED; goto out; marshal_failed: status = -EMSGSIZE; goto out; bad_mic: trace_rpcgss_get_mic(task, maj_stat); status = -EIO; goto out; } static int gss_renew_cred(struct rpc_task *task) { struct rpc_cred *oldcred = task->tk_rqstp->rq_cred; struct gss_cred *gss_cred = container_of(oldcred, struct gss_cred, gc_base); struct rpc_auth *auth = oldcred->cr_auth; struct auth_cred acred = { .cred = oldcred->cr_cred, .principal = gss_cred->gc_principal, }; struct rpc_cred *new; new = gss_lookup_cred(auth, &acred, RPCAUTH_LOOKUP_NEW); if (IS_ERR(new)) return PTR_ERR(new); task->tk_rqstp->rq_cred = new; put_rpccred(oldcred); return 0; } static int gss_cred_is_negative_entry(struct rpc_cred *cred) { if (test_bit(RPCAUTH_CRED_NEGATIVE, &cred->cr_flags)) { unsigned long now = jiffies; unsigned long begin, expire; struct gss_cred *gss_cred; gss_cred = container_of(cred, struct gss_cred, gc_base); begin = gss_cred->gc_upcall_timestamp; expire = begin + gss_expired_cred_retry_delay * HZ; if (time_in_range_open(now, begin, expire)) return 1; } return 0; } /* * Refresh credentials. 
XXX - finish */ static int gss_refresh(struct rpc_task *task) { struct rpc_cred *cred = task->tk_rqstp->rq_cred; int ret = 0; if (gss_cred_is_negative_entry(cred)) return -EKEYEXPIRED; if (!test_bit(RPCAUTH_CRED_NEW, &cred->cr_flags) && !test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags)) { ret = gss_renew_cred(task); if (ret < 0) goto out; cred = task->tk_rqstp->rq_cred; } if (test_bit(RPCAUTH_CRED_NEW, &cred->cr_flags)) ret = gss_refresh_upcall(task); out: return ret; } /* Dummy refresh routine: used only when destroying the context */ static int gss_refresh_null(struct rpc_task *task) { return 0; } static int gss_validate(struct rpc_task *task, struct xdr_stream *xdr) { struct rpc_cred *cred = task->tk_rqstp->rq_cred; struct gss_cl_ctx *ctx = gss_cred_get_ctx(cred); __be32 *p, *seq = NULL; struct kvec iov; struct xdr_buf verf_buf; struct xdr_netobj mic; u32 len, maj_stat; int status; p = xdr_inline_decode(xdr, 2 * sizeof(*p)); if (!p) goto validate_failed; if (*p++ != rpc_auth_gss) goto validate_failed; len = be32_to_cpup(p); if (len > RPC_MAX_AUTH_SIZE) goto validate_failed; p = xdr_inline_decode(xdr, len); if (!p) goto validate_failed; seq = kmalloc(4, GFP_KERNEL); if (!seq) goto validate_failed; *seq = cpu_to_be32(task->tk_rqstp->rq_seqno); iov.iov_base = seq; iov.iov_len = 4; xdr_buf_from_iov(&iov, &verf_buf); mic.data = (u8 *)p; mic.len = len; maj_stat = gss_verify_mic(ctx->gc_gss_ctx, &verf_buf, &mic); if (maj_stat == GSS_S_CONTEXT_EXPIRED) clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); if (maj_stat) goto bad_mic; /* We leave it to unwrap to calculate au_rslack. For now we just * calculate the length of the verifier: */ if (test_bit(RPCAUTH_AUTH_UPDATE_SLACK, &cred->cr_auth->au_flags)) cred->cr_auth->au_verfsize = XDR_QUADLEN(len) + 2; status = 0; out: gss_put_ctx(ctx); kfree(seq); return status; validate_failed: status = -EIO; goto out; bad_mic: trace_rpcgss_verify_mic(task, maj_stat); status = -EACCES; goto out; } static noinline_for_stack int gss_wrap_req_integ(struct rpc_cred *cred, struct gss_cl_ctx *ctx, struct rpc_task *task, struct xdr_stream *xdr) { struct rpc_rqst *rqstp = task->tk_rqstp; struct xdr_buf integ_buf, *snd_buf = &rqstp->rq_snd_buf; struct xdr_netobj mic; __be32 *p, *integ_len; u32 offset, maj_stat; p = xdr_reserve_space(xdr, 2 * sizeof(*p)); if (!p) goto wrap_failed; integ_len = p++; *p = cpu_to_be32(rqstp->rq_seqno); if (rpcauth_wrap_req_encode(task, xdr)) goto wrap_failed; offset = (u8 *)p - (u8 *)snd_buf->head[0].iov_base; if (xdr_buf_subsegment(snd_buf, &integ_buf, offset, snd_buf->len - offset)) goto wrap_failed; *integ_len = cpu_to_be32(integ_buf.len); p = xdr_reserve_space(xdr, 0); if (!p) goto wrap_failed; mic.data = (u8 *)(p + 1); maj_stat = gss_get_mic(ctx->gc_gss_ctx, &integ_buf, &mic); if (maj_stat == GSS_S_CONTEXT_EXPIRED) clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); else if (maj_stat) goto bad_mic; /* Check that the trailing MIC fit in the buffer, after the fact */ if (xdr_stream_encode_opaque_inline(xdr, (void **)&p, mic.len) < 0) goto wrap_failed; return 0; wrap_failed: return -EMSGSIZE; bad_mic: trace_rpcgss_get_mic(task, maj_stat); return -EIO; } static void priv_release_snd_buf(struct rpc_rqst *rqstp) { int i; for (i=0; i < rqstp->rq_enc_pages_num; i++) __free_page(rqstp->rq_enc_pages[i]); kfree(rqstp->rq_enc_pages); rqstp->rq_release_snd_buf = NULL; } static int alloc_enc_pages(struct rpc_rqst *rqstp) { struct xdr_buf *snd_buf = &rqstp->rq_snd_buf; int first, last, i; if (rqstp->rq_release_snd_buf) 
rqstp->rq_release_snd_buf(rqstp); if (snd_buf->page_len == 0) { rqstp->rq_enc_pages_num = 0; return 0; } first = snd_buf->page_base >> PAGE_SHIFT; last = (snd_buf->page_base + snd_buf->page_len - 1) >> PAGE_SHIFT; rqstp->rq_enc_pages_num = last - first + 1 + 1; rqstp->rq_enc_pages = kmalloc_array(rqstp->rq_enc_pages_num, sizeof(struct page *), GFP_KERNEL); if (!rqstp->rq_enc_pages) goto out; for (i=0; i < rqstp->rq_enc_pages_num; i++) { rqstp->rq_enc_pages[i] = alloc_page(GFP_KERNEL); if (rqstp->rq_enc_pages[i] == NULL) goto out_free; } rqstp->rq_release_snd_buf = priv_release_snd_buf; return 0; out_free: rqstp->rq_enc_pages_num = i; priv_release_snd_buf(rqstp); out: return -EAGAIN; } static noinline_for_stack int gss_wrap_req_priv(struct rpc_cred *cred, struct gss_cl_ctx *ctx, struct rpc_task *task, struct xdr_stream *xdr) { struct rpc_rqst *rqstp = task->tk_rqstp; struct xdr_buf *snd_buf = &rqstp->rq_snd_buf; u32 pad, offset, maj_stat; int status; __be32 *p, *opaque_len; struct page **inpages; int first; struct kvec *iov; status = -EIO; p = xdr_reserve_space(xdr, 2 * sizeof(*p)); if (!p) goto wrap_failed; opaque_len = p++; *p = cpu_to_be32(rqstp->rq_seqno); if (rpcauth_wrap_req_encode(task, xdr)) goto wrap_failed; status = alloc_enc_pages(rqstp); if (unlikely(status)) goto wrap_failed; first = snd_buf->page_base >> PAGE_SHIFT; inpages = snd_buf->pages + first; snd_buf->pages = rqstp->rq_enc_pages; snd_buf->page_base -= first << PAGE_SHIFT; /* * Move the tail into its own page, in case gss_wrap needs * more space in the head when wrapping. * * Still... Why can't gss_wrap just slide the tail down? */ if (snd_buf->page_len || snd_buf->tail[0].iov_len) { char *tmp; tmp = page_address(rqstp->rq_enc_pages[rqstp->rq_enc_pages_num - 1]); memcpy(tmp, snd_buf->tail[0].iov_base, snd_buf->tail[0].iov_len); snd_buf->tail[0].iov_base = tmp; } offset = (u8 *)p - (u8 *)snd_buf->head[0].iov_base; maj_stat = gss_wrap(ctx->gc_gss_ctx, offset, snd_buf, inpages); /* slack space should prevent this ever happening: */ if (unlikely(snd_buf->len > snd_buf->buflen)) { status = -EIO; goto wrap_failed; } /* We're assuming that when GSS_S_CONTEXT_EXPIRED, the encryption was * done anyway, so it's safe to put the request on the wire: */ if (maj_stat == GSS_S_CONTEXT_EXPIRED) clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); else if (maj_stat) goto bad_wrap; *opaque_len = cpu_to_be32(snd_buf->len - offset); /* guess whether the pad goes into the head or the tail: */ if (snd_buf->page_len || snd_buf->tail[0].iov_len) iov = snd_buf->tail; else iov = snd_buf->head; p = iov->iov_base + iov->iov_len; pad = xdr_pad_size(snd_buf->len - offset); memset(p, 0, pad); iov->iov_len += pad; snd_buf->len += pad; return 0; wrap_failed: return status; bad_wrap: trace_rpcgss_wrap(task, maj_stat); return -EIO; } static int gss_wrap_req(struct rpc_task *task, struct xdr_stream *xdr) { struct rpc_cred *cred = task->tk_rqstp->rq_cred; struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_cl_ctx *ctx = gss_cred_get_ctx(cred); int status; status = -EIO; if (ctx->gc_proc != RPC_GSS_PROC_DATA) { /* The spec seems a little ambiguous here, but I think that not * wrapping context destruction requests makes the most sense. 
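 * The switch statement below then maps the three RFC 2203 service
 * levels (none, integrity, privacy) onto their respective request
 * encoders.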
*/ status = rpcauth_wrap_req_encode(task, xdr); goto out; } switch (gss_cred->gc_service) { case RPC_GSS_SVC_NONE: status = rpcauth_wrap_req_encode(task, xdr); break; case RPC_GSS_SVC_INTEGRITY: status = gss_wrap_req_integ(cred, ctx, task, xdr); break; case RPC_GSS_SVC_PRIVACY: status = gss_wrap_req_priv(cred, ctx, task, xdr); break; default: status = -EIO; } out: gss_put_ctx(ctx); return status; } /** * gss_update_rslack - Possibly update RPC receive buffer size estimates * @task: rpc_task for incoming RPC Reply being unwrapped * @cred: controlling rpc_cred for @task * @before: XDR words needed before each RPC Reply message * @after: XDR words needed following each RPC Reply message * */ static void gss_update_rslack(struct rpc_task *task, struct rpc_cred *cred, unsigned int before, unsigned int after) { struct rpc_auth *auth = cred->cr_auth; if (test_and_clear_bit(RPCAUTH_AUTH_UPDATE_SLACK, &auth->au_flags)) { auth->au_ralign = auth->au_verfsize + before; auth->au_rslack = auth->au_verfsize + after; trace_rpcgss_update_slack(task, auth); } } static int gss_unwrap_resp_auth(struct rpc_task *task, struct rpc_cred *cred) { gss_update_rslack(task, cred, 0, 0); return 0; } /* * RFC 2203, Section 5.3.2.2 * * struct rpc_gss_integ_data { * opaque databody_integ<>; * opaque checksum<>; * }; * * struct rpc_gss_data_t { * unsigned int seq_num; * proc_req_arg_t arg; * }; */ static noinline_for_stack int gss_unwrap_resp_integ(struct rpc_task *task, struct rpc_cred *cred, struct gss_cl_ctx *ctx, struct rpc_rqst *rqstp, struct xdr_stream *xdr) { struct xdr_buf gss_data, *rcv_buf = &rqstp->rq_rcv_buf; u32 len, offset, seqno, maj_stat; struct xdr_netobj mic; int ret; ret = -EIO; mic.data = NULL; /* opaque databody_integ<>; */ if (xdr_stream_decode_u32(xdr, &len)) goto unwrap_failed; if (len & 3) goto unwrap_failed; offset = rcv_buf->len - xdr_stream_remaining(xdr); if (xdr_stream_decode_u32(xdr, &seqno)) goto unwrap_failed; if (seqno != rqstp->rq_seqno) goto bad_seqno; if (xdr_buf_subsegment(rcv_buf, &gss_data, offset, len)) goto unwrap_failed; /* * The xdr_stream now points to the beginning of the * upper layer payload, to be passed below to * rpcauth_unwrap_resp_decode(). The checksum, which * follows the upper layer payload in @rcv_buf, is * located and parsed without updating the xdr_stream. 
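 * (That out-of-band parse is done below with xdr_decode_word() and * read_bytes_from_xdr_buf(), both of which take an explicit offset into * @rcv_buf rather than consuming stream position.)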
*/ /* opaque checksum<>; */ offset += len; if (xdr_decode_word(rcv_buf, offset, &len)) goto unwrap_failed; offset += sizeof(__be32); if (offset + len > rcv_buf->len) goto unwrap_failed; mic.len = len; mic.data = kmalloc(len, GFP_KERNEL); if (ZERO_OR_NULL_PTR(mic.data)) goto unwrap_failed; if (read_bytes_from_xdr_buf(rcv_buf, offset, mic.data, mic.len)) goto unwrap_failed; maj_stat = gss_verify_mic(ctx->gc_gss_ctx, &gss_data, &mic); if (maj_stat == GSS_S_CONTEXT_EXPIRED) clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); if (maj_stat != GSS_S_COMPLETE) goto bad_mic; gss_update_rslack(task, cred, 2, 2 + 1 + XDR_QUADLEN(mic.len)); ret = 0; out: kfree(mic.data); return ret; unwrap_failed: trace_rpcgss_unwrap_failed(task); goto out; bad_seqno: trace_rpcgss_bad_seqno(task, rqstp->rq_seqno, seqno); goto out; bad_mic: trace_rpcgss_verify_mic(task, maj_stat); goto out; } static noinline_for_stack int gss_unwrap_resp_priv(struct rpc_task *task, struct rpc_cred *cred, struct gss_cl_ctx *ctx, struct rpc_rqst *rqstp, struct xdr_stream *xdr) { struct xdr_buf *rcv_buf = &rqstp->rq_rcv_buf; struct kvec *head = rqstp->rq_rcv_buf.head; u32 offset, opaque_len, maj_stat; __be32 *p; p = xdr_inline_decode(xdr, 2 * sizeof(*p)); if (unlikely(!p)) goto unwrap_failed; opaque_len = be32_to_cpup(p++); offset = (u8 *)(p) - (u8 *)head->iov_base; if (offset + opaque_len > rcv_buf->len) goto unwrap_failed; maj_stat = gss_unwrap(ctx->gc_gss_ctx, offset, offset + opaque_len, rcv_buf); if (maj_stat == GSS_S_CONTEXT_EXPIRED) clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); if (maj_stat != GSS_S_COMPLETE) goto bad_unwrap; /* gss_unwrap decrypted the sequence number */ if (be32_to_cpup(p++) != rqstp->rq_seqno) goto bad_seqno; /* gss_unwrap redacts the opaque blob from the head iovec. * rcv_buf has changed, thus the stream needs to be reset. 
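 * (Hence the xdr_init_decode() call just below, which re-primes the * stream at @p once the opaque wrapper has been stripped.)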
*/ xdr_init_decode(xdr, rcv_buf, p, rqstp); gss_update_rslack(task, cred, 2 + ctx->gc_gss_ctx->align, 2 + ctx->gc_gss_ctx->slack); return 0; unwrap_failed: trace_rpcgss_unwrap_failed(task); return -EIO; bad_seqno: trace_rpcgss_bad_seqno(task, rqstp->rq_seqno, be32_to_cpup(--p)); return -EIO; bad_unwrap: trace_rpcgss_unwrap(task, maj_stat); return -EIO; } static bool gss_seq_is_newer(u32 new, u32 old) { return (s32)(new - old) > 0; } static bool gss_xmit_need_reencode(struct rpc_task *task) { struct rpc_rqst *req = task->tk_rqstp; struct rpc_cred *cred = req->rq_cred; struct gss_cl_ctx *ctx = gss_cred_get_ctx(cred); u32 win, seq_xmit = 0; bool ret = true; if (!ctx) goto out; if (gss_seq_is_newer(req->rq_seqno, READ_ONCE(ctx->gc_seq))) goto out_ctx; seq_xmit = READ_ONCE(ctx->gc_seq_xmit); while (gss_seq_is_newer(req->rq_seqno, seq_xmit)) { u32 tmp = seq_xmit; seq_xmit = cmpxchg(&ctx->gc_seq_xmit, tmp, req->rq_seqno); if (seq_xmit == tmp) { ret = false; goto out_ctx; } } win = ctx->gc_win; if (win > 0) ret = !gss_seq_is_newer(req->rq_seqno, seq_xmit - win); out_ctx: gss_put_ctx(ctx); out: trace_rpcgss_need_reencode(task, seq_xmit, ret); return ret; } static int gss_unwrap_resp(struct rpc_task *task, struct xdr_stream *xdr) { struct rpc_rqst *rqstp = task->tk_rqstp; struct rpc_cred *cred = rqstp->rq_cred; struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base); struct gss_cl_ctx *ctx = gss_cred_get_ctx(cred); int status = -EIO; if (ctx->gc_proc != RPC_GSS_PROC_DATA) goto out_decode; switch (gss_cred->gc_service) { case RPC_GSS_SVC_NONE: status = gss_unwrap_resp_auth(task, cred); break; case RPC_GSS_SVC_INTEGRITY: status = gss_unwrap_resp_integ(task, cred, ctx, rqstp, xdr); break; case RPC_GSS_SVC_PRIVACY: status = gss_unwrap_resp_priv(task, cred, ctx, rqstp, xdr); break; } if (status) goto out; out_decode: status = rpcauth_unwrap_resp_decode(task, xdr); out: gss_put_ctx(ctx); return status; } static const struct rpc_authops authgss_ops = { .owner = THIS_MODULE, .au_flavor = RPC_AUTH_GSS, .au_name = "RPCSEC_GSS", .create = gss_create, .destroy = gss_destroy, .hash_cred = gss_hash_cred, .lookup_cred = gss_lookup_cred, .crcreate = gss_create_cred, .info2flavor = gss_mech_info2flavor, .flavor2info = gss_mech_flavor2info, }; static const struct rpc_credops gss_credops = { .cr_name = "AUTH_GSS", .crdestroy = gss_destroy_cred, .cr_init = gss_cred_init, .crmatch = gss_match, .crmarshal = gss_marshal, .crrefresh = gss_refresh, .crvalidate = gss_validate, .crwrap_req = gss_wrap_req, .crunwrap_resp = gss_unwrap_resp, .crkey_timeout = gss_key_timeout, .crstringify_acceptor = gss_stringify_acceptor, .crneed_reencode = gss_xmit_need_reencode, }; static const struct rpc_credops gss_nullops = { .cr_name = "AUTH_GSS", .crdestroy = gss_destroy_nullcred, .crmatch = gss_match, .crmarshal = gss_marshal, .crrefresh = gss_refresh_null, .crvalidate = gss_validate, .crwrap_req = gss_wrap_req, .crunwrap_resp = gss_unwrap_resp, .crstringify_acceptor = gss_stringify_acceptor, }; static const struct rpc_pipe_ops gss_upcall_ops_v0 = { .upcall = gss_v0_upcall, .downcall = gss_pipe_downcall, .destroy_msg = gss_pipe_destroy_msg, .open_pipe = gss_pipe_open_v0, .release_pipe = gss_pipe_release, }; static const struct rpc_pipe_ops gss_upcall_ops_v1 = { .upcall = gss_v1_upcall, .downcall = gss_pipe_downcall, .destroy_msg = gss_pipe_destroy_msg, .open_pipe = gss_pipe_open_v1, .release_pipe = gss_pipe_release, }; static __net_init int rpcsec_gss_init_net(struct net *net) { return gss_svc_init_net(net); } static 
__net_exit void rpcsec_gss_exit_net(struct net *net) { gss_svc_shutdown_net(net); } static struct pernet_operations rpcsec_gss_net_ops = { .init = rpcsec_gss_init_net, .exit = rpcsec_gss_exit_net, }; /* * Initialize RPCSEC_GSS module */ static int __init init_rpcsec_gss(void) { int err = 0; err = rpcauth_register(&authgss_ops); if (err) goto out; err = gss_svc_init(); if (err) goto out_unregister; err = register_pernet_subsys(&rpcsec_gss_net_ops); if (err) goto out_svc_exit; rpc_init_wait_queue(&pipe_version_rpc_waitqueue, "gss pipe version"); return 0; out_svc_exit: gss_svc_shutdown(); out_unregister: rpcauth_unregister(&authgss_ops); out: return err; } static void __exit exit_rpcsec_gss(void) { unregister_pernet_subsys(&rpcsec_gss_net_ops); gss_svc_shutdown(); rpcauth_unregister(&authgss_ops); rcu_barrier(); /* Wait for completion of call_rcu()'s */ } MODULE_ALIAS("rpc-auth-6"); MODULE_DESCRIPTION("Sun RPC Kerberos RPCSEC_GSS client authentication"); MODULE_LICENSE("GPL"); module_param_named(expired_cred_retry_delay, gss_expired_cred_retry_delay, uint, 0644); MODULE_PARM_DESC(expired_cred_retry_delay, "Timeout (in seconds) until " "the RPC engine retries an expired credential"); module_param_named(key_expire_timeo, gss_key_expire_timeo, uint, 0644); MODULE_PARM_DESC(key_expire_timeo, "Time (in seconds) at the end of a " "credential key's lifetime where the NFS layer cleans up " "prior to key expiration"); module_init(init_rpcsec_gss) module_exit(exit_rpcsec_gss)
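/*
 * A minimal, self-contained user-space sketch of the wraparound-safe
 * comparison performed by gss_seq_is_newer() above on GSS sequence
 * numbers; seq_is_newer() and seq_selftest() are illustrative names,
 * not part of the kernel API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool seq_is_newer(uint32_t new, uint32_t old)
{
	/* The unsigned subtraction wraps modulo 2^32; reinterpreting the
	 * result as signed yields "newer" for anything less than half the
	 * number space ahead of @old, even across the wrap point. */
	return (int32_t)(new - old) > 0;
}

static void seq_selftest(void)
{
	assert(seq_is_newer(2, 1));
	assert(!seq_is_newer(1, 2));
	assert(seq_is_newer(0, UINT32_MAX));	/* 0 follows 0xffffffff */
	assert(!seq_is_newer(UINT32_MAX, 0));
}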
/* SPDX-License-Identifier: GPL-2.0 */ #ifndef ARCH_X86_KVM_X86_H #define ARCH_X86_KVM_X86_H #include <linux/kvm_host.h> #include <asm/fpu/xstate.h> #include <asm/mce.h> #include <asm/pvclock.h> #include "kvm_cache_regs.h" #include "kvm_emulate.h" #include "cpuid.h" struct kvm_caps { /* control of guest tsc rate supported? */ bool has_tsc_control; /* maximum supported tsc_khz for guests */ u32 max_guest_tsc_khz; /* number of bits of the fractional part of the TSC scaling ratio */ u8 tsc_scaling_ratio_frac_bits; /* maximum allowed value of TSC scaling ratio */ u64 max_tsc_scaling_ratio; /* 1ull << kvm_caps.tsc_scaling_ratio_frac_bits */ u64 default_tsc_scaling_ratio; /* bus lock detection supported? */ bool has_bus_lock_exit; /* notify VM exit supported?
*/ bool has_notify_vmexit; /* bit mask of VM types */ u32 supported_vm_types; u64 supported_mce_cap; u64 supported_xcr0; u64 supported_xss; u64 supported_perf_cap; }; struct kvm_host_values { /* * The host's raw MAXPHYADDR, i.e. the number of non-reserved physical * address bits irrespective of features that repurpose legal bits, * e.g. MKTME. */ u8 maxphyaddr; u64 efer; u64 xcr0; u64 xss; u64 arch_capabilities; }; void kvm_spurious_fault(void); #define KVM_NESTED_VMENTER_CONSISTENCY_CHECK(consistency_check) \ ({ \ bool failed = (consistency_check); \ if (failed) \ trace_kvm_nested_vmenter_failed(#consistency_check, 0); \ failed; \ }) /* * The first...last VMX feature MSRs that are emulated by KVM. This may or may * not cover all known VMX MSRs, as KVM doesn't emulate an MSR until there's an * associated feature that KVM supports for nested virtualization. */ #define KVM_FIRST_EMULATED_VMX_MSR MSR_IA32_VMX_BASIC #define KVM_LAST_EMULATED_VMX_MSR MSR_IA32_VMX_VMFUNC #define KVM_DEFAULT_PLE_GAP 128 #define KVM_VMX_DEFAULT_PLE_WINDOW 4096 #define KVM_DEFAULT_PLE_WINDOW_GROW 2 #define KVM_DEFAULT_PLE_WINDOW_SHRINK 0 #define KVM_VMX_DEFAULT_PLE_WINDOW_MAX UINT_MAX #define KVM_SVM_DEFAULT_PLE_WINDOW_MAX USHRT_MAX #define KVM_SVM_DEFAULT_PLE_WINDOW 3000 static inline unsigned int __grow_ple_window(unsigned int val, unsigned int base, unsigned int modifier, unsigned int max) { u64 ret = val; if (modifier < 1) return base; if (modifier < base) ret *= modifier; else ret += modifier; return min(ret, (u64)max); } static inline unsigned int __shrink_ple_window(unsigned int val, unsigned int base, unsigned int modifier, unsigned int min) { if (modifier < 1) return base; if (modifier < base) val /= modifier; else val -= modifier; return max(val, min); } #define MSR_IA32_CR_PAT_DEFAULT \ PAT_VALUE(WB, WT, UC_MINUS, UC, WB, WT, UC_MINUS, UC) void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu); int kvm_check_nested_events(struct kvm_vcpu *vcpu); /* Forcibly leave the nested mode in cases like a vCPU reset */ static inline void kvm_leave_nested(struct kvm_vcpu *vcpu) { kvm_x86_ops.nested_ops->leave_nested(vcpu); } static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu) { return vcpu->arch.last_vmentry_cpu != -1; } static inline void kvm_set_mp_state(struct kvm_vcpu *vcpu, int mp_state) { vcpu->arch.mp_state = mp_state; if (mp_state == KVM_MP_STATE_RUNNABLE) vcpu->arch.pv.pv_unhalted = false; } static inline bool kvm_is_exception_pending(struct kvm_vcpu *vcpu) { return vcpu->arch.exception.pending || vcpu->arch.exception_vmexit.pending || kvm_test_request(KVM_REQ_TRIPLE_FAULT, vcpu); } static inline void kvm_clear_exception_queue(struct kvm_vcpu *vcpu) { vcpu->arch.exception.pending = false; vcpu->arch.exception.injected = false; vcpu->arch.exception_vmexit.pending = false; } static inline void kvm_queue_interrupt(struct kvm_vcpu *vcpu, u8 vector, bool soft) { vcpu->arch.interrupt.injected = true; vcpu->arch.interrupt.soft = soft; vcpu->arch.interrupt.nr = vector; } static inline void kvm_clear_interrupt_queue(struct kvm_vcpu *vcpu) { vcpu->arch.interrupt.injected = false; } static inline bool kvm_event_needs_reinjection(struct kvm_vcpu *vcpu) { return vcpu->arch.exception.injected || vcpu->arch.interrupt.injected || vcpu->arch.nmi_injected; } static inline bool kvm_exception_is_soft(unsigned int nr) { return (nr == BP_VECTOR) || (nr == OF_VECTOR); } static inline bool is_protmode(struct kvm_vcpu *vcpu) { return kvm_is_cr0_bit_set(vcpu, X86_CR0_PE); } static inline bool is_long_mode(struct 
kvm_vcpu *vcpu) { #ifdef CONFIG_X86_64 return !!(vcpu->arch.efer & EFER_LMA); #else return false; #endif } static inline bool is_64_bit_mode(struct kvm_vcpu *vcpu) { int cs_db, cs_l; WARN_ON_ONCE(vcpu->arch.guest_state_protected); if (!is_long_mode(vcpu)) return false; kvm_x86_call(get_cs_db_l_bits)(vcpu, &cs_db, &cs_l); return cs_l; } static inline bool is_64_bit_hypercall(struct kvm_vcpu *vcpu) { /* * If running with protected guest state, the CS register is not * accessible. The hypercall register values will have had to be * provided in 64-bit mode, so assume the guest is in 64-bit. */ return vcpu->arch.guest_state_protected || is_64_bit_mode(vcpu); } static inline bool x86_exception_has_error_code(unsigned int vector) { static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) | BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) | BIT(PF_VECTOR) | BIT(AC_VECTOR); return (1U << vector) & exception_has_error_code; } static inline bool mmu_is_nested(struct kvm_vcpu *vcpu) { return vcpu->arch.walk_mmu == &vcpu->arch.nested_mmu; } static inline bool is_pae(struct kvm_vcpu *vcpu) { return kvm_is_cr4_bit_set(vcpu, X86_CR4_PAE); } static inline bool is_pse(struct kvm_vcpu *vcpu) { return kvm_is_cr4_bit_set(vcpu, X86_CR4_PSE); } static inline bool is_paging(struct kvm_vcpu *vcpu) { return likely(kvm_is_cr0_bit_set(vcpu, X86_CR0_PG)); } static inline bool is_pae_paging(struct kvm_vcpu *vcpu) { return !is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu); } static inline u8 vcpu_virt_addr_bits(struct kvm_vcpu *vcpu) { return kvm_is_cr4_bit_set(vcpu, X86_CR4_LA57) ? 57 : 48; } static inline u8 max_host_virt_addr_bits(void) { return kvm_cpu_cap_has(X86_FEATURE_LA57) ? 57 : 48; } /* * x86 MSRs which contain linear addresses, x86 hidden segment bases, and * IDT/GDT bases have static canonicality checks, the size of which depends * only on the CPU's support for 5-level paging, rather than on the state of * CR4.LA57. This applies to both WRMSR and to other instructions that set * their values, e.g. SGDT. * * KVM passes through most of these MSRs and also doesn't intercept the * instructions that set the hidden segment bases. * * Because of this, to be consistent with hardware, even if the guest doesn't * have LA57 enabled in its CPUID, perform canonicality checks based on *host* * support for 5-level paging. * * Finally, instructions related to MMU invalidation of a given linear * address also have a similar static canonical check on the address. * This makes it possible, for example, to invalidate 5-level addresses of a * guest from a host which uses 4-level paging.
*/ static inline bool is_noncanonical_address(u64 la, struct kvm_vcpu *vcpu, unsigned int flags) { if (flags & (X86EMUL_F_INVLPG | X86EMUL_F_MSR | X86EMUL_F_DT_LOAD)) return !__is_canonical_address(la, max_host_virt_addr_bits()); else return !__is_canonical_address(la, vcpu_virt_addr_bits(vcpu)); } static inline bool is_noncanonical_msr_address(u64 la, struct kvm_vcpu *vcpu) { return is_noncanonical_address(la, vcpu, X86EMUL_F_MSR); } static inline bool is_noncanonical_base_address(u64 la, struct kvm_vcpu *vcpu) { return is_noncanonical_address(la, vcpu, X86EMUL_F_DT_LOAD); } static inline bool is_noncanonical_invlpg_address(u64 la, struct kvm_vcpu *vcpu) { return is_noncanonical_address(la, vcpu, X86EMUL_F_INVLPG); } static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn, unsigned access) { u64 gen = kvm_memslots(vcpu->kvm)->generation; if (unlikely(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS)) return; /* * If this is a shadow nested page table, the "GVA" is * actually a nGPA. */ vcpu->arch.mmio_gva = mmu_is_nested(vcpu) ? 0 : gva & PAGE_MASK; vcpu->arch.mmio_access = access; vcpu->arch.mmio_gfn = gfn; vcpu->arch.mmio_gen = gen; } static inline bool vcpu_match_mmio_gen(struct kvm_vcpu *vcpu) { return vcpu->arch.mmio_gen == kvm_memslots(vcpu->kvm)->generation; } /* * Clear the mmio cache info for the given gva. If gva is MMIO_GVA_ANY, we * clear all mmio cache info. */ #define MMIO_GVA_ANY (~(gva_t)0) static inline void vcpu_clear_mmio_info(struct kvm_vcpu *vcpu, gva_t gva) { if (gva != MMIO_GVA_ANY && vcpu->arch.mmio_gva != (gva & PAGE_MASK)) return; vcpu->arch.mmio_gva = 0; } static inline bool vcpu_match_mmio_gva(struct kvm_vcpu *vcpu, unsigned long gva) { if (vcpu_match_mmio_gen(vcpu) && vcpu->arch.mmio_gva && vcpu->arch.mmio_gva == (gva & PAGE_MASK)) return true; return false; } static inline bool vcpu_match_mmio_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) { if (vcpu_match_mmio_gen(vcpu) && vcpu->arch.mmio_gfn && vcpu->arch.mmio_gfn == gpa >> PAGE_SHIFT) return true; return false; } static inline unsigned long kvm_register_read(struct kvm_vcpu *vcpu, int reg) { unsigned long val = kvm_register_read_raw(vcpu, reg); return is_64_bit_mode(vcpu) ? 
val : (u32)val; } static inline void kvm_register_write(struct kvm_vcpu *vcpu, int reg, unsigned long val) { if (!is_64_bit_mode(vcpu)) val = (u32)val; return kvm_register_write_raw(vcpu, reg, val); } static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk) { return !(kvm->arch.disabled_quirks & quirk); } void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip); u64 get_kvmclock_ns(struct kvm *kvm); uint64_t kvm_get_wall_clock_epoch(struct kvm *kvm); bool kvm_get_monotonic_and_clockread(s64 *kernel_ns, u64 *tsc_timestamp); int kvm_guest_time_update(struct kvm_vcpu *v); int kvm_read_guest_virt(struct kvm_vcpu *vcpu, gva_t addr, void *val, unsigned int bytes, struct x86_exception *exception); int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val, unsigned int bytes, struct x86_exception *exception); int handle_ud(struct kvm_vcpu *vcpu); void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu, struct kvm_queued_exception *ex); int kvm_mtrr_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data); int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata); bool kvm_vector_hashing_enabled(void); void kvm_fixup_and_inject_pf_error(struct kvm_vcpu *vcpu, gva_t gva, u16 error_code); int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type, void *insn, int insn_len); int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, int emulation_type, void *insn, int insn_len); fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu); fastpath_t handle_fastpath_hlt(struct kvm_vcpu *vcpu); extern struct kvm_caps kvm_caps; extern struct kvm_host_values kvm_host; extern bool enable_pmu; /* * Get a filtered version of KVM's supported XCR0 that strips out dynamic * features for which the current process doesn't (yet) have permission to use. * This is intended to be used only when enumerating support to userspace, * e.g. in KVM_GET_SUPPORTED_CPUID and KVM_CAP_XSAVE2, it does NOT need to be * used to check/restrict guest behavior as KVM rejects KVM_SET_CPUID{2} if * userspace attempts to enable unpermitted features. */ static inline u64 kvm_get_filtered_xcr0(void) { u64 permitted_xcr0 = kvm_caps.supported_xcr0; BUILD_BUG_ON(XFEATURE_MASK_USER_DYNAMIC != XFEATURE_MASK_XTILE_DATA); if (permitted_xcr0 & XFEATURE_MASK_USER_DYNAMIC) { permitted_xcr0 &= xstate_get_guest_group_perm(); /* * Treat XTILE_CFG as unsupported if the current process isn't * allowed to use XTILE_DATA, as attempting to set XTILE_CFG in * XCR0 without setting XTILE_DATA is architecturally illegal. 
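 * (The two XTILE bits are thus all-or-nothing: the masking below clears * XTILE_CFG whenever XTILE_DATA is not permitted.)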
*/ if (!(permitted_xcr0 & XFEATURE_MASK_XTILE_DATA)) permitted_xcr0 &= ~XFEATURE_MASK_XTILE_CFG; } return permitted_xcr0; } static inline bool kvm_mpx_supported(void) { return (kvm_caps.supported_xcr0 & (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR)) == (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR); } extern unsigned int min_timer_period_us; extern bool enable_vmware_backdoor; extern int pi_inject_timer; extern bool report_ignored_msrs; extern bool eager_page_split; static inline void kvm_pr_unimpl_wrmsr(struct kvm_vcpu *vcpu, u32 msr, u64 data) { if (report_ignored_msrs) vcpu_unimpl(vcpu, "Unhandled WRMSR(0x%x) = 0x%llx\n", msr, data); } static inline void kvm_pr_unimpl_rdmsr(struct kvm_vcpu *vcpu, u32 msr) { if (report_ignored_msrs) vcpu_unimpl(vcpu, "Unhandled RDMSR(0x%x)\n", msr); } static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec) { return pvclock_scale_delta(nsec, vcpu->arch.virtual_tsc_mult, vcpu->arch.virtual_tsc_shift); } /* Same "calling convention" as do_div: * - divide (n << 32) by base * - put result in n * - return remainder */ #define do_shl32_div32(n, base) \ ({ \ u32 __quot, __rem; \ asm("divl %2" : "=a" (__quot), "=d" (__rem) \ : "rm" (base), "0" (0), "1" ((u32) n)); \ n = __quot; \ __rem; \ }) static inline bool kvm_mwait_in_guest(struct kvm *kvm) { return kvm->arch.mwait_in_guest; } static inline bool kvm_hlt_in_guest(struct kvm *kvm) { return kvm->arch.hlt_in_guest; } static inline bool kvm_pause_in_guest(struct kvm *kvm) { return kvm->arch.pause_in_guest; } static inline bool kvm_cstate_in_guest(struct kvm *kvm) { return kvm->arch.cstate_in_guest; } static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm) { return kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_ENABLED; } static __always_inline void kvm_before_interrupt(struct kvm_vcpu *vcpu, enum kvm_intr_type intr) { WRITE_ONCE(vcpu->arch.handling_intr_from_guest, (u8)intr); } static __always_inline void kvm_after_interrupt(struct kvm_vcpu *vcpu) { WRITE_ONCE(vcpu->arch.handling_intr_from_guest, 0); } static inline bool kvm_handling_nmi_from_guest(struct kvm_vcpu *vcpu) { return vcpu->arch.handling_intr_from_guest == KVM_HANDLING_NMI; } static inline bool kvm_pat_valid(u64 data) { if (data & 0xF8F8F8F8F8F8F8F8ull) return false; /* 0, 1, 4, 5, 6, 7 are valid values. */ return (data | ((data & 0x0202020202020202ull) << 1)) == data; } static inline bool kvm_dr7_valid(u64 data) { /* Bits [63:32] are reserved */ return !(data >> 32); } static inline bool kvm_dr6_valid(u64 data) { /* Bits [63:32] are reserved */ return !(data >> 32); } /* * Trigger machine check on the host. We assume all the MSRs are already set up * by the CPU and that we still run on the same CPU as the MCE occurred on. * We pass a fake environment to the machine check handler because we want * the guest to be always treated like user space, no matter what context * it used internally. 
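 * (That is why the fake pt_regs below hard-codes .cs = 3, a ring-3 * selector, regardless of the mode the guest was actually in.)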
*/ static inline void kvm_machine_check(void) { #if defined(CONFIG_X86_MCE) struct pt_regs regs = { .cs = 3, /* Fake ring 3 no matter what the guest ran on */ .flags = X86_EFLAGS_IF, }; do_machine_check(&regs); #endif } void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu); void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu); int kvm_spec_ctrl_test_value(u64 value); int kvm_handle_memory_failure(struct kvm_vcpu *vcpu, int r, struct x86_exception *e); int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva); bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type); enum kvm_msr_access { MSR_TYPE_R = BIT(0), MSR_TYPE_W = BIT(1), MSR_TYPE_RW = MSR_TYPE_R | MSR_TYPE_W, }; /* * Internal error codes that are used to indicate that MSR emulation encountered * an error that should result in #GP in the guest, unless userspace handles it. * Note, '1', '0', and negative numbers are off limits, as they are used by KVM * as part of KVM's lightly documented internal KVM_RUN return codes. * * UNSUPPORTED - The MSR isn't supported, either because it is completely * unknown to KVM, or because the MSR should not exist according * to the vCPU model. * * FILTERED - Access to the MSR is denied by a userspace MSR filter. */ #define KVM_MSR_RET_UNSUPPORTED 2 #define KVM_MSR_RET_FILTERED 3 static inline bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { return !(cr4 & vcpu->arch.cr4_guest_rsvd_bits); } #define __cr4_reserved_bits(__cpu_has, __c) \ ({ \ u64 __reserved_bits = CR4_RESERVED_BITS; \ \ if (!__cpu_has(__c, X86_FEATURE_XSAVE)) \ __reserved_bits |= X86_CR4_OSXSAVE; \ if (!__cpu_has(__c, X86_FEATURE_SMEP)) \ __reserved_bits |= X86_CR4_SMEP; \ if (!__cpu_has(__c, X86_FEATURE_SMAP)) \ __reserved_bits |= X86_CR4_SMAP; \ if (!__cpu_has(__c, X86_FEATURE_FSGSBASE)) \ __reserved_bits |= X86_CR4_FSGSBASE; \ if (!__cpu_has(__c, X86_FEATURE_PKU)) \ __reserved_bits |= X86_CR4_PKE; \ if (!__cpu_has(__c, X86_FEATURE_LA57)) \ __reserved_bits |= X86_CR4_LA57; \ if (!__cpu_has(__c, X86_FEATURE_UMIP)) \ __reserved_bits |= X86_CR4_UMIP; \ if (!__cpu_has(__c, X86_FEATURE_VMX)) \ __reserved_bits |= X86_CR4_VMXE; \ if (!__cpu_has(__c, X86_FEATURE_PCID)) \ __reserved_bits |= X86_CR4_PCIDE; \ if (!__cpu_has(__c, X86_FEATURE_LAM)) \ __reserved_bits |= X86_CR4_LAM_SUP; \ __reserved_bits; \ }) int kvm_sev_es_mmio_write(struct kvm_vcpu *vcpu, gpa_t src, unsigned int bytes, void *dst); int kvm_sev_es_mmio_read(struct kvm_vcpu *vcpu, gpa_t src, unsigned int bytes, void *dst); int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size, unsigned int port, void *data, unsigned int count, int in); static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr) { return kvm->arch.hypercall_exit_enabled & BIT(hc_nr); } int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, unsigned long a0, unsigned long a1, unsigned long a2, unsigned long a3, int op_64_bit, int cpl, int (*complete_hypercall)(struct kvm_vcpu *)); #define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall) \ ({ \ int __ret; \ \ __ret = ____kvm_emulate_hypercall(_vcpu, \ kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu), \ kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu), \ kvm_##a3##_read(_vcpu), op_64_bit, cpl, \ complete_hypercall); \ \ if (__ret > 0) \ __ret = complete_hypercall(_vcpu); \ __ret; \ }) int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); #endif
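/*
 * A minimal user-space sketch of the __grow_ple_window() helper above,
 * showing how the modifier acts multiplicatively when it is smaller than
 * the base window and additively otherwise; grow_ple_window() and
 * ple_selftest() are illustrative names only.
 */
#include <assert.h>
#include <stdint.h>

static unsigned int grow_ple_window(unsigned int val, unsigned int base,
				    unsigned int modifier, unsigned int max)
{
	uint64_t ret = val;

	if (modifier < 1)
		return base;		/* modifier 0: reset to the base window */
	if (modifier < base)
		ret *= modifier;	/* small modifier: geometric growth */
	else
		ret += modifier;	/* large modifier: additive growth */
	return ret < max ? (unsigned int)ret : max;
}

static void ple_selftest(void)
{
	/* VMX defaults from the header: window 4096, grow modifier 2 */
	assert(grow_ple_window(4096, 4096, 2, UINT32_MAX) == 8192);
	assert(grow_ple_window(8192, 4096, 0, UINT32_MAX) == 4096);
}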
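/*
 * A user-space restatement of the bit trick in kvm_pat_valid() above,
 * which accepts exactly the PAT entry values 0, 1, 4, 5, 6 and 7: each of
 * the eight PAT entries is one byte with only its low three bits used,
 * and the invalid encodings 2 (010b) and 3 (011b) are the only ones with
 * bit 1 set while bit 2 is clear. pat_entry_valid() is a hypothetical
 * per-entry version of the same check, for illustration.
 */
#include <assert.h>
#include <stdint.h>

static int pat_entry_valid(uint8_t e)
{
	if (e & 0xf8)			/* bits 7:3 must be zero */
		return 0;
	/* If bit 1 is set, bit 2 must be set too; this rejects 2 and 3
	 * while keeping 6 (110b) and 7 (111b). */
	return (e | ((e & 0x02) << 1)) == e;
}

static void pat_selftest(void)
{
	assert(pat_entry_valid(0) && pat_entry_valid(1));
	assert(!pat_entry_valid(2) && !pat_entry_valid(3));
	assert(pat_entry_valid(4) && pat_entry_valid(5));
	assert(pat_entry_valid(6) && pat_entry_valid(7));
}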
// SPDX-License-Identifier: GPL-2.0-only /* * Input driver to ExplorerPS/2 device driver module. * * Copyright (c) 1999-2002 Vojtech Pavlik * Copyright (c) 2004 Dmitry Torokhov */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #define MOUSEDEV_MINOR_BASE 32 #define MOUSEDEV_MINORS 31 #define MOUSEDEV_MIX 63 #include <linux/bitops.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/poll.h> #include <linux/module.h> #include <linux/init.h> #include <linux/input.h> #include <linux/random.h> #include <linux/major.h> #include <linux/device.h> #include <linux/cdev.h> #include <linux/kernel.h> MODULE_AUTHOR("Vojtech Pavlik <vojtech@ucw.cz>"); MODULE_DESCRIPTION("Mouse (ExplorerPS/2) device interfaces"); MODULE_LICENSE("GPL"); #ifndef CONFIG_INPUT_MOUSEDEV_SCREEN_X #define CONFIG_INPUT_MOUSEDEV_SCREEN_X 1024 #endif #ifndef CONFIG_INPUT_MOUSEDEV_SCREEN_Y #define CONFIG_INPUT_MOUSEDEV_SCREEN_Y 768 #endif static int xres = CONFIG_INPUT_MOUSEDEV_SCREEN_X; module_param(xres, uint, 0644); MODULE_PARM_DESC(xres, "Horizontal screen resolution"); static int yres = CONFIG_INPUT_MOUSEDEV_SCREEN_Y; module_param(yres, uint, 0644); MODULE_PARM_DESC(yres, "Vertical screen resolution"); static unsigned tap_time = 200; module_param(tap_time, uint, 0644); MODULE_PARM_DESC(tap_time, "Tap time for touchpads in absolute mode (msecs)"); struct mousedev_hw_data { int dx, dy, dz; int x, y; int abs_event; unsigned long buttons; }; struct mousedev { int open; struct input_handle handle; wait_queue_head_t wait; struct list_head client_list; spinlock_t client_lock; /* protects client_list */ struct mutex mutex; struct device dev; struct cdev cdev; bool exist; struct list_head mixdev_node; bool opened_by_mixdev; struct mousedev_hw_data packet; unsigned int pkt_count; int old_x[4], old_y[4]; int frac_dx, frac_dy; unsigned long touch; int (*open_device)(struct mousedev *mousedev); void (*close_device)(struct mousedev *mousedev); }; enum mousedev_emul { MOUSEDEV_EMUL_PS2, MOUSEDEV_EMUL_IMPS, MOUSEDEV_EMUL_EXPS }; struct mousedev_motion { int dx, dy, dz; unsigned long buttons; }; #define PACKET_QUEUE_LEN 16 struct mousedev_client { struct fasync_struct *fasync; struct mousedev *mousedev; struct list_head node; struct mousedev_motion packets[PACKET_QUEUE_LEN]; unsigned int head, tail; spinlock_t packet_lock; int pos_x, pos_y; u8
ps2[6]; unsigned char ready, buffer, bufsiz; unsigned char imexseq, impsseq; enum mousedev_emul mode; unsigned long last_buttons; }; #define MOUSEDEV_SEQ_LEN 6 static unsigned char mousedev_imps_seq[] = { 0xf3, 200, 0xf3, 100, 0xf3, 80 }; static unsigned char mousedev_imex_seq[] = { 0xf3, 200, 0xf3, 200, 0xf3, 80 }; static struct mousedev *mousedev_mix; static LIST_HEAD(mousedev_mix_list); #define fx(i) (mousedev->old_x[(mousedev->pkt_count - (i)) & 03]) #define fy(i) (mousedev->old_y[(mousedev->pkt_count - (i)) & 03]) static void mousedev_touchpad_event(struct input_dev *dev, struct mousedev *mousedev, unsigned int code, int value) { int size, tmp; enum { FRACTION_DENOM = 128 }; switch (code) { case ABS_X: fx(0) = value; if (mousedev->touch && mousedev->pkt_count >= 2) { size = input_abs_get_max(dev, ABS_X) - input_abs_get_min(dev, ABS_X); if (size == 0) size = 256 * 2; tmp = ((value - fx(2)) * 256 * FRACTION_DENOM) / size; tmp += mousedev->frac_dx; mousedev->packet.dx = tmp / FRACTION_DENOM; mousedev->frac_dx = tmp - mousedev->packet.dx * FRACTION_DENOM; } break; case ABS_Y: fy(0) = value; if (mousedev->touch && mousedev->pkt_count >= 2) { /* use X size for ABS_Y to keep the same scale */ size = input_abs_get_max(dev, ABS_X) - input_abs_get_min(dev, ABS_X); if (size == 0) size = 256 * 2; tmp = -((value - fy(2)) * 256 * FRACTION_DENOM) / size; tmp += mousedev->frac_dy; mousedev->packet.dy = tmp / FRACTION_DENOM; mousedev->frac_dy = tmp - mousedev->packet.dy * FRACTION_DENOM; } break; } } static void mousedev_abs_event(struct input_dev *dev, struct mousedev *mousedev, unsigned int code, int value) { int min, max, size; switch (code) { case ABS_X: min = input_abs_get_min(dev, ABS_X); max = input_abs_get_max(dev, ABS_X); size = max - min; if (size == 0) size = xres ? : 1; value = clamp(value, min, max); mousedev->packet.x = ((value - min) * xres) / size; mousedev->packet.abs_event = 1; break; case ABS_Y: min = input_abs_get_min(dev, ABS_Y); max = input_abs_get_max(dev, ABS_Y); size = max - min; if (size == 0) size = yres ? 
: 1; value = clamp(value, min, max); mousedev->packet.y = yres - ((value - min) * yres) / size; mousedev->packet.abs_event = 1; break; } } static void mousedev_rel_event(struct mousedev *mousedev, unsigned int code, int value) { switch (code) { case REL_X: mousedev->packet.dx += value; break; case REL_Y: mousedev->packet.dy -= value; break; case REL_WHEEL: mousedev->packet.dz -= value; break; } } static void mousedev_key_event(struct mousedev *mousedev, unsigned int code, int value) { int index; switch (code) { case BTN_TOUCH: case BTN_0: case BTN_LEFT: index = 0; break; case BTN_STYLUS: case BTN_1: case BTN_RIGHT: index = 1; break; case BTN_2: case BTN_FORWARD: case BTN_STYLUS2: case BTN_MIDDLE: index = 2; break; case BTN_3: case BTN_BACK: case BTN_SIDE: index = 3; break; case BTN_4: case BTN_EXTRA: index = 4; break; default: return; } if (value) { set_bit(index, &mousedev->packet.buttons); set_bit(index, &mousedev_mix->packet.buttons); } else { clear_bit(index, &mousedev->packet.buttons); clear_bit(index, &mousedev_mix->packet.buttons); } } static void mousedev_notify_readers(struct mousedev *mousedev, struct mousedev_hw_data *packet) { struct mousedev_client *client; struct mousedev_motion *p; unsigned int new_head; int wake_readers = 0; rcu_read_lock(); list_for_each_entry_rcu(client, &mousedev->client_list, node) { /* Just acquire the lock, interrupts already disabled */ spin_lock(&client->packet_lock); p = &client->packets[client->head]; if (client->ready && p->buttons != mousedev->packet.buttons) { new_head = (client->head + 1) % PACKET_QUEUE_LEN; if (new_head != client->tail) { p = &client->packets[client->head = new_head]; memset(p, 0, sizeof(struct mousedev_motion)); } } if (packet->abs_event) { p->dx += packet->x - client->pos_x; p->dy += packet->y - client->pos_y; client->pos_x = packet->x; client->pos_y = packet->y; } client->pos_x += packet->dx; client->pos_x = clamp_val(client->pos_x, 0, xres); client->pos_y += packet->dy; client->pos_y = clamp_val(client->pos_y, 0, yres); p->dx += packet->dx; p->dy += packet->dy; p->dz += packet->dz; p->buttons = mousedev->packet.buttons; if (p->dx || p->dy || p->dz || p->buttons != client->last_buttons) client->ready = 1; spin_unlock(&client->packet_lock); if (client->ready) { kill_fasync(&client->fasync, SIGIO, POLL_IN); wake_readers = 1; } } rcu_read_unlock(); if (wake_readers) wake_up_interruptible(&mousedev->wait); } static void mousedev_touchpad_touch(struct mousedev *mousedev, int value) { if (!value) { if (mousedev->touch && time_before(jiffies, mousedev->touch + msecs_to_jiffies(tap_time))) { /* * Toggle left button to emulate tap. * We rely on the fact that mousedev_mix always has 0 * motion packet so we won't mess current position. 
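 * (The emulated click below is delivered by setting bit 0, notifying * both this device's readers and the mix device's readers, and then * clearing the bit again.)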
*/ set_bit(0, &mousedev->packet.buttons); set_bit(0, &mousedev_mix->packet.buttons); mousedev_notify_readers(mousedev, &mousedev_mix->packet); mousedev_notify_readers(mousedev_mix, &mousedev_mix->packet); clear_bit(0, &mousedev->packet.buttons); clear_bit(0, &mousedev_mix->packet.buttons); } mousedev->touch = mousedev->pkt_count = 0; mousedev->frac_dx = 0; mousedev->frac_dy = 0; } else if (!mousedev->touch) mousedev->touch = jiffies; } static void mousedev_event(struct input_handle *handle, unsigned int type, unsigned int code, int value) { struct mousedev *mousedev = handle->private; switch (type) { case EV_ABS: /* Ignore joysticks */ if (test_bit(BTN_TRIGGER, handle->dev->keybit)) return; if (test_bit(BTN_TOOL_FINGER, handle->dev->keybit)) mousedev_touchpad_event(handle->dev, mousedev, code, value); else mousedev_abs_event(handle->dev, mousedev, code, value); break; case EV_REL: mousedev_rel_event(mousedev, code, value); break; case EV_KEY: if (value != 2) { if (code == BTN_TOUCH && test_bit(BTN_TOOL_FINGER, handle->dev->keybit)) mousedev_touchpad_touch(mousedev, value); else mousedev_key_event(mousedev, code, value); } break; case EV_SYN: if (code == SYN_REPORT) { if (mousedev->touch) { mousedev->pkt_count++; /* * Input system eats duplicate events, * but we need all of them to do correct * averaging so apply present one forward */ fx(0) = fx(1); fy(0) = fy(1); } mousedev_notify_readers(mousedev, &mousedev->packet); mousedev_notify_readers(mousedev_mix, &mousedev->packet); mousedev->packet.dx = mousedev->packet.dy = mousedev->packet.dz = 0; mousedev->packet.abs_event = 0; } break; } } static int mousedev_fasync(int fd, struct file *file, int on) { struct mousedev_client *client = file->private_data; return fasync_helper(fd, file, on, &client->fasync); } static void mousedev_free(struct device *dev) { struct mousedev *mousedev = container_of(dev, struct mousedev, dev); input_put_device(mousedev->handle.dev); kfree(mousedev); } static int mousedev_open_device(struct mousedev *mousedev) { int retval; retval = mutex_lock_interruptible(&mousedev->mutex); if (retval) return retval; if (!mousedev->exist) retval = -ENODEV; else if (!mousedev->open++) { retval = input_open_device(&mousedev->handle); if (retval) mousedev->open--; } mutex_unlock(&mousedev->mutex); return retval; } static void mousedev_close_device(struct mousedev *mousedev) { mutex_lock(&mousedev->mutex); if (mousedev->exist && !--mousedev->open) input_close_device(&mousedev->handle); mutex_unlock(&mousedev->mutex); } /* * Open all available devices so they can all be multiplexed in one * stream. Note that this function is called with mousedev_mix->mutex * held. */ static int mixdev_open_devices(struct mousedev *mixdev) { int error; error = mutex_lock_interruptible(&mixdev->mutex); if (error) return error; if (!mixdev->open++) { struct mousedev *mousedev; list_for_each_entry(mousedev, &mousedev_mix_list, mixdev_node) { if (!mousedev->opened_by_mixdev) { if (mousedev_open_device(mousedev)) continue; mousedev->opened_by_mixdev = true; } } } mutex_unlock(&mixdev->mutex); return 0; } /* * Close all devices that were opened as part of multiplexed * device. Note that this function is called with mousedev_mix->mutex * held.
*/ static void mixdev_close_devices(struct mousedev *mixdev) { mutex_lock(&mixdev->mutex); if (!--mixdev->open) { struct mousedev *mousedev; list_for_each_entry(mousedev, &mousedev_mix_list, mixdev_node) { if (mousedev->opened_by_mixdev) { mousedev->opened_by_mixdev = false; mousedev_close_device(mousedev); } } } mutex_unlock(&mixdev->mutex); } static void mousedev_attach_client(struct mousedev *mousedev, struct mousedev_client *client) { spin_lock(&mousedev->client_lock); list_add_tail_rcu(&client->node, &mousedev->client_list); spin_unlock(&mousedev->client_lock); } static void mousedev_detach_client(struct mousedev *mousedev, struct mousedev_client *client) { spin_lock(&mousedev->client_lock); list_del_rcu(&client->node); spin_unlock(&mousedev->client_lock); synchronize_rcu(); } static int mousedev_release(struct inode *inode, struct file *file) { struct mousedev_client *client = file->private_data; struct mousedev *mousedev = client->mousedev; mousedev_detach_client(mousedev, client); kfree(client); mousedev->close_device(mousedev); return 0; } static int mousedev_open(struct inode *inode, struct file *file) { struct mousedev_client *client; struct mousedev *mousedev; int error; #ifdef CONFIG_INPUT_MOUSEDEV_PSAUX if (imajor(inode) == MISC_MAJOR) mousedev = mousedev_mix; else #endif mousedev = container_of(inode->i_cdev, struct mousedev, cdev); client = kzalloc(sizeof(struct mousedev_client), GFP_KERNEL); if (!client) return -ENOMEM; spin_lock_init(&client->packet_lock); client->pos_x = xres / 2; client->pos_y = yres / 2; client->mousedev = mousedev; mousedev_attach_client(mousedev, client); error = mousedev->open_device(mousedev); if (error) goto err_free_client; file->private_data = client; stream_open(inode, file); return 0; err_free_client: mousedev_detach_client(mousedev, client); kfree(client); return error; } static void mousedev_packet(struct mousedev_client *client, u8 *ps2_data) { struct mousedev_motion *p = &client->packets[client->tail]; s8 dx, dy, dz; dx = clamp_val(p->dx, -127, 127); p->dx -= dx; dy = clamp_val(p->dy, -127, 127); p->dy -= dy; ps2_data[0] = BIT(3); ps2_data[0] |= ((dx & BIT(7)) >> 3) | ((dy & BIT(7)) >> 2); ps2_data[0] |= p->buttons & 0x07; ps2_data[1] = dx; ps2_data[2] = dy; switch (client->mode) { case MOUSEDEV_EMUL_EXPS: dz = clamp_val(p->dz, -7, 7); p->dz -= dz; ps2_data[3] = (dz & 0x0f) | ((p->buttons & 0x18) << 1); client->bufsiz = 4; break; case MOUSEDEV_EMUL_IMPS: dz = clamp_val(p->dz, -127, 127); p->dz -= dz; ps2_data[0] |= ((p->buttons & 0x10) >> 3) | ((p->buttons & 0x08) >> 1); ps2_data[3] = dz; client->bufsiz = 4; break; case MOUSEDEV_EMUL_PS2: default: p->dz = 0; ps2_data[0] |= ((p->buttons & 0x10) >> 3) | ((p->buttons & 0x08) >> 1); client->bufsiz = 3; break; } if (!p->dx && !p->dy && !p->dz) { if (client->tail == client->head) { client->ready = 0; client->last_buttons = p->buttons; } else client->tail = (client->tail + 1) % PACKET_QUEUE_LEN; } } static void mousedev_generate_response(struct mousedev_client *client, int command) { client->ps2[0] = 0xfa; /* ACK */ switch (command) { case 0xeb: /* Poll */ mousedev_packet(client, &client->ps2[1]); client->bufsiz++; /* account for leading ACK */ break; case 0xf2: /* Get ID */ switch (client->mode) { case MOUSEDEV_EMUL_PS2: client->ps2[1] = 0; break; case MOUSEDEV_EMUL_IMPS: client->ps2[1] = 3; break; case MOUSEDEV_EMUL_EXPS: client->ps2[1] = 4; break; } client->bufsiz = 2; break; case 0xe9: /* Get info */ client->ps2[1] = 0x60; client->ps2[2] = 3; client->ps2[3] = 200; client->bufsiz = 4; break; 
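/* The 0xff (Reset) case below replies with ACK plus the standard PS/2 * self-test pass byte 0xaa and device id 0x00, and drops the client * back to bare PS/2 mode. */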
case 0xff: /* Reset */ client->impsseq = client->imexseq = 0; client->mode = MOUSEDEV_EMUL_PS2; client->ps2[1] = 0xaa; client->ps2[2] = 0x00; client->bufsiz = 3; break; default: client->bufsiz = 1; break; } client->buffer = client->bufsiz; } static ssize_t mousedev_write(struct file *file, const char __user *buffer, size_t count, loff_t *ppos) { struct mousedev_client *client = file->private_data; unsigned char c; unsigned int i; for (i = 0; i < count; i++) { if (get_user(c, buffer + i)) return -EFAULT; spin_lock_irq(&client->packet_lock); if (c == mousedev_imex_seq[client->imexseq]) { if (++client->imexseq == MOUSEDEV_SEQ_LEN) { client->imexseq = 0; client->mode = MOUSEDEV_EMUL_EXPS; } } else client->imexseq = 0; if (c == mousedev_imps_seq[client->impsseq]) { if (++client->impsseq == MOUSEDEV_SEQ_LEN) { client->impsseq = 0; client->mode = MOUSEDEV_EMUL_IMPS; } } else client->impsseq = 0; mousedev_generate_response(client, c); spin_unlock_irq(&client->packet_lock); cond_resched(); } kill_fasync(&client->fasync, SIGIO, POLL_IN); wake_up_interruptible(&client->mousedev->wait); return count; } static ssize_t mousedev_read(struct file *file, char __user *buffer, size_t count, loff_t *ppos) { struct mousedev_client *client = file->private_data; struct mousedev *mousedev = client->mousedev; u8 data[sizeof(client->ps2)]; int retval = 0; if (!client->ready && !client->buffer && mousedev->exist && (file->f_flags & O_NONBLOCK)) return -EAGAIN; retval = wait_event_interruptible(mousedev->wait, !mousedev->exist || client->ready || client->buffer); if (retval) return retval; if (!mousedev->exist) return -ENODEV; spin_lock_irq(&client->packet_lock); if (!client->buffer && client->ready) { mousedev_packet(client, client->ps2); client->buffer = client->bufsiz; } if (count > client->buffer) count = client->buffer; memcpy(data, client->ps2 + client->bufsiz - client->buffer, count); client->buffer -= count; spin_unlock_irq(&client->packet_lock); if (copy_to_user(buffer, data, count)) return -EFAULT; return count; } /* No kernel lock - fine */ static __poll_t mousedev_poll(struct file *file, poll_table *wait) { struct mousedev_client *client = file->private_data; struct mousedev *mousedev = client->mousedev; __poll_t mask; poll_wait(file, &mousedev->wait, wait); mask = mousedev->exist ? EPOLLOUT | EPOLLWRNORM : EPOLLHUP | EPOLLERR; if (client->ready || client->buffer) mask |= EPOLLIN | EPOLLRDNORM; return mask; } static const struct file_operations mousedev_fops = { .owner = THIS_MODULE, .read = mousedev_read, .write = mousedev_write, .poll = mousedev_poll, .open = mousedev_open, .release = mousedev_release, .fasync = mousedev_fasync, .llseek = noop_llseek, }; /* * Mark device non-existent. This disables writes, ioctls and * prevents new users from opening the device. Already posted * blocking reads will stay, however new ones will fail. */ static void mousedev_mark_dead(struct mousedev *mousedev) { mutex_lock(&mousedev->mutex); mousedev->exist = false; mutex_unlock(&mousedev->mutex); } /* * Wake up users waiting for IO so they can disconnect from * dead device. 
*/ static void mousedev_hangup(struct mousedev *mousedev) { struct mousedev_client *client; spin_lock(&mousedev->client_lock); list_for_each_entry(client, &mousedev->client_list, node) kill_fasync(&client->fasync, SIGIO, POLL_HUP); spin_unlock(&mousedev->client_lock); wake_up_interruptible(&mousedev->wait); } static void mousedev_cleanup(struct mousedev *mousedev) { struct input_handle *handle = &mousedev->handle; mousedev_mark_dead(mousedev); mousedev_hangup(mousedev); /* mousedev is marked dead so no one else accesses mousedev->open */ if (mousedev->open) input_close_device(handle); } static int mousedev_reserve_minor(bool mixdev) { int minor; if (mixdev) { minor = input_get_new_minor(MOUSEDEV_MIX, 1, false); if (minor < 0) pr_err("failed to reserve mixdev minor: %d\n", minor); } else { minor = input_get_new_minor(MOUSEDEV_MINOR_BASE, MOUSEDEV_MINORS, true); if (minor < 0) pr_err("failed to reserve new minor: %d\n", minor); } return minor; } static struct mousedev *mousedev_create(struct input_dev *dev, struct input_handler *handler, bool mixdev) { struct mousedev *mousedev; int minor; int error; minor = mousedev_reserve_minor(mixdev); if (minor < 0) { error = minor; goto err_out; } mousedev = kzalloc(sizeof(struct mousedev), GFP_KERNEL); if (!mousedev) { error = -ENOMEM; goto err_free_minor; } INIT_LIST_HEAD(&mousedev->client_list); INIT_LIST_HEAD(&mousedev->mixdev_node); spin_lock_init(&mousedev->client_lock); mutex_init(&mousedev->mutex); lockdep_set_subclass(&mousedev->mutex, mixdev ? SINGLE_DEPTH_NESTING : 0); init_waitqueue_head(&mousedev->wait); if (mixdev) { dev_set_name(&mousedev->dev, "mice"); mousedev->open_device = mixdev_open_devices; mousedev->close_device = mixdev_close_devices; } else { int dev_no = minor; /* Normalize device number if it falls into legacy range */ if (dev_no < MOUSEDEV_MINOR_BASE + MOUSEDEV_MINORS) dev_no -= MOUSEDEV_MINOR_BASE; dev_set_name(&mousedev->dev, "mouse%d", dev_no); mousedev->open_device = mousedev_open_device; mousedev->close_device = mousedev_close_device; } mousedev->exist = true; mousedev->handle.dev = input_get_device(dev); mousedev->handle.name = dev_name(&mousedev->dev); mousedev->handle.handler = handler; mousedev->handle.private = mousedev; mousedev->dev.class = &input_class; if (dev) mousedev->dev.parent = &dev->dev; mousedev->dev.devt = MKDEV(INPUT_MAJOR, minor); mousedev->dev.release = mousedev_free; device_initialize(&mousedev->dev); if (!mixdev) { error = input_register_handle(&mousedev->handle); if (error) goto err_free_mousedev; } cdev_init(&mousedev->cdev, &mousedev_fops); error = cdev_device_add(&mousedev->cdev, &mousedev->dev); if (error) goto err_cleanup_mousedev; return mousedev; err_cleanup_mousedev: mousedev_cleanup(mousedev); if (!mixdev) input_unregister_handle(&mousedev->handle); err_free_mousedev: put_device(&mousedev->dev); err_free_minor: input_free_minor(minor); err_out: return ERR_PTR(error); } static void mousedev_destroy(struct mousedev *mousedev) { cdev_device_del(&mousedev->cdev, &mousedev->dev); mousedev_cleanup(mousedev); input_free_minor(MINOR(mousedev->dev.devt)); if (mousedev != mousedev_mix) input_unregister_handle(&mousedev->handle); put_device(&mousedev->dev); } static int mixdev_add_device(struct mousedev *mousedev) { int retval; retval = mutex_lock_interruptible(&mousedev_mix->mutex); if (retval) return retval; if (mousedev_mix->open) { retval = mousedev_open_device(mousedev); if (retval) goto out; mousedev->opened_by_mixdev = true; } get_device(&mousedev->dev); list_add_tail(&mousedev->mixdev_node, 
&mousedev_mix_list); out: mutex_unlock(&mousedev_mix->mutex); return retval; } static void mixdev_remove_device(struct mousedev *mousedev) { mutex_lock(&mousedev_mix->mutex); if (mousedev->opened_by_mixdev) { mousedev->opened_by_mixdev = false; mousedev_close_device(mousedev); } list_del_init(&mousedev->mixdev_node); mutex_unlock(&mousedev_mix->mutex); put_device(&mousedev->dev); } static int mousedev_connect(struct input_handler *handler, struct input_dev *dev, const struct input_device_id *id) { struct mousedev *mousedev; int error; mousedev = mousedev_create(dev, handler, false); if (IS_ERR(mousedev)) return PTR_ERR(mousedev); error = mixdev_add_device(mousedev); if (error) { mousedev_destroy(mousedev); return error; } return 0; } static void mousedev_disconnect(struct input_handle *handle) { struct mousedev *mousedev = handle->private; mixdev_remove_device(mousedev); mousedev_destroy(mousedev); } static const struct input_device_id mousedev_ids[] = { { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_KEYBIT | INPUT_DEVICE_ID_MATCH_RELBIT, .evbit = { BIT_MASK(EV_KEY) | BIT_MASK(EV_REL) }, .keybit = { [BIT_WORD(BTN_LEFT)] = BIT_MASK(BTN_LEFT) }, .relbit = { BIT_MASK(REL_X) | BIT_MASK(REL_Y) }, }, /* A mouse like device, at least one button, two relative axes */ { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_RELBIT, .evbit = { BIT_MASK(EV_KEY) | BIT_MASK(EV_REL) }, .relbit = { BIT_MASK(REL_WHEEL) }, }, /* A separate scrollwheel */ { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_KEYBIT | INPUT_DEVICE_ID_MATCH_ABSBIT, .evbit = { BIT_MASK(EV_KEY) | BIT_MASK(EV_ABS) }, .keybit = { [BIT_WORD(BTN_TOUCH)] = BIT_MASK(BTN_TOUCH) }, .absbit = { BIT_MASK(ABS_X) | BIT_MASK(ABS_Y) }, }, /* A tablet like device, at least touch detection, two absolute axes */ { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_KEYBIT | INPUT_DEVICE_ID_MATCH_ABSBIT, .evbit = { BIT_MASK(EV_KEY) | BIT_MASK(EV_ABS) }, .keybit = { [BIT_WORD(BTN_TOOL_FINGER)] = BIT_MASK(BTN_TOOL_FINGER) }, .absbit = { BIT_MASK(ABS_X) | BIT_MASK(ABS_Y) | BIT_MASK(ABS_PRESSURE) | BIT_MASK(ABS_TOOL_WIDTH) }, }, /* A touchpad */ { .flags = INPUT_DEVICE_ID_MATCH_EVBIT | INPUT_DEVICE_ID_MATCH_KEYBIT | INPUT_DEVICE_ID_MATCH_ABSBIT, .evbit = { BIT_MASK(EV_KEY) | BIT_MASK(EV_ABS) }, .keybit = { [BIT_WORD(BTN_LEFT)] = BIT_MASK(BTN_LEFT) }, .absbit = { BIT_MASK(ABS_X) | BIT_MASK(ABS_Y) }, }, /* Mouse-like device with absolute X and Y but ordinary clicks, like hp ILO2 High Performance mouse */ { }, /* Terminating entry */ }; MODULE_DEVICE_TABLE(input, mousedev_ids); static struct input_handler mousedev_handler = { .event = mousedev_event, .connect = mousedev_connect, .disconnect = mousedev_disconnect, .legacy_minors = true, .minor = MOUSEDEV_MINOR_BASE, .name = "mousedev", .id_table = mousedev_ids, }; #ifdef CONFIG_INPUT_MOUSEDEV_PSAUX #include <linux/miscdevice.h> static struct miscdevice psaux_mouse = { .minor = PSMOUSE_MINOR, .name = "psaux", .fops = &mousedev_fops, }; static bool psaux_registered; static void __init mousedev_psaux_register(void) { int error; error = misc_register(&psaux_mouse); if (error) pr_warn("could not register psaux device, error: %d\n", error); else psaux_registered = true; } static void __exit mousedev_psaux_unregister(void) { if (psaux_registered) misc_deregister(&psaux_mouse); } #else static inline void mousedev_psaux_register(void) { } static inline void mousedev_psaux_unregister(void) { } #endif static int __init mousedev_init(void) { int error; mousedev_mix = 
static int __init mousedev_init(void)
{
	int error;

	mousedev_mix = mousedev_create(NULL, &mousedev_handler, true);
	if (IS_ERR(mousedev_mix))
		return PTR_ERR(mousedev_mix);

	error = input_register_handler(&mousedev_handler);
	if (error) {
		mousedev_destroy(mousedev_mix);
		return error;
	}

	mousedev_psaux_register();

	pr_info("PS/2 mouse device common for all mice\n");

	return 0;
}

static void __exit mousedev_exit(void)
{
	mousedev_psaux_unregister();
	input_unregister_handler(&mousedev_handler);
	mousedev_destroy(mousedev_mix);
}

module_init(mousedev_init);
module_exit(mousedev_exit);
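/*
 * Editor's illustration, not part of the kernel sources: a minimal
 * user-space reader for the "mice" mixdev node created above. It assumes
 * the default 3-byte PS/2 report that mousedev emulates until a client
 * negotiates an extended protocol: button bits in byte 0, signed dx/dy
 * in bytes 1 and 2. A robust reader would also handle partial read()s;
 * this sketch keeps the loop minimal.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	unsigned char pkt[3];
	int fd = open("/dev/input/mice", O_RDONLY);

	if (fd < 0)
		return 1;

	while (read(fd, pkt, sizeof(pkt)) == (ssize_t)sizeof(pkt)) {
		printf("buttons=%c%c%c dx=%d dy=%d\n",
		       pkt[0] & 0x1 ? 'L' : '-',
		       pkt[0] & 0x4 ? 'M' : '-',
		       pkt[0] & 0x2 ? 'R' : '-',
		       (signed char)pkt[1], (signed char)pkt[2]);
	}

	close(fd);
	return 0;
}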
// SPDX-License-Identifier: GPL-2.0-only
/*
 * vivid-sdr-cap.c - software defined radio support functions.
 *
 * Copyright 2014 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
 */

#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/delay.h>
#include <linux/kthread.h>
#include <linux/freezer.h>
#include <linux/math64.h>
#include <linux/videodev2.h>
#include <linux/v4l2-dv-timings.h>
#include <media/v4l2-common.h>
#include <media/v4l2-event.h>
#include <media/v4l2-dv-timings.h>
#include <linux/fixp-arith.h>
#include <linux/jiffies.h>

#include "vivid-core.h"
#include "vivid-ctrls.h"
#include "vivid-sdr-cap.h"

/* stream formats */
struct vivid_format {
	u32	pixelformat;
	u32	buffersize;
};

/* format descriptions for capture and preview */
static const struct vivid_format formats[] = {
	{
		.pixelformat	= V4L2_SDR_FMT_CU8,
		.buffersize	= SDR_CAP_SAMPLES_PER_BUF * 2,
	}, {
		.pixelformat	= V4L2_SDR_FMT_CS8,
		.buffersize	= SDR_CAP_SAMPLES_PER_BUF * 2,
	},
};

static const struct v4l2_frequency_band bands_adc[] = {
	{
		.tuner = 0,
		.type = V4L2_TUNER_ADC,
		.index = 0,
		.capability = V4L2_TUNER_CAP_1HZ | V4L2_TUNER_CAP_FREQ_BANDS,
		.rangelow   =  300000,
		.rangehigh  =  300000,
	},
	{
		.tuner = 0,
		.type = V4L2_TUNER_ADC,
		.index = 1,
		.capability = V4L2_TUNER_CAP_1HZ | V4L2_TUNER_CAP_FREQ_BANDS,
		.rangelow   =  900001,
		.rangehigh  = 2800000,
	},
	{
		.tuner = 0,
		.type = V4L2_TUNER_ADC,
		.index = 2,
		.capability = V4L2_TUNER_CAP_1HZ | V4L2_TUNER_CAP_FREQ_BANDS,
		.rangelow   = 3200000,
		.rangehigh  = 3200000,
	},
};

/* ADC band midpoints */
#define BAND_ADC_0 ((bands_adc[0].rangehigh + bands_adc[1].rangelow) / 2)
#define BAND_ADC_1 ((bands_adc[1].rangehigh + bands_adc[2].rangelow) / 2)

static const struct v4l2_frequency_band bands_fm[] = {
	{
		.tuner = 1,
		.type = V4L2_TUNER_RF,
		.index = 0,
		.capability = V4L2_TUNER_CAP_1HZ | V4L2_TUNER_CAP_FREQ_BANDS,
		.rangelow   =   50000000,
		.rangehigh  = 2000000000,
	},
};

static void vivid_thread_sdr_cap_tick(struct vivid_dev *dev)
{
	struct vivid_buffer *sdr_cap_buf = NULL;

	dprintk(dev, 1, "SDR Capture Thread Tick\n");

	/* Drop a certain percentage of buffers. */
	if (dev->perc_dropped_buffers &&
	    get_random_u32_below(100) < dev->perc_dropped_buffers)
		return;

	spin_lock(&dev->slock);
	if (!list_empty(&dev->sdr_cap_active)) {
		sdr_cap_buf = list_entry(dev->sdr_cap_active.next,
					 struct vivid_buffer, list);
		list_del(&sdr_cap_buf->list);
	}
	spin_unlock(&dev->slock);

	if (sdr_cap_buf) {
		sdr_cap_buf->vb.sequence = dev->sdr_cap_with_seq_wrap_count;
		v4l2_ctrl_request_setup(sdr_cap_buf->vb.vb2_buf.req_obj.req,
					&dev->ctrl_hdl_sdr_cap);
		v4l2_ctrl_request_complete(sdr_cap_buf->vb.vb2_buf.req_obj.req,
					   &dev->ctrl_hdl_sdr_cap);
		vivid_sdr_cap_process(dev, sdr_cap_buf);
		sdr_cap_buf->vb.vb2_buf.timestamp =
			ktime_get_ns() + dev->time_wrap_offset;
		vb2_buffer_done(&sdr_cap_buf->vb.vb2_buf, dev->dqbuf_error ?
				VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
		dev->dqbuf_error = false;
	}
}

static int vivid_thread_sdr_cap(void *data)
{
	struct vivid_dev *dev = data;
	u64 samples_since_start;
	u64 buffers_since_start;
	u64 next_jiffies_since_start;
	unsigned long jiffies_since_start;
	unsigned long cur_jiffies;
	unsigned wait_jiffies;

	dprintk(dev, 1, "SDR Capture Thread Start\n");

	set_freezable();

	/* Resets frame counters */
	dev->sdr_cap_seq_offset = 0;
	dev->sdr_cap_seq_count = 0;
	dev->jiffies_sdr_cap = jiffies;
	dev->sdr_cap_seq_resync = false;
	if (dev->time_wrap)
		dev->time_wrap_offset = dev->time_wrap - ktime_get_ns();
	else
		dev->time_wrap_offset = 0;

	for (;;) {
		try_to_freeze();
		if (kthread_should_stop())
			break;

		if (!mutex_trylock(&dev->mutex)) {
			schedule();
			continue;
		}

		cur_jiffies = jiffies;
		if (dev->sdr_cap_seq_resync) {
			dev->jiffies_sdr_cap = cur_jiffies;
			dev->sdr_cap_seq_offset = dev->sdr_cap_seq_count + 1;
			dev->sdr_cap_seq_count = 0;
			dev->sdr_cap_seq_resync = false;
		}
		/* Calculate the number of jiffies since we started streaming */
		jiffies_since_start = cur_jiffies - dev->jiffies_sdr_cap;
		/* Get the number of buffers streamed since the start */
		buffers_since_start =
			(u64)jiffies_since_start * dev->sdr_adc_freq +
				      (HZ * SDR_CAP_SAMPLES_PER_BUF) / 2;
		do_div(buffers_since_start, HZ * SDR_CAP_SAMPLES_PER_BUF);

		/*
		 * After more than 0xf0000000 (rounded down to a multiple of
		 * 'jiffies-per-day' to ease jiffies_to_msecs calculation)
		 * jiffies have passed since we started streaming reset the
		 * counters and keep track of the sequence offset.
		 */
		if (jiffies_since_start > JIFFIES_RESYNC) {
			dev->jiffies_sdr_cap = cur_jiffies;
			dev->sdr_cap_seq_offset = buffers_since_start;
			buffers_since_start = 0;
		}
		dev->sdr_cap_seq_count =
			buffers_since_start + dev->sdr_cap_seq_offset;
		dev->sdr_cap_with_seq_wrap_count =
			dev->sdr_cap_seq_count - dev->sdr_cap_seq_start;

		vivid_thread_sdr_cap_tick(dev);
		mutex_unlock(&dev->mutex);

		/*
		 * Calculate the number of samples streamed since we started,
		 * not including the current buffer.
		 */
		samples_since_start = buffers_since_start *
				      SDR_CAP_SAMPLES_PER_BUF;

		/* And the number of jiffies since we started */
		jiffies_since_start = jiffies - dev->jiffies_sdr_cap;

		/* Increase by the number of samples in one buffer */
		samples_since_start += SDR_CAP_SAMPLES_PER_BUF;

		/*
		 * Calculate when that next buffer is supposed to start
		 * in jiffies since we started streaming.
		 */
		next_jiffies_since_start = samples_since_start * HZ +
					   dev->sdr_adc_freq / 2;
		do_div(next_jiffies_since_start, dev->sdr_adc_freq);

		/* If it is in the past, then just schedule asap */
		if (next_jiffies_since_start < jiffies_since_start)
			next_jiffies_since_start = jiffies_since_start;

		wait_jiffies = next_jiffies_since_start - jiffies_since_start;
		if (!time_is_after_jiffies(cur_jiffies + wait_jiffies))
			continue;

		wait_queue_head_t wait;

		init_waitqueue_head(&wait);
		wait_event_interruptible_timeout(wait, kthread_should_stop(),
					cur_jiffies + wait_jiffies - jiffies);
	}
	dprintk(dev, 1, "SDR Capture Thread End\n");
	return 0;
}
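/*
 * Editor's illustration, not part of vivid: the buffer-pacing arithmetic
 * used by the capture thread above, restated as a stand-alone check.
 * HZ = 250 and 1000 samples per buffer are placeholder values chosen for
 * easy mental math (the real SDR_CAP_SAMPLES_PER_BUF comes from the vivid
 * headers); at an ADC rate of 300 kHz this yields 300 buffers per second.
 */
#include <stdio.h>

#define HZ		 250ULL
#define SAMPLES_PER_BUF	1000ULL

int main(void)
{
	unsigned long long adc_freq = 300000;		/* samples/second */
	unsigned long long jiffies_since_start = HZ;	/* exactly 1 second */

	/* rounded division, mirroring the do_div() sequence in the thread */
	unsigned long long buffers =
		(jiffies_since_start * adc_freq + (HZ * SAMPLES_PER_BUF) / 2) /
		(HZ * SAMPLES_PER_BUF);

	printf("buffers after 1 s @ %llu Hz: %llu\n", adc_freq, buffers); /* 300 */
	return 0;
}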
static int sdr_cap_queue_setup(struct vb2_queue *vq,
		       unsigned *nbuffers, unsigned *nplanes,
		       unsigned sizes[], struct device *alloc_devs[])
{
	/* 2 = max 16-bit sample returned */
	u32 size = SDR_CAP_SAMPLES_PER_BUF * 2;

	if (*nplanes)
		return sizes[0] < size ? -EINVAL : 0;
	*nplanes = 1;
	sizes[0] = size;
	return 0;
}

static int sdr_cap_buf_prepare(struct vb2_buffer *vb)
{
	struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue);
	unsigned size = SDR_CAP_SAMPLES_PER_BUF * 2;

	dprintk(dev, 1, "%s\n", __func__);

	if (dev->buf_prepare_error) {
		/*
		 * Error injection: test what happens if buf_prepare() returns
		 * an error.
		 */
		dev->buf_prepare_error = false;
		return -EINVAL;
	}
	if (vb2_plane_size(vb, 0) < size) {
		dprintk(dev, 1, "%s data will not fit into plane (%lu < %u)\n",
			__func__, vb2_plane_size(vb, 0), size);
		return -EINVAL;
	}
	vb2_set_plane_payload(vb, 0, size);

	return 0;
}

static void sdr_cap_buf_queue(struct vb2_buffer *vb)
{
	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
	struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue);
	struct vivid_buffer *buf = container_of(vbuf, struct vivid_buffer, vb);

	dprintk(dev, 1, "%s\n", __func__);

	spin_lock(&dev->slock);
	list_add_tail(&buf->list, &dev->sdr_cap_active);
	spin_unlock(&dev->slock);
}

static int sdr_cap_start_streaming(struct vb2_queue *vq, unsigned count)
{
	struct vivid_dev *dev = vb2_get_drv_priv(vq);
	int err = 0;

	dprintk(dev, 1, "%s\n", __func__);
	dev->sdr_cap_seq_start = dev->seq_wrap * 128;
	if (dev->start_streaming_error) {
		dev->start_streaming_error = false;
		err = -EINVAL;
	} else if (dev->kthread_sdr_cap == NULL) {
		dev->kthread_sdr_cap = kthread_run(vivid_thread_sdr_cap, dev,
				"%s-sdr-cap", dev->v4l2_dev.name);

		if (IS_ERR(dev->kthread_sdr_cap)) {
			v4l2_err(&dev->v4l2_dev, "kernel_thread() failed\n");
			err = PTR_ERR(dev->kthread_sdr_cap);
			dev->kthread_sdr_cap = NULL;
		}
	}

	if (err) {
		struct vivid_buffer *buf, *tmp;

		list_for_each_entry_safe(buf, tmp, &dev->sdr_cap_active, list) {
			list_del(&buf->list);
			vb2_buffer_done(&buf->vb.vb2_buf,
					VB2_BUF_STATE_QUEUED);
		}
	}
	return err;
}

/* abort streaming and wait for last buffer */
static void sdr_cap_stop_streaming(struct vb2_queue *vq)
{
	struct vivid_dev *dev = vb2_get_drv_priv(vq);

	if (dev->kthread_sdr_cap == NULL)
		return;

	while (!list_empty(&dev->sdr_cap_active)) {
		struct vivid_buffer *buf;

		buf = list_entry(dev->sdr_cap_active.next,
				 struct vivid_buffer, list);
		list_del(&buf->list);
		v4l2_ctrl_request_complete(buf->vb.vb2_buf.req_obj.req,
					   &dev->ctrl_hdl_sdr_cap);
		vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
	}

	/* shutdown control thread */
	kthread_stop(dev->kthread_sdr_cap);
	dev->kthread_sdr_cap = NULL;
}

static void sdr_cap_buf_request_complete(struct vb2_buffer *vb)
{
	struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue);

	v4l2_ctrl_request_complete(vb->req_obj.req, &dev->ctrl_hdl_sdr_cap);
}

const struct vb2_ops vivid_sdr_cap_qops = {
	.queue_setup		= sdr_cap_queue_setup,
	.buf_prepare		= sdr_cap_buf_prepare,
	.buf_queue		= sdr_cap_buf_queue,
	.start_streaming	= sdr_cap_start_streaming,
	.stop_streaming		= sdr_cap_stop_streaming,
	.buf_request_complete	= sdr_cap_buf_request_complete,
};

int vivid_sdr_enum_freq_bands(struct file *file, void *fh,
			      struct v4l2_frequency_band *band)
{
	switch (band->tuner) {
	case 0:
		if (band->index >= ARRAY_SIZE(bands_adc))
			return -EINVAL;
		*band = bands_adc[band->index];
		return 0;
	case 1:
		if (band->index >= ARRAY_SIZE(bands_fm))
			return -EINVAL;
		*band = bands_fm[band->index];
		return 0;
	default:
		return -EINVAL;
	}
}

int vivid_sdr_g_frequency(struct file *file, void *fh,
			  struct v4l2_frequency *vf)
{
	struct vivid_dev *dev = video_drvdata(file);

	switch (vf->tuner) {
	case 0:
		vf->frequency = dev->sdr_adc_freq;
		vf->type = V4L2_TUNER_ADC;
		return 0;
	case 1:
		vf->frequency = dev->sdr_fm_freq;
		vf->type = V4L2_TUNER_RF;
		return 0;
	default:
		return -EINVAL;
	}
}

int vivid_sdr_s_frequency(struct file *file, void *fh,
			  const struct v4l2_frequency *vf)
{
	struct vivid_dev *dev = video_drvdata(file);
	unsigned freq = vf->frequency;
	unsigned band;

	switch (vf->tuner) {
	case 0:
		if (vf->type != V4L2_TUNER_ADC)
			return -EINVAL;

		if (freq < BAND_ADC_0)
			band = 0;
		else if (freq < BAND_ADC_1)
			band = 1;
		else
			band = 2;

		freq = clamp_t(unsigned, freq,
			       bands_adc[band].rangelow,
			       bands_adc[band].rangehigh);

		if (vb2_is_streaming(&dev->vb_sdr_cap_q) &&
		    freq != dev->sdr_adc_freq) {
			/* resync the thread's timings */
			dev->sdr_cap_seq_resync = true;
		}
		dev->sdr_adc_freq = freq;
		return 0;
	case 1:
		if (vf->type != V4L2_TUNER_RF)
			return -EINVAL;
		dev->sdr_fm_freq = clamp_t(unsigned, freq,
					   bands_fm[0].rangelow,
					   bands_fm[0].rangehigh);
		return 0;
	default:
		return -EINVAL;
	}
}

int vivid_sdr_g_tuner(struct file *file, void *fh, struct v4l2_tuner *vt)
{
	switch (vt->index) {
	case 0:
		strscpy(vt->name, "ADC", sizeof(vt->name));
		vt->type = V4L2_TUNER_ADC;
		vt->capability =
			V4L2_TUNER_CAP_1HZ | V4L2_TUNER_CAP_FREQ_BANDS;
		vt->rangelow = bands_adc[0].rangelow;
		vt->rangehigh = bands_adc[2].rangehigh;
		return 0;
	case 1:
		strscpy(vt->name, "RF", sizeof(vt->name));
		vt->type = V4L2_TUNER_RF;
		vt->capability =
			V4L2_TUNER_CAP_1HZ | V4L2_TUNER_CAP_FREQ_BANDS;
		vt->rangelow = bands_fm[0].rangelow;
		vt->rangehigh = bands_fm[0].rangehigh;
		return 0;
	default:
		return -EINVAL;
	}
}

int vivid_sdr_s_tuner(struct file *file, void *fh, const struct v4l2_tuner *vt)
{
	if (vt->index > 1)
		return -EINVAL;
	return 0;
}

int vidioc_enum_fmt_sdr_cap(struct file *file, void *fh,
			    struct v4l2_fmtdesc *f)
{
	if (f->index >= ARRAY_SIZE(formats))
		return -EINVAL;
	f->pixelformat = formats[f->index].pixelformat;
	return 0;
}

int vidioc_g_fmt_sdr_cap(struct file *file, void *fh, struct v4l2_format *f)
{
	struct vivid_dev *dev = video_drvdata(file);

	f->fmt.sdr.pixelformat = dev->sdr_pixelformat;
	f->fmt.sdr.buffersize = dev->sdr_buffersize;
	return 0;
}

int vidioc_s_fmt_sdr_cap(struct file *file, void *fh, struct v4l2_format *f)
{
	struct vivid_dev *dev = video_drvdata(file);
	struct vb2_queue *q = &dev->vb_sdr_cap_q;
	int i;

	if (vb2_is_busy(q))
		return -EBUSY;

	for (i = 0; i < ARRAY_SIZE(formats); i++) {
		if (formats[i].pixelformat == f->fmt.sdr.pixelformat) {
			dev->sdr_pixelformat = formats[i].pixelformat;
			dev->sdr_buffersize = formats[i].buffersize;
			f->fmt.sdr.buffersize = formats[i].buffersize;
			return 0;
		}
	}
	dev->sdr_pixelformat = formats[0].pixelformat;
	dev->sdr_buffersize = formats[0].buffersize;
	f->fmt.sdr.pixelformat = formats[0].pixelformat;
	f->fmt.sdr.buffersize = formats[0].buffersize;
	return 0;
}

int vidioc_try_fmt_sdr_cap(struct file *file, void *fh, struct v4l2_format *f)
{
	int i;

	for (i = 0; i < ARRAY_SIZE(formats); i++) {
		if (formats[i].pixelformat == f->fmt.sdr.pixelformat) {
			f->fmt.sdr.buffersize = formats[i].buffersize;
			return 0;
		}
	}
	f->fmt.sdr.pixelformat = formats[0].pixelformat;
	f->fmt.sdr.buffersize = formats[0].buffersize;
	return 0;
}

#define FIXP_N    (15)
#define FIXP_FRAC (1 << FIXP_N)
#define FIXP_2PI  ((int)(2 * 3.141592653589 * FIXP_FRAC))
#define M_100000PI (3.14159 * 100000)

void vivid_sdr_cap_process(struct vivid_dev *dev, struct vivid_buffer *buf)
{
	u8 *vbuf = vb2_plane_vaddr(&buf->vb.vb2_buf, 0);
	unsigned long i;
	unsigned long plane_size = vb2_plane_size(&buf->vb.vb2_buf, 0);
	s64 s64tmp;
	s32 src_phase_step;
	s32 mod_phase_step;
	s32 fixp_i;
	s32 fixp_q;

	/* calculate phase step */
	#define BEEP_FREQ 1000 /* 1kHz beep */
	src_phase_step = DIV_ROUND_CLOSEST(FIXP_2PI * BEEP_FREQ,
					   dev->sdr_adc_freq);

	for (i = 0; i < plane_size; i += 2) {
		mod_phase_step = fixp_cos32_rad(dev->sdr_fixp_src_phase,
						FIXP_2PI) >> (31 - FIXP_N);

		dev->sdr_fixp_src_phase += src_phase_step;
		s64tmp = (s64) mod_phase_step * dev->sdr_fm_deviation;
		dev->sdr_fixp_mod_phase += div_s64(s64tmp, M_100000PI);

		/*
		 * Transfer phase angle to [0, 2xPI] in order to avoid variable
		 * overflow and make it suitable for cosine implementation
		 * used, which does not support negative angles.
		 */
		dev->sdr_fixp_src_phase %= FIXP_2PI;
		dev->sdr_fixp_mod_phase %= FIXP_2PI;

		if (dev->sdr_fixp_mod_phase < 0)
			dev->sdr_fixp_mod_phase += FIXP_2PI;

		fixp_i = fixp_cos32_rad(dev->sdr_fixp_mod_phase, FIXP_2PI);
		fixp_q = fixp_sin32_rad(dev->sdr_fixp_mod_phase, FIXP_2PI);

		/*
		 * Normalize fraction values represented with 32 bit precision
		 * to fixed point representation with FIXP_N bits
		 */
		fixp_i >>= (31 - FIXP_N);
		fixp_q >>= (31 - FIXP_N);

		switch (dev->sdr_pixelformat) {
		case V4L2_SDR_FMT_CU8:
			/* convert 'fixp float' to u8 [0, +255] */
			/* u8 = X * 127.5 + 127.5; X is float [-1.0, +1.0] */
			fixp_i = fixp_i * 1275 + FIXP_FRAC * 1275;
			fixp_q = fixp_q * 1275 + FIXP_FRAC * 1275;
			*vbuf++ = DIV_ROUND_CLOSEST(fixp_i, FIXP_FRAC * 10);
			*vbuf++ = DIV_ROUND_CLOSEST(fixp_q, FIXP_FRAC * 10);
			break;
		case V4L2_SDR_FMT_CS8:
			/* convert 'fixp float' to s8 [-128, +127] */
			/* s8 = X * 127.5 - 0.5; X is float [-1.0, +1.0] */
			fixp_i = fixp_i * 1275 - FIXP_FRAC * 5;
			fixp_q = fixp_q * 1275 - FIXP_FRAC * 5;
			*vbuf++ = DIV_ROUND_CLOSEST(fixp_i, FIXP_FRAC * 10);
			*vbuf++ = DIV_ROUND_CLOSEST(fixp_q, FIXP_FRAC * 10);
			break;
		default:
			break;
		}
	}
}
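/*
 * Editor's illustration, not part of vivid: the CU8 conversion above
 * computes u8 = X * 127.5 + 127.5 in fixed point by scaling 127.5 to the
 * integer 1275 and dividing by FIXP_FRAC * 10 at the end. This stand-alone
 * check exercises the endpoints X = -1.0, 0.0 and +1.0. The rounding
 * macro below is the simplified non-negative-numerator form, which is all
 * this demo needs.
 */
#include <assert.h>
#include <stdio.h>

#define FIXP_N    15
#define FIXP_FRAC (1 << FIXP_N)

/* rounding division for non-negative numerators */
#define DIV_ROUND_CLOSEST(x, d) (((x) + ((d) / 2)) / (d))

/* x is X in Q15 fixed point, in [-FIXP_FRAC, +FIXP_FRAC] */
static unsigned char cu8_from_fixp(int x)
{
	return DIV_ROUND_CLOSEST(x * 1275 + FIXP_FRAC * 1275, FIXP_FRAC * 10);
}

int main(void)
{
	assert(cu8_from_fixp(-FIXP_FRAC) == 0);		/* X = -1.0 -> 0   */
	assert(cu8_from_fixp(0) == 128);		/* X =  0.0 -> 128 */
	assert(cu8_from_fixp(FIXP_FRAC) == 255);	/* X = +1.0 -> 255 */
	printf("CU8 endpoint mapping OK\n");
	return 0;
}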
// SPDX-License-Identifier: GPL-2.0
/* Copyright (C) B.A.T.M.A.N. contributors:
 *
 * Marek Lindner, Simon Wunderlich
 */

#include "send.h"
#include "main.h"

#include <linux/atomic.h>
#include <linux/bug.h>
#include <linux/byteorder/generic.h>
#include <linux/container_of.h>
#include <linux/errno.h>
#include <linux/etherdevice.h>
#include <linux/gfp.h>
#include <linux/if.h>
#include <linux/if_ether.h>
#include <linux/jiffies.h>
#include <linux/kref.h>
#include <linux/list.h>
#include <linux/netdevice.h>
#include <linux/printk.h>
#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/stddef.h>
#include <linux/workqueue.h>

#include "distributed-arp-table.h"
#include "fragmentation.h"
#include "gateway_client.h"
#include "hard-interface.h"
#include "log.h"
#include "mesh-interface.h"
#include "network-coding.h"
#include "originator.h"
#include "routing.h"
#include "translation-table.h"

static void batadv_send_outstanding_bcast_packet(struct work_struct *work);

/**
 * batadv_send_skb_packet() - send an already prepared packet
 * @skb: the packet to send
 * @hard_iface: the interface to use to send the broadcast packet
 * @dst_addr: the payload destination
 *
 * Send out an already prepared packet to the given neighbor or broadcast it
 * using the specified interface. Either hard_iface or neigh_node must be not
 * NULL.
 * If neigh_node is NULL, then the packet is broadcasted using hard_iface,
 * otherwise it is sent as unicast to the given neighbor.
 *
 * Regardless of the return value, the skb is consumed.
 *
 * Return: A negative errno code is returned on a failure. A success does not
 * guarantee the frame will be transmitted as it may be dropped due
 * to congestion or traffic shaping.
 */
int batadv_send_skb_packet(struct sk_buff *skb,
			   struct batadv_hard_iface *hard_iface,
			   const u8 *dst_addr)
{
	struct batadv_priv *bat_priv;
	struct ethhdr *ethhdr;
	int ret;

	bat_priv = netdev_priv(hard_iface->mesh_iface);

	if (hard_iface->if_status != BATADV_IF_ACTIVE)
		goto send_skb_err;

	if (unlikely(!hard_iface->net_dev))
		goto send_skb_err;

	if (!(hard_iface->net_dev->flags & IFF_UP)) {
		pr_warn("Interface %s is not up - can't send packet via that interface!\n",
			hard_iface->net_dev->name);
		goto send_skb_err;
	}

	/* push to the ethernet header. */
	if (batadv_skb_head_push(skb, ETH_HLEN) < 0)
		goto send_skb_err;

	skb_reset_mac_header(skb);

	ethhdr = eth_hdr(skb);
	ether_addr_copy(ethhdr->h_source, hard_iface->net_dev->dev_addr);
	ether_addr_copy(ethhdr->h_dest, dst_addr);
	ethhdr->h_proto = htons(ETH_P_BATMAN);

	skb_set_network_header(skb, ETH_HLEN);
	skb->protocol = htons(ETH_P_BATMAN);

	skb->dev = hard_iface->net_dev;

	/* Save a clone of the skb to use when decoding coded packets */
	batadv_nc_skb_store_for_decoding(bat_priv, skb);

	/* dev_queue_xmit() returns a negative result on error. However on
	 * congestion and traffic shaping, it drops and returns NET_XMIT_DROP
	 * (which is > 0). This will not be treated as an error.
	 */
	ret = dev_queue_xmit(skb);
	return net_xmit_eval(ret);
send_skb_err:
	kfree_skb(skb);
	return NET_XMIT_DROP;
}

/**
 * batadv_send_broadcast_skb() - Send broadcast packet via hard interface
 * @skb: packet to be transmitted (with batadv header and no outer eth header)
 * @hard_iface: outgoing interface
 *
 * Return: A negative errno code is returned on a failure. A success does not
 * guarantee the frame will be transmitted as it may be dropped due
 * to congestion or traffic shaping.
 */
int batadv_send_broadcast_skb(struct sk_buff *skb,
			      struct batadv_hard_iface *hard_iface)
{
	return batadv_send_skb_packet(skb, hard_iface, batadv_broadcast_addr);
}

/**
 * batadv_send_unicast_skb() - Send unicast packet to neighbor
 * @skb: packet to be transmitted (with batadv header and no outer eth header)
 * @neigh: neighbor which is used as next hop to destination
 *
 * Return: A negative errno code is returned on a failure. A success does not
 * guarantee the frame will be transmitted as it may be dropped due
 * to congestion or traffic shaping.
 */
int batadv_send_unicast_skb(struct sk_buff *skb,
			    struct batadv_neigh_node *neigh)
{
#ifdef CONFIG_BATMAN_ADV_BATMAN_V
	struct batadv_hardif_neigh_node *hardif_neigh;
#endif
	int ret;

	ret = batadv_send_skb_packet(skb, neigh->if_incoming, neigh->addr);

#ifdef CONFIG_BATMAN_ADV_BATMAN_V
	hardif_neigh = batadv_hardif_neigh_get(neigh->if_incoming, neigh->addr);

	if (hardif_neigh && ret != NET_XMIT_DROP)
		hardif_neigh->bat_v.last_unicast_tx = jiffies;

	batadv_hardif_neigh_put(hardif_neigh);
#endif

	return ret;
}
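/*
 * Editor's illustration, not part of send.c: the return-code folding
 * described in the dev_queue_xmit() comment above. The NET_XMIT_* values
 * and the net_xmit_eval() mapping are restated here from
 * include/linux/netdevice.h as stand-alone definitions so the snippet
 * compiles in user space: negative errnos pass through, congestion
 * (NET_XMIT_CN) is folded to success, and NET_XMIT_DROP stays positive.
 */
#include <stdio.h>

#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01
#define NET_XMIT_CN      0x02	/* congestion notification */

/* congestion counts as success; everything else is returned as-is */
#define net_xmit_eval(e) ((e) == NET_XMIT_CN ? 0 : (e))

int main(void)
{
	printf("SUCCESS -> %d\n", net_xmit_eval(NET_XMIT_SUCCESS));	/* 0 */
	printf("CN      -> %d\n", net_xmit_eval(NET_XMIT_CN));		/* 0 */
	printf("DROP    -> %d\n", net_xmit_eval(NET_XMIT_DROP));	/* 1 */
	printf("errno   -> %d\n", net_xmit_eval(-100));			/* -100 */
	return 0;
}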
/**
 * batadv_send_skb_to_orig() - Lookup next-hop and transmit skb.
 * @skb: Packet to be transmitted.
 * @orig_node: Final destination of the packet.
 * @recv_if: Interface used when receiving the packet (can be NULL).
 *
 * Looks up the best next-hop towards the passed originator and passes the
 * skb on for preparation of MAC header. If the packet originated from this
 * host, NULL can be passed as recv_if and no interface alternating is
 * attempted.
 *
 * Return: negative errno code on a failure, -EINPROGRESS if the skb is
 * buffered for later transmit or the NET_XMIT status returned by the
 * lower routine if the packet has been passed down.
 */
int batadv_send_skb_to_orig(struct sk_buff *skb,
			    struct batadv_orig_node *orig_node,
			    struct batadv_hard_iface *recv_if)
{
	struct batadv_priv *bat_priv = orig_node->bat_priv;
	struct batadv_neigh_node *neigh_node;
	int ret;

	/* batadv_find_router() increases neigh_nodes refcount if found. */
	neigh_node = batadv_find_router(bat_priv, orig_node, recv_if);
	if (!neigh_node) {
		ret = -EINVAL;
		goto free_skb;
	}

	/* Check if the skb is too large to send in one piece and fragment
	 * it if needed.
	 */
	if (atomic_read(&bat_priv->fragmentation) &&
	    skb->len > neigh_node->if_incoming->net_dev->mtu) {
		/* Fragment and send packet. */
		ret = batadv_frag_send_packet(skb, orig_node, neigh_node);
		/* skb was consumed */
		skb = NULL;

		goto put_neigh_node;
	}

	/* try to network code the packet, if it is received on an interface
	 * (i.e. being forwarded). If the packet originates from this node or if
	 * network coding fails, then send the packet as usual.
	 */
	if (recv_if && batadv_nc_skb_forward(skb, neigh_node))
		ret = -EINPROGRESS;
	else
		ret = batadv_send_unicast_skb(skb, neigh_node);

	/* skb was consumed */
	skb = NULL;

put_neigh_node:
	batadv_neigh_node_put(neigh_node);
free_skb:
	kfree_skb(skb);

	return ret;
}

/**
 * batadv_send_skb_push_fill_unicast() - extend the buffer and initialize the
 *  common fields for unicast packets
 * @skb: the skb carrying the unicast header to initialize
 * @hdr_size: amount of bytes to push at the beginning of the skb
 * @orig_node: the destination node
 *
 * Return: false if the buffer extension was not possible or true otherwise.
 */
static bool
batadv_send_skb_push_fill_unicast(struct sk_buff *skb, int hdr_size,
				  struct batadv_orig_node *orig_node)
{
	struct batadv_unicast_packet *unicast_packet;
	u8 ttvn = (u8)atomic_read(&orig_node->last_ttvn);

	if (batadv_skb_head_push(skb, hdr_size) < 0)
		return false;

	unicast_packet = (struct batadv_unicast_packet *)skb->data;
	unicast_packet->version = BATADV_COMPAT_VERSION;
	/* batman packet type: unicast */
	unicast_packet->packet_type = BATADV_UNICAST;
	/* set unicast ttl */
	unicast_packet->ttl = BATADV_TTL;
	/* copy the destination for faster routing */
	ether_addr_copy(unicast_packet->dest, orig_node->orig);
	/* set the destination tt version number */
	unicast_packet->ttvn = ttvn;

	return true;
}

/**
 * batadv_send_skb_prepare_unicast() - encapsulate an skb with a unicast header
 * @skb: the skb containing the payload to encapsulate
 * @orig_node: the destination node
 *
 * Return: false if the payload could not be encapsulated or true otherwise.
 */
static bool batadv_send_skb_prepare_unicast(struct sk_buff *skb,
					    struct batadv_orig_node *orig_node)
{
	size_t uni_size = sizeof(struct batadv_unicast_packet);

	return batadv_send_skb_push_fill_unicast(skb, uni_size, orig_node);
}

/**
 * batadv_send_skb_prepare_unicast_4addr() - encapsulate an skb with a
 *  unicast 4addr header
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: the skb containing the payload to encapsulate
 * @orig: the destination node
 * @packet_subtype: the unicast 4addr packet subtype to use
 *
 * Return: false if the payload could not be encapsulated or true otherwise.
 */
bool batadv_send_skb_prepare_unicast_4addr(struct batadv_priv *bat_priv,
					   struct sk_buff *skb,
					   struct batadv_orig_node *orig,
					   int packet_subtype)
{
	struct batadv_hard_iface *primary_if;
	struct batadv_unicast_4addr_packet *uc_4addr_packet;
	bool ret = false;

	primary_if = batadv_primary_if_get_selected(bat_priv);
	if (!primary_if)
		goto out;

	/* Pull the header space and fill the unicast_packet substructure.
	 * We can do that because the first member of the uc_4addr_packet
	 * is of type struct unicast_packet
	 */
	if (!batadv_send_skb_push_fill_unicast(skb, sizeof(*uc_4addr_packet),
					       orig))
		goto out;

	uc_4addr_packet = (struct batadv_unicast_4addr_packet *)skb->data;
	uc_4addr_packet->u.packet_type = BATADV_UNICAST_4ADDR;
	ether_addr_copy(uc_4addr_packet->src, primary_if->net_dev->dev_addr);
	uc_4addr_packet->subtype = packet_subtype;
	uc_4addr_packet->reserved = 0;

	ret = true;
out:
	batadv_hardif_put(primary_if);
	return ret;
}

/**
 * batadv_send_skb_unicast() - encapsulate and send an skb via unicast
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: payload to send
 * @packet_type: the batman unicast packet type to use
 * @packet_subtype: the unicast 4addr packet subtype (only relevant for unicast
 *  4addr packets)
 * @orig_node: the originator to send the packet to
 * @vid: the vid to be used to search the translation table
 *
 * Wrap the given skb into a batman-adv unicast or unicast-4addr header
 * depending on whether BATADV_UNICAST or BATADV_UNICAST_4ADDR was supplied
 * as packet_type. Then send this frame to the given orig_node.
 *
 * Return: NET_XMIT_DROP in case of error or NET_XMIT_SUCCESS otherwise.
 */
int batadv_send_skb_unicast(struct batadv_priv *bat_priv,
			    struct sk_buff *skb, int packet_type,
			    int packet_subtype,
			    struct batadv_orig_node *orig_node,
			    unsigned short vid)
{
	struct batadv_unicast_packet *unicast_packet;
	struct ethhdr *ethhdr;
	int ret = NET_XMIT_DROP;

	if (!orig_node)
		goto out;

	switch (packet_type) {
	case BATADV_UNICAST:
		if (!batadv_send_skb_prepare_unicast(skb, orig_node))
			goto out;
		break;
	case BATADV_UNICAST_4ADDR:
		if (!batadv_send_skb_prepare_unicast_4addr(bat_priv, skb,
							   orig_node,
							   packet_subtype))
			goto out;
		break;
	default:
		/* this function supports UNICAST and UNICAST_4ADDR only. It
		 * should never be invoked with any other packet type
		 */
		goto out;
	}

	/* skb->data might have been reallocated by
	 * batadv_send_skb_prepare_unicast{,_4addr}()
	 */
	ethhdr = eth_hdr(skb);
	unicast_packet = (struct batadv_unicast_packet *)skb->data;

	/* inform the destination node that we are still missing a correct route
	 * for this client. The destination will receive this packet and will
	 * try to reroute it because the ttvn contained in the header is less
	 * than the current one
	 */
	if (batadv_tt_global_client_is_roaming(bat_priv, ethhdr->h_dest, vid))
		unicast_packet->ttvn = unicast_packet->ttvn - 1;

	ret = batadv_send_skb_to_orig(skb, orig_node, NULL);
	/* skb was consumed */
	skb = NULL;

out:
	kfree_skb(skb);

	return ret;
}

/**
 * batadv_send_skb_via_tt_generic() - send an skb via TT lookup
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: payload to send
 * @packet_type: the batman unicast packet type to use
 * @packet_subtype: the unicast 4addr packet subtype (only relevant for unicast
 *  4addr packets)
 * @dst_hint: can be used to override the destination contained in the skb
 * @vid: the vid to be used to search the translation table
 *
 * Look up the recipient node for the destination address in the ethernet
 * header via the translation table. Wrap the given skb into a batman-adv
 * unicast or unicast-4addr header depending on whether BATADV_UNICAST or
 * BATADV_UNICAST_4ADDR was supplied as packet_type. Then send this frame
 * to the according destination node.
 *
 * Return: NET_XMIT_DROP in case of error or NET_XMIT_SUCCESS otherwise.
 */
int batadv_send_skb_via_tt_generic(struct batadv_priv *bat_priv,
				   struct sk_buff *skb, int packet_type,
				   int packet_subtype, u8 *dst_hint,
				   unsigned short vid)
{
	struct ethhdr *ethhdr = (struct ethhdr *)skb->data;
	struct batadv_orig_node *orig_node;
	u8 *src, *dst;
	int ret;

	src = ethhdr->h_source;
	dst = ethhdr->h_dest;

	/* if we got a hint, send the packet to this client (if any) */
	if (dst_hint) {
		src = NULL;
		dst = dst_hint;
	}
	orig_node = batadv_transtable_search(bat_priv, src, dst, vid);

	ret = batadv_send_skb_unicast(bat_priv, skb, packet_type,
				      packet_subtype, orig_node, vid);

	batadv_orig_node_put(orig_node);

	return ret;
}

/**
 * batadv_send_skb_via_gw() - send an skb via gateway lookup
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: payload to send
 * @vid: the vid to be used to search the translation table
 *
 * Look up the currently selected gateway. Wrap the given skb into a batman-adv
 * unicast header and send this frame to this gateway node.
 *
 * Return: NET_XMIT_DROP in case of error or NET_XMIT_SUCCESS otherwise.
 */
int batadv_send_skb_via_gw(struct batadv_priv *bat_priv, struct sk_buff *skb,
			   unsigned short vid)
{
	struct batadv_orig_node *orig_node;
	int ret;

	orig_node = batadv_gw_get_selected_orig(bat_priv);
	ret = batadv_send_skb_unicast(bat_priv, skb, BATADV_UNICAST_4ADDR,
				      BATADV_P_DATA, orig_node, vid);

	batadv_orig_node_put(orig_node);

	return ret;
}

/**
 * batadv_forw_packet_free() - free a forwarding packet
 * @forw_packet: The packet to free
 * @dropped: whether the packet is freed because it is dropped
 *
 * This frees a forwarding packet and releases any resources it might
 * have claimed.
 */
void batadv_forw_packet_free(struct batadv_forw_packet *forw_packet,
			     bool dropped)
{
	if (dropped)
		kfree_skb(forw_packet->skb);
	else
		consume_skb(forw_packet->skb);

	batadv_hardif_put(forw_packet->if_incoming);
	batadv_hardif_put(forw_packet->if_outgoing);

	if (forw_packet->queue_left)
		atomic_inc(forw_packet->queue_left);

	kfree(forw_packet);
}
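/*
 * Editor's illustration, not part of send.c: the queue throttling used by
 * batadv_forw_packet_alloc() below. A counter starts at the queue size and
 * is decremented only while non-zero (the batadv_atomic_dec_not_zero()
 * idiom, modelled here with a C11 compare-and-swap loop); allocation fails
 * once every slot is taken, and freeing a packet returns its slot.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* mirrors the semantics of batadv_atomic_dec_not_zero() */
static bool dec_not_zero(atomic_int *v)
{
	int cur = atomic_load(v);

	while (cur != 0)
		if (atomic_compare_exchange_weak(v, &cur, cur - 1))
			return true;
	return false;	/* already zero: queue is full */
}

int main(void)
{
	atomic_int bcast_queue_left = 2;	/* tiny queue for the demo */
	int claimed = 0;

	while (dec_not_zero(&bcast_queue_left))	/* alloc path */
		claimed++;
	printf("claimed %d slots; next alloc would fail\n", claimed);	/* 2 */

	atomic_fetch_add(&bcast_queue_left, 1);	/* free path returns a slot */
	printf("slots left after free: %d\n", atomic_load(&bcast_queue_left));
	return 0;
}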
/**
 * batadv_forw_packet_alloc() - allocate a forwarding packet
 * @if_incoming: The (optional) if_incoming to be grabbed
 * @if_outgoing: The (optional) if_outgoing to be grabbed
 * @queue_left: The (optional) queue counter to decrease
 * @bat_priv: The bat_priv for the mesh of this forw_packet
 * @skb: The raw packet this forwarding packet shall contain
 *
 * Allocates a forwarding packet and tries to get a reference to the
 * (optional) if_incoming, if_outgoing and queue_left. If queue_left
 * is NULL then bat_priv is optional, too.
 *
 * Return: An allocated forwarding packet on success, NULL otherwise.
 */
struct batadv_forw_packet *
batadv_forw_packet_alloc(struct batadv_hard_iface *if_incoming,
			 struct batadv_hard_iface *if_outgoing,
			 atomic_t *queue_left,
			 struct batadv_priv *bat_priv,
			 struct sk_buff *skb)
{
	struct batadv_forw_packet *forw_packet;
	const char *qname;

	if (queue_left && !batadv_atomic_dec_not_zero(queue_left)) {
		qname = "unknown";

		if (queue_left == &bat_priv->bcast_queue_left)
			qname = "bcast";

		if (queue_left == &bat_priv->batman_queue_left)
			qname = "batman";

		batadv_dbg(BATADV_DBG_BATMAN, bat_priv,
			   "%s queue is full\n", qname);

		return NULL;
	}

	forw_packet = kmalloc(sizeof(*forw_packet), GFP_ATOMIC);
	if (!forw_packet)
		goto err;

	if (if_incoming)
		kref_get(&if_incoming->refcount);

	if (if_outgoing)
		kref_get(&if_outgoing->refcount);

	INIT_HLIST_NODE(&forw_packet->list);
	INIT_HLIST_NODE(&forw_packet->cleanup_list);
	forw_packet->skb = skb;
	forw_packet->queue_left = queue_left;
	forw_packet->if_incoming = if_incoming;
	forw_packet->if_outgoing = if_outgoing;
	forw_packet->num_packets = 1;

	return forw_packet;

err:
	if (queue_left)
		atomic_inc(queue_left);

	return NULL;
}

/**
 * batadv_forw_packet_was_stolen() - check whether someone stole this packet
 * @forw_packet: the forwarding packet to check
 *
 * This function checks whether the given forwarding packet was claimed by
 * someone else for free().
 *
 * Return: True if someone stole it, false otherwise.
 */
static bool
batadv_forw_packet_was_stolen(struct batadv_forw_packet *forw_packet)
{
	return !hlist_unhashed(&forw_packet->cleanup_list);
}

/**
 * batadv_forw_packet_steal() - claim a forw_packet for free()
 * @forw_packet: the forwarding packet to steal
 * @lock: a key to the store to steal from (e.g. forw_{bat,bcast}_list_lock)
 *
 * This function tries to steal a specific forw_packet from global
 * visibility for the purpose of getting it for free(). That means
 * the caller is *not* allowed to requeue it afterwards.
 *
 * Return: True if stealing was successful. False if someone else stole it
 * before us.
 */
bool batadv_forw_packet_steal(struct batadv_forw_packet *forw_packet,
			      spinlock_t *lock)
{
	/* did purging routine steal it earlier? */
	spin_lock_bh(lock);
	if (batadv_forw_packet_was_stolen(forw_packet)) {
		spin_unlock_bh(lock);
		return false;
	}

	hlist_del_init(&forw_packet->list);

	/* Just to spot misuse of this function */
	hlist_add_fake(&forw_packet->cleanup_list);

	spin_unlock_bh(lock);
	return true;
}

/**
 * batadv_forw_packet_list_steal() - claim a list of forward packets for free()
 * @forw_list: the to be stolen forward packets
 * @cleanup_list: a backup pointer, to be able to dispose the packet later
 * @hard_iface: the interface to steal forward packets from
 *
 * This function claims responsibility to free any forw_packet queued on the
 * given hard_iface. If hard_iface is NULL forwarding packets on all hard
 * interfaces will be claimed.
 *
 * The packets are being moved from the forw_list to the cleanup_list. This
 * makes it possible for already running threads to notice the claim.
 */
static void
batadv_forw_packet_list_steal(struct hlist_head *forw_list,
			      struct hlist_head *cleanup_list,
			      const struct batadv_hard_iface *hard_iface)
{
	struct batadv_forw_packet *forw_packet;
	struct hlist_node *safe_tmp_node;

	hlist_for_each_entry_safe(forw_packet, safe_tmp_node,
				  forw_list, list) {
		/* if purge_outstanding_packets() was called with an argument
		 * we delete only packets belonging to the given interface
		 */
		if (hard_iface &&
		    forw_packet->if_incoming != hard_iface &&
		    forw_packet->if_outgoing != hard_iface)
			continue;

		hlist_del(&forw_packet->list);
		hlist_add_head(&forw_packet->cleanup_list, cleanup_list);
	}
}

/**
 * batadv_forw_packet_list_free() - free a list of forward packets
 * @head: a list of to be freed forw_packets
 *
 * This function cancels the scheduling of any packet in the provided list,
 * waits for any possibly running packet forwarding thread to finish and
 * finally, safely frees this forward packet.
 *
 * This function might sleep.
 */
static void batadv_forw_packet_list_free(struct hlist_head *head)
{
	struct batadv_forw_packet *forw_packet;
	struct hlist_node *safe_tmp_node;

	hlist_for_each_entry_safe(forw_packet, safe_tmp_node, head,
				  cleanup_list) {
		cancel_delayed_work_sync(&forw_packet->delayed_work);

		hlist_del(&forw_packet->cleanup_list);
		batadv_forw_packet_free(forw_packet, true);
	}
}

/**
 * batadv_forw_packet_queue() - try to queue a forwarding packet
 * @forw_packet: the forwarding packet to queue
 * @lock: a key to the store (e.g. forw_{bat,bcast}_list_lock)
 * @head: the shelve to queue it on (e.g. forw_{bat,bcast}_list)
 * @send_time: timestamp (jiffies) when the packet is to be sent
 *
 * This function tries to (re)queue a forwarding packet. Requeuing
 * is prevented if the according interface is shutting down
 * (e.g. if batadv_forw_packet_list_steal() was called for this
 * packet earlier).
 *
 * Calling batadv_forw_packet_queue() after a call to
 * batadv_forw_packet_steal() is forbidden!
 *
 * Caller needs to ensure that forw_packet->delayed_work was initialized.
 */
static void batadv_forw_packet_queue(struct batadv_forw_packet *forw_packet,
				     spinlock_t *lock, struct hlist_head *head,
				     unsigned long send_time)
{
	spin_lock_bh(lock);

	/* did purging routine steal it from us? */
	if (batadv_forw_packet_was_stolen(forw_packet)) {
		/* If you got it for free() without trouble, then
		 * don't get back into the queue after stealing...
		 */
		WARN_ONCE(hlist_fake(&forw_packet->cleanup_list),
			  "Requeuing after batadv_forw_packet_steal() not allowed!\n");

		spin_unlock_bh(lock);
		return;
	}

	hlist_del_init(&forw_packet->list);
	hlist_add_head(&forw_packet->list, head);

	queue_delayed_work(batadv_event_workqueue,
			   &forw_packet->delayed_work,
			   send_time - jiffies);
	spin_unlock_bh(lock);
}

/**
 * batadv_forw_packet_bcast_queue() - try to queue a broadcast packet
 * @bat_priv: the bat priv with all the mesh interface information
 * @forw_packet: the forwarding packet to queue
 * @send_time: timestamp (jiffies) when the packet is to be sent
 *
 * This function tries to (re)queue a broadcast packet.
 *
 * Caller needs to ensure that forw_packet->delayed_work was initialized.
 */
static void
batadv_forw_packet_bcast_queue(struct batadv_priv *bat_priv,
			       struct batadv_forw_packet *forw_packet,
			       unsigned long send_time)
{
	batadv_forw_packet_queue(forw_packet, &bat_priv->forw_bcast_list_lock,
				 &bat_priv->forw_bcast_list, send_time);
}

/**
 * batadv_forw_packet_ogmv1_queue() - try to queue an OGMv1 packet
 * @bat_priv: the bat priv with all the mesh interface information
 * @forw_packet: the forwarding packet to queue
 * @send_time: timestamp (jiffies) when the packet is to be sent
 *
 * This function tries to (re)queue an OGMv1 packet.
 *
 * Caller needs to ensure that forw_packet->delayed_work was initialized.
 */
void batadv_forw_packet_ogmv1_queue(struct batadv_priv *bat_priv,
				    struct batadv_forw_packet *forw_packet,
				    unsigned long send_time)
{
	batadv_forw_packet_queue(forw_packet, &bat_priv->forw_bat_list_lock,
				 &bat_priv->forw_bat_list, send_time);
}

/**
 * batadv_forw_bcast_packet_to_list() - queue broadcast packet for transmissions
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: broadcast packet to add
 * @delay: number of jiffies to wait before sending
 * @own_packet: true if it is a self-generated broadcast packet
 * @if_in: the interface where the packet was received on
 * @if_out: the outgoing interface to queue on
 *
 * Adds a broadcast packet to the queue and sets up timers. Broadcast packets
 * are sent multiple times to increase probability for being received.
 *
 * This call clones the given skb, hence the caller needs to take into
 * account that the data segment of the original skb might not be
 * modifiable anymore.
 *
 * Return: NETDEV_TX_OK on success and NETDEV_TX_BUSY on errors.
 */
static int batadv_forw_bcast_packet_to_list(struct batadv_priv *bat_priv,
					    struct sk_buff *skb,
					    unsigned long delay,
					    bool own_packet,
					    struct batadv_hard_iface *if_in,
					    struct batadv_hard_iface *if_out)
{
	struct batadv_forw_packet *forw_packet;
	unsigned long send_time = jiffies;
	struct sk_buff *newskb;

	newskb = skb_clone(skb, GFP_ATOMIC);
	if (!newskb)
		goto err;

	forw_packet = batadv_forw_packet_alloc(if_in, if_out,
					       &bat_priv->bcast_queue_left,
					       bat_priv, newskb);
	if (!forw_packet)
		goto err_packet_free;

	forw_packet->own = own_packet;

	INIT_DELAYED_WORK(&forw_packet->delayed_work,
			  batadv_send_outstanding_bcast_packet);

	send_time += delay ? delay : msecs_to_jiffies(5);

	batadv_forw_packet_bcast_queue(bat_priv, forw_packet, send_time);
	return NETDEV_TX_OK;

err_packet_free:
	kfree_skb(newskb);
err:
	return NETDEV_TX_BUSY;
}

/**
 * batadv_forw_bcast_packet_if() - forward and queue a broadcast packet
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: broadcast packet to add
 * @delay: number of jiffies to wait before sending
 * @own_packet: true if it is a self-generated broadcast packet
 * @if_in: the interface where the packet was received on
 * @if_out: the outgoing interface to forward to
 *
 * Transmits a broadcast packet on the specified interface either immediately
 * or if a delay is given after that. Furthermore, queues additional
 * retransmissions if this interface is a wireless one.
 *
 * This call clones the given skb, hence the caller needs to take into
 * account that the data segment of the original skb might not be
 * modifiable anymore.
 *
 * Return: NETDEV_TX_OK on success and NETDEV_TX_BUSY on errors.
 */
static int batadv_forw_bcast_packet_if(struct batadv_priv *bat_priv,
				       struct sk_buff *skb,
				       unsigned long delay,
				       bool own_packet,
				       struct batadv_hard_iface *if_in,
				       struct batadv_hard_iface *if_out)
{
	unsigned int num_bcasts = if_out->num_bcasts;
	struct sk_buff *newskb;
	int ret = NETDEV_TX_OK;

	if (!delay) {
		newskb = skb_clone(skb, GFP_ATOMIC);
		if (!newskb)
			return NETDEV_TX_BUSY;

		batadv_send_broadcast_skb(newskb, if_out);
		num_bcasts--;
	}

	/* delayed broadcast or rebroadcasts? */
	if (num_bcasts >= 1) {
		BATADV_SKB_CB(skb)->num_bcasts = num_bcasts;

		ret = batadv_forw_bcast_packet_to_list(bat_priv, skb, delay,
						       own_packet, if_in,
						       if_out);
	}

	return ret;
}

/**
 * batadv_send_no_broadcast() - check whether (re)broadcast is necessary
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: broadcast packet to check
 * @own_packet: true if it is a self-generated broadcast packet
 * @if_out: the outgoing interface checked and considered for (re)broadcast
 *
 * Return: False if a packet needs to be (re)broadcasted on the given
 * interface, true otherwise.
 */
static bool batadv_send_no_broadcast(struct batadv_priv *bat_priv,
				     struct sk_buff *skb, bool own_packet,
				     struct batadv_hard_iface *if_out)
{
	struct batadv_hardif_neigh_node *neigh_node = NULL;
	struct batadv_bcast_packet *bcast_packet;
	u8 *orig_neigh;
	u8 *neigh_addr;
	char *type;
	int ret;

	if (!own_packet) {
		neigh_addr = eth_hdr(skb)->h_source;
		neigh_node = batadv_hardif_neigh_get(if_out,
						     neigh_addr);
	}

	bcast_packet = (struct batadv_bcast_packet *)skb->data;
	orig_neigh = neigh_node ? neigh_node->orig : NULL;

	ret = batadv_hardif_no_broadcast(if_out, bcast_packet->orig,
					 orig_neigh);

	batadv_hardif_neigh_put(neigh_node);

	/* ok, may broadcast */
	if (!ret)
		return false;

	/* no broadcast */
	switch (ret) {
	case BATADV_HARDIF_BCAST_NORECIPIENT:
		type = "no neighbor";
		break;
	case BATADV_HARDIF_BCAST_DUPFWD:
		type = "single neighbor is source";
		break;
	case BATADV_HARDIF_BCAST_DUPORIG:
		type = "single neighbor is originator";
		break;
	default:
		type = "unknown";
	}

	batadv_dbg(BATADV_DBG_BATMAN, bat_priv,
		   "BCAST packet from orig %pM on %s suppressed: %s\n",
		   bcast_packet->orig,
		   if_out->net_dev->name, type);

	return true;
}

/**
 * __batadv_forw_bcast_packet() - forward and queue a broadcast packet
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: broadcast packet to add
 * @delay: number of jiffies to wait before sending
 * @own_packet: true if it is a self-generated broadcast packet
 *
 * Transmits a broadcast packet either immediately or if a delay is given
 * after that. Furthermore, queues additional retransmissions on wireless
 * interfaces.
 *
 * This call clones the given skb, hence the caller needs to take into
 * account that the data segment of the given skb might not be
 * modifiable anymore.
 *
 * Return: NETDEV_TX_OK on success and NETDEV_TX_BUSY on errors.
 */
static int __batadv_forw_bcast_packet(struct batadv_priv *bat_priv,
				      struct sk_buff *skb,
				      unsigned long delay,
				      bool own_packet)
{
	struct batadv_hard_iface *hard_iface;
	struct batadv_hard_iface *primary_if;
	int ret = NETDEV_TX_OK;

	primary_if = batadv_primary_if_get_selected(bat_priv);
	if (!primary_if)
		return NETDEV_TX_BUSY;

	rcu_read_lock();
	list_for_each_entry_rcu(hard_iface, &batadv_hardif_list, list) {
		if (hard_iface->mesh_iface != bat_priv->mesh_iface)
			continue;

		if (!kref_get_unless_zero(&hard_iface->refcount))
			continue;

		if (batadv_send_no_broadcast(bat_priv, skb, own_packet,
					     hard_iface)) {
			batadv_hardif_put(hard_iface);
			continue;
		}

		ret = batadv_forw_bcast_packet_if(bat_priv, skb, delay,
						  own_packet, primary_if,
						  hard_iface);
		batadv_hardif_put(hard_iface);

		if (ret == NETDEV_TX_BUSY)
			break;
	}
	rcu_read_unlock();

	batadv_hardif_put(primary_if);
	return ret;
}

/**
 * batadv_forw_bcast_packet() - forward and queue a broadcast packet
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: broadcast packet to add
 * @delay: number of jiffies to wait before sending
 * @own_packet: true if it is a self-generated broadcast packet
 *
 * Transmits a broadcast packet either immediately or if a delay is given
 * after that. Furthermore, queues additional retransmissions on wireless
 * interfaces.
 *
 * Return: NETDEV_TX_OK on success and NETDEV_TX_BUSY on errors.
 */
int batadv_forw_bcast_packet(struct batadv_priv *bat_priv,
			     struct sk_buff *skb,
			     unsigned long delay,
			     bool own_packet)
{
	return __batadv_forw_bcast_packet(bat_priv, skb, delay, own_packet);
}

/**
 * batadv_send_bcast_packet() - send and queue a broadcast packet
 * @bat_priv: the bat priv with all the mesh interface information
 * @skb: broadcast packet to add
 * @delay: number of jiffies to wait before sending
 * @own_packet: true if it is a self-generated broadcast packet
 *
 * Transmits a broadcast packet either immediately or if a delay is given
 * after that. Furthermore, queues additional retransmissions on wireless
 * interfaces.
 *
 * Consumes the provided skb.
 */
void batadv_send_bcast_packet(struct batadv_priv *bat_priv,
			      struct sk_buff *skb,
			      unsigned long delay,
			      bool own_packet)
{
	__batadv_forw_bcast_packet(bat_priv, skb, delay, own_packet);
	consume_skb(skb);
}

/**
 * batadv_forw_packet_bcasts_left() - check if a retransmission is necessary
 * @forw_packet: the forwarding packet to check
 *
 * Checks whether the given packet has any (re)transmissions left.
 *
 * Return: True if (re)transmissions are left, false otherwise.
 */
static bool
batadv_forw_packet_bcasts_left(struct batadv_forw_packet *forw_packet)
{
	return BATADV_SKB_CB(forw_packet->skb)->num_bcasts;
}

/**
 * batadv_forw_packet_bcasts_dec() - decrement retransmission counter of a
 *  packet
 * @forw_packet: the packet to decrease the counter for
 */
static void
batadv_forw_packet_bcasts_dec(struct batadv_forw_packet *forw_packet)
{
	BATADV_SKB_CB(forw_packet->skb)->num_bcasts--;
}

/**
 * batadv_forw_packet_is_rebroadcast() - check packet for previous transmissions
 * @forw_packet: the packet to check
 *
 * Return: True if this packet was transmitted before, false otherwise.
 */
bool batadv_forw_packet_is_rebroadcast(struct batadv_forw_packet *forw_packet)
{
	unsigned char num_bcasts = BATADV_SKB_CB(forw_packet->skb)->num_bcasts;

	return num_bcasts != forw_packet->if_outgoing->num_bcasts;
}

/**
 * batadv_send_outstanding_bcast_packet() - transmit a queued broadcast packet
 * @work: work queue item
 *
 * Transmits a queued broadcast packet and if necessary reschedules it.
 */
static void batadv_send_outstanding_bcast_packet(struct work_struct *work)
{
	unsigned long send_time = jiffies + msecs_to_jiffies(5);
	struct batadv_forw_packet *forw_packet;
	struct delayed_work *delayed_work;
	struct batadv_priv *bat_priv;
	struct sk_buff *skb1;
	bool dropped = false;

	delayed_work = to_delayed_work(work);
	forw_packet = container_of(delayed_work, struct batadv_forw_packet,
				   delayed_work);
	bat_priv = netdev_priv(forw_packet->if_incoming->mesh_iface);

	if (atomic_read(&bat_priv->mesh_state) == BATADV_MESH_DEACTIVATING) {
		dropped = true;
		goto out;
	}

	if (batadv_dat_drop_broadcast_packet(bat_priv, forw_packet)) {
		dropped = true;
		goto out;
	}

	/* send a copy of the saved skb */
	skb1 = skb_clone(forw_packet->skb, GFP_ATOMIC);
	if (!skb1)
		goto out;

	batadv_send_broadcast_skb(skb1, forw_packet->if_outgoing);
	batadv_forw_packet_bcasts_dec(forw_packet);

	if (batadv_forw_packet_bcasts_left(forw_packet)) {
		batadv_forw_packet_bcast_queue(bat_priv, forw_packet,
					       send_time);
		return;
	}

out:
	/* do we get something for free()? */
	if (batadv_forw_packet_steal(forw_packet,
				     &bat_priv->forw_bcast_list_lock))
		batadv_forw_packet_free(forw_packet, dropped);
}

/**
 * batadv_purge_outstanding_packets() - stop/purge scheduled bcast/OGMv1 packets
 * @bat_priv: the bat priv with all the mesh interface information
 * @hard_iface: the hard interface to cancel and purge bcast/ogm packets on
 *
 * This method cancels and purges any broadcast and OGMv1 packet on the given
 * hard_iface. If hard_iface is NULL, broadcast and OGMv1 packets on all hard
 * interfaces will be canceled and purged.
 *
 * This function might sleep.
 */
void
batadv_purge_outstanding_packets(struct batadv_priv *bat_priv,
				 const struct batadv_hard_iface *hard_iface)
{
	struct hlist_head head = HLIST_HEAD_INIT;

	if (hard_iface)
		batadv_dbg(BATADV_DBG_BATMAN, bat_priv,
			   "%s(): %s\n",
			   __func__, hard_iface->net_dev->name);
	else
		batadv_dbg(BATADV_DBG_BATMAN, bat_priv,
			   "%s()\n", __func__);

	/* claim bcast list for free() */
	spin_lock_bh(&bat_priv->forw_bcast_list_lock);
	batadv_forw_packet_list_steal(&bat_priv->forw_bcast_list, &head,
				      hard_iface);
	spin_unlock_bh(&bat_priv->forw_bcast_list_lock);

	/* claim batman packet list for free() */
	spin_lock_bh(&bat_priv->forw_bat_list_lock);
	batadv_forw_packet_list_steal(&bat_priv->forw_bat_list, &head,
				      hard_iface);
	spin_unlock_bh(&bat_priv->forw_bat_list_lock);

	/* then cancel or wait for packet workers to finish and free */
	batadv_forw_packet_list_free(&head);
}
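/*
 * Editor's illustration, not part of send.c: a stand-alone model of the
 * rebroadcast bookkeeping above. An interface advertises num_bcasts
 * transmissions per broadcast; one copy may go out immediately (the
 * !delay path) and the remaining count is carried in a per-skb control
 * block, decremented on every delayed (re)transmission until none are
 * left. The struct and loop are simplified stand-ins for the kernel types.
 */
#include <stdbool.h>
#include <stdio.h>

struct fake_skb_cb {
	unsigned char num_bcasts;
};

/* mirrors batadv_forw_packet_bcasts_left() */
static bool bcasts_left(const struct fake_skb_cb *cb)
{
	return cb->num_bcasts;
}

int main(void)
{
	unsigned int if_num_bcasts = 3;	/* e.g. a wireless interface */
	struct fake_skb_cb cb;
	unsigned int sent = 0;

	/* immediate transmission consumes one slot, as in the !delay path */
	sent++;
	cb.num_bcasts = if_num_bcasts - 1;

	/* delayed worker: send, decrement, requeue while any are left */
	while (bcasts_left(&cb)) {
		sent++;
		cb.num_bcasts--;	/* batadv_forw_packet_bcasts_dec() */
	}

	printf("transmitted %u of %u copies\n", sent, if_num_bcasts); /* 3 of 3 */
	return 0;
}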
/* SPDX-License-Identifier: GPL-2.0 */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM filemap

#if !defined(_TRACE_FILEMAP_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_FILEMAP_H

#include <linux/types.h>
#include <linux/tracepoint.h>
#include <linux/mm.h>
#include <linux/memcontrol.h>
#include <linux/device.h>
#include <linux/kdev_t.h>
#include <linux/errseq.h>

DECLARE_EVENT_CLASS(mm_filemap_op_page_cache,

	TP_PROTO(struct folio *folio),

	TP_ARGS(folio),

	TP_STRUCT__entry(
		__field(unsigned long, pfn)
		__field(unsigned long, i_ino)
		__field(unsigned long, index)
		__field(dev_t, s_dev)
		__field(unsigned char, order)
	),

	TP_fast_assign(
		__entry->pfn = folio_pfn(folio);
		__entry->i_ino = folio->mapping->host->i_ino;
		__entry->index = folio->index;
		if (folio->mapping->host->i_sb)
			__entry->s_dev = folio->mapping->host->i_sb->s_dev;
		else
			__entry->s_dev = folio->mapping->host->i_rdev;
		__entry->order = folio_order(folio);
	),

	TP_printk("dev %d:%d ino %lx pfn=0x%lx ofs=%lu order=%u",
		MAJOR(__entry->s_dev), MINOR(__entry->s_dev),
		__entry->i_ino,
		__entry->pfn,
		__entry->index << PAGE_SHIFT,
		__entry->order)
);

DEFINE_EVENT(mm_filemap_op_page_cache, mm_filemap_delete_from_page_cache,
	TP_PROTO(struct folio *folio),
	TP_ARGS(folio)
	);

DEFINE_EVENT(mm_filemap_op_page_cache, mm_filemap_add_to_page_cache,
	TP_PROTO(struct folio *folio),
	TP_ARGS(folio)
	);

DECLARE_EVENT_CLASS(mm_filemap_op_page_cache_range,

	TP_PROTO(
		struct address_space *mapping,
		pgoff_t index,
		pgoff_t last_index
	),

	TP_ARGS(mapping, index, last_index),

	TP_STRUCT__entry(
		__field(unsigned long, i_ino)
		__field(dev_t, s_dev)
		__field(unsigned long, index)
		__field(unsigned long, last_index)
	),

	TP_fast_assign(
		__entry->i_ino = mapping->host->i_ino;
		if (mapping->host->i_sb)
			__entry->s_dev = mapping->host->i_sb->s_dev;
		else
			__entry->s_dev = mapping->host->i_rdev;
		__entry->index = index;
		__entry->last_index = last_index;
	),

	TP_printk(
		"dev=%d:%d ino=%lx ofs=%lld-%lld",
		MAJOR(__entry->s_dev),
		MINOR(__entry->s_dev),
		__entry->i_ino,
		((loff_t)__entry->index) << PAGE_SHIFT,
		((((loff_t)__entry->last_index + 1) << PAGE_SHIFT) - 1)
	)
);

DEFINE_EVENT(mm_filemap_op_page_cache_range, mm_filemap_get_pages,
	TP_PROTO(
		struct address_space *mapping,
		pgoff_t index,
		pgoff_t last_index
	),
	TP_ARGS(mapping, index, last_index)
);

DEFINE_EVENT(mm_filemap_op_page_cache_range, mm_filemap_map_pages,
	TP_PROTO(
		struct address_space *mapping,
		pgoff_t index,
		pgoff_t last_index
	),
	TP_ARGS(mapping, index, last_index)
);
mapping->host->i_rdev; __entry->index = index; ), TP_printk( "dev=%d:%d ino=%lx ofs=%lld", MAJOR(__entry->s_dev), MINOR(__entry->s_dev), __entry->i_ino, ((loff_t)__entry->index) << PAGE_SHIFT ) ); TRACE_EVENT(filemap_set_wb_err, TP_PROTO(struct address_space *mapping, errseq_t eseq), TP_ARGS(mapping, eseq), TP_STRUCT__entry( __field(unsigned long, i_ino) __field(dev_t, s_dev) __field(errseq_t, errseq) ), TP_fast_assign( __entry->i_ino = mapping->host->i_ino; __entry->errseq = eseq; if (mapping->host->i_sb) __entry->s_dev = mapping->host->i_sb->s_dev; else __entry->s_dev = mapping->host->i_rdev; ), TP_printk("dev=%d:%d ino=0x%lx errseq=0x%x", MAJOR(__entry->s_dev), MINOR(__entry->s_dev), __entry->i_ino, __entry->errseq) ); TRACE_EVENT(file_check_and_advance_wb_err, TP_PROTO(struct file *file, errseq_t old), TP_ARGS(file, old), TP_STRUCT__entry( __field(struct file *, file) __field(unsigned long, i_ino) __field(dev_t, s_dev) __field(errseq_t, old) __field(errseq_t, new) ), TP_fast_assign( __entry->file = file; __entry->i_ino = file->f_mapping->host->i_ino; if (file->f_mapping->host->i_sb) __entry->s_dev = file->f_mapping->host->i_sb->s_dev; else __entry->s_dev = file->f_mapping->host->i_rdev; __entry->old = old; __entry->new = file->f_wb_err; ), TP_printk("file=%p dev=%d:%d ino=0x%lx old=0x%x new=0x%x", __entry->file, MAJOR(__entry->s_dev), MINOR(__entry->s_dev), __entry->i_ino, __entry->old, __entry->new) ); #endif /* _TRACE_FILEMAP_H */ /* This part must be outside protection */ #include <trace/define_trace.h> |
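/*
 * Usage sketch (an editor's illustration, not part of the kernel sources):
 * the events declared above are exposed through tracefs at runtime. A
 * minimal standalone userspace program, assuming tracefs is mounted at
 * /sys/kernel/tracing and the "filemap" event group is available on the
 * running kernel:
 */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	/* Enable one of the filemap events declared above. */
	int fd = open("/sys/kernel/tracing/events/filemap/"
		      "mm_filemap_add_to_page_cache/enable", O_WRONLY);

	if (fd < 0)
		return 1;
	if (write(fd, "1", 1) != 1)
		return 1;
	close(fd);

	/* Stream the formatted events, e.g. "dev 8:1 ino 2 pfn=... order=0". */
	execlp("cat", "cat", "/sys/kernel/tracing/trace_pipe", (char *)NULL);
	return 1;
}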
// SPDX-License-Identifier: GPL-2.0
/*
 *  fs/timerfd.c
 *
 *  Copyright (C) 2007  Davide Libenzi <davidel@xmailserver.org>
 *
 *
 *  Thanks to Thomas Gleixner for code reviews and useful comments.
 *
 */

#include <linux/alarmtimer.h>
#include <linux/file.h>
#include <linux/poll.h>
#include <linux/init.h>
#include <linux/fs.h>
#include <linux/sched.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/time.h>
#include <linux/hrtimer.h>
#include <linux/anon_inodes.h>
#include <linux/timerfd.h>
#include <linux/syscalls.h>
#include <linux/compat.h>
#include <linux/rcupdate.h>
#include <linux/time_namespace.h>

struct timerfd_ctx {
	union {
		struct hrtimer tmr;
		struct alarm alarm;
	} t;
	ktime_t tintv;
	ktime_t moffs;
	wait_queue_head_t wqh;
	u64 ticks;
	int clockid;
	short unsigned expired;
	short unsigned settime_flags;	/* to show in fdinfo */
	struct rcu_head rcu;
	struct list_head clist;
	spinlock_t cancel_lock;
	bool might_cancel;
};

static LIST_HEAD(cancel_list);
static DEFINE_SPINLOCK(cancel_lock);

static inline bool isalarm(struct timerfd_ctx *ctx)
{
	return ctx->clockid == CLOCK_REALTIME_ALARM ||
		ctx->clockid == CLOCK_BOOTTIME_ALARM;
}

/*
 * This gets called when the timer event triggers. We set the "expired"
 * flag, but we do not re-arm the timer (in case it's necessary,
 * tintv != 0) until the timer is accessed.
 */
static void timerfd_triggered(struct timerfd_ctx *ctx)
{
	unsigned long flags;

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	ctx->expired = 1;
	ctx->ticks++;
	wake_up_locked_poll(&ctx->wqh, EPOLLIN);
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
}

static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr)
{
	struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx,
					       t.tmr);
	timerfd_triggered(ctx);
	return HRTIMER_NORESTART;
}

static void timerfd_alarmproc(struct alarm *alarm, ktime_t now)
{
	struct timerfd_ctx *ctx = container_of(alarm, struct timerfd_ctx,
					       t.alarm);
	timerfd_triggered(ctx);
}

/*
 * Called when the clock was set to cancel the timers in the cancel
 * list. This will wake up processes waiting on these timers. The
 * wake-up requires ctx->ticks to be non zero, therefore we increment
 * it before calling wake_up_locked().
 */
void timerfd_clock_was_set(void)
{
	ktime_t moffs = ktime_mono_to_real(0);
	struct timerfd_ctx *ctx;
	unsigned long flags;

	rcu_read_lock();
	list_for_each_entry_rcu(ctx, &cancel_list, clist) {
		if (!ctx->might_cancel)
			continue;
		spin_lock_irqsave(&ctx->wqh.lock, flags);
		if (ctx->moffs != moffs) {
			ctx->moffs = KTIME_MAX;
			ctx->ticks++;
			wake_up_locked_poll(&ctx->wqh, EPOLLIN);
		}
		spin_unlock_irqrestore(&ctx->wqh.lock, flags);
	}
	rcu_read_unlock();
}

static void timerfd_resume_work(struct work_struct *work)
{
	timerfd_clock_was_set();
}

static DECLARE_WORK(timerfd_work, timerfd_resume_work);

/*
 * Invoked from timekeeping_resume(). Defer the actual update to work so
 * timerfd_clock_was_set() runs in task context.
 */
void timerfd_resume(void)
{
	schedule_work(&timerfd_work);
}

static void __timerfd_remove_cancel(struct timerfd_ctx *ctx)
{
	if (ctx->might_cancel) {
		ctx->might_cancel = false;
		spin_lock(&cancel_lock);
		list_del_rcu(&ctx->clist);
		spin_unlock(&cancel_lock);
	}
}

static void timerfd_remove_cancel(struct timerfd_ctx *ctx)
{
	spin_lock(&ctx->cancel_lock);
	__timerfd_remove_cancel(ctx);
	spin_unlock(&ctx->cancel_lock);
}

static bool timerfd_canceled(struct timerfd_ctx *ctx)
{
	if (!ctx->might_cancel || ctx->moffs != KTIME_MAX)
		return false;
	ctx->moffs = ktime_mono_to_real(0);
	return true;
}

static void timerfd_setup_cancel(struct timerfd_ctx *ctx, int flags)
{
	spin_lock(&ctx->cancel_lock);
	if ((ctx->clockid == CLOCK_REALTIME ||
	     ctx->clockid == CLOCK_REALTIME_ALARM) &&
	    (flags & TFD_TIMER_ABSTIME) && (flags & TFD_TIMER_CANCEL_ON_SET)) {
		if (!ctx->might_cancel) {
			ctx->might_cancel = true;
			spin_lock(&cancel_lock);
			list_add_rcu(&ctx->clist, &cancel_list);
			spin_unlock(&cancel_lock);
		}
	} else {
		__timerfd_remove_cancel(ctx);
	}
	spin_unlock(&ctx->cancel_lock);
}

static ktime_t timerfd_get_remaining(struct timerfd_ctx *ctx)
{
	ktime_t remaining;

	if (isalarm(ctx))
		remaining = alarm_expires_remaining(&ctx->t.alarm);
	else
		remaining = hrtimer_expires_remaining_adjusted(&ctx->t.tmr);

	return remaining < 0 ? 0: remaining;
}

static int timerfd_setup(struct timerfd_ctx *ctx, int flags,
			 const struct itimerspec64 *ktmr)
{
	enum hrtimer_mode htmode;
	ktime_t texp;
	int clockid = ctx->clockid;

	htmode = (flags & TFD_TIMER_ABSTIME) ?
		HRTIMER_MODE_ABS: HRTIMER_MODE_REL;

	texp = timespec64_to_ktime(ktmr->it_value);
	ctx->expired = 0;
	ctx->ticks = 0;
	ctx->tintv = timespec64_to_ktime(ktmr->it_interval);

	if (isalarm(ctx)) {
		alarm_init(&ctx->t.alarm,
			   ctx->clockid == CLOCK_REALTIME_ALARM ?
			   ALARM_REALTIME : ALARM_BOOTTIME,
			   timerfd_alarmproc);
	} else {
		hrtimer_setup(&ctx->t.tmr, timerfd_tmrproc, clockid, htmode);
		hrtimer_set_expires(&ctx->t.tmr, texp);
	}

	if (texp != 0) {
		if (flags & TFD_TIMER_ABSTIME)
			texp = timens_ktime_to_host(clockid, texp);
		if (isalarm(ctx)) {
			if (flags & TFD_TIMER_ABSTIME)
				alarm_start(&ctx->t.alarm, texp);
			else
				alarm_start_relative(&ctx->t.alarm, texp);
		} else {
			hrtimer_start(&ctx->t.tmr, texp, htmode);
		}

		if (timerfd_canceled(ctx))
			return -ECANCELED;
	}

	ctx->settime_flags = flags & TFD_SETTIME_FLAGS;
	return 0;
}

static int timerfd_release(struct inode *inode, struct file *file)
{
	struct timerfd_ctx *ctx = file->private_data;

	timerfd_remove_cancel(ctx);

	if (isalarm(ctx))
		alarm_cancel(&ctx->t.alarm);
	else
		hrtimer_cancel(&ctx->t.tmr);
	kfree_rcu(ctx, rcu);
	return 0;
}

static __poll_t timerfd_poll(struct file *file, poll_table *wait)
{
	struct timerfd_ctx *ctx = file->private_data;
	__poll_t events = 0;
	unsigned long flags;

	poll_wait(file, &ctx->wqh, wait);

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	if (ctx->ticks)
		events |= EPOLLIN;
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);

	return events;
}

static ssize_t timerfd_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
	struct file *file = iocb->ki_filp;
	struct timerfd_ctx *ctx = file->private_data;
	ssize_t res;
	u64 ticks = 0;

	if (iov_iter_count(to) < sizeof(ticks))
		return -EINVAL;

	spin_lock_irq(&ctx->wqh.lock);
	if (file->f_flags & O_NONBLOCK || iocb->ki_flags & IOCB_NOWAIT)
		res = -EAGAIN;
	else
		res = wait_event_interruptible_locked_irq(ctx->wqh, ctx->ticks);

	/*
	 * If clock has changed, we do not care about the
	 * ticks and we do not rearm the timer. Userspace must
	 * reevaluate anyway.
	 */
	if (timerfd_canceled(ctx)) {
		ctx->ticks = 0;
		ctx->expired = 0;
		res = -ECANCELED;
	}

	if (ctx->ticks) {
		ticks = ctx->ticks;

		if (ctx->expired && ctx->tintv) {
			/*
			 * If tintv != 0, this is a periodic timer that
			 * needs to be re-armed. We avoid doing it in the timer
			 * callback to avoid DoS attacks specifying a very
			 * short timer period.
			 */
			if (isalarm(ctx)) {
				ticks += alarm_forward_now(
					&ctx->t.alarm, ctx->tintv) - 1;
				alarm_restart(&ctx->t.alarm);
			} else {
				ticks += hrtimer_forward_now(&ctx->t.tmr,
							     ctx->tintv) - 1;
				hrtimer_restart(&ctx->t.tmr);
			}
		}
		ctx->expired = 0;
		ctx->ticks = 0;
	}
	spin_unlock_irq(&ctx->wqh.lock);
	if (ticks) {
		res = copy_to_iter(&ticks, sizeof(ticks), to);
		if (!res)
			res = -EFAULT;
	}
	return res;
}

#ifdef CONFIG_PROC_FS
static void timerfd_show(struct seq_file *m, struct file *file)
{
	struct timerfd_ctx *ctx = file->private_data;
	struct timespec64 value, interval;

	spin_lock_irq(&ctx->wqh.lock);
	value = ktime_to_timespec64(timerfd_get_remaining(ctx));
	interval = ktime_to_timespec64(ctx->tintv);
	spin_unlock_irq(&ctx->wqh.lock);

	seq_printf(m,
		   "clockid: %d\n"
		   "ticks: %llu\n"
		   "settime flags: 0%o\n"
		   "it_value: (%llu, %llu)\n"
		   "it_interval: (%llu, %llu)\n",
		   ctx->clockid,
		   (unsigned long long)ctx->ticks,
		   ctx->settime_flags,
		   (unsigned long long)value.tv_sec,
		   (unsigned long long)value.tv_nsec,
		   (unsigned long long)interval.tv_sec,
		   (unsigned long long)interval.tv_nsec);
}
#else
#define timerfd_show NULL
#endif

#ifdef CONFIG_CHECKPOINT_RESTORE
static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
	struct timerfd_ctx *ctx = file->private_data;
	int ret = 0;

	switch (cmd) {
	case TFD_IOC_SET_TICKS: {
		u64 ticks;

		if (copy_from_user(&ticks, (u64 __user *)arg, sizeof(ticks)))
			return -EFAULT;
		if (!ticks)
			return -EINVAL;

		spin_lock_irq(&ctx->wqh.lock);
		if (!timerfd_canceled(ctx)) {
			ctx->ticks = ticks;
			wake_up_locked_poll(&ctx->wqh, EPOLLIN);
		} else
			ret = -ECANCELED;
		spin_unlock_irq(&ctx->wqh.lock);
		break;
	}
	default:
		ret = -ENOTTY;
		break;
	}

	return ret;
}
#else
#define timerfd_ioctl NULL
#endif

static const struct file_operations timerfd_fops = {
	.release	= timerfd_release,
	.poll		= timerfd_poll,
	.read_iter	= timerfd_read_iter,
	.llseek		= noop_llseek,
	.show_fdinfo	= timerfd_show,
	.unlocked_ioctl	= timerfd_ioctl,
};

SYSCALL_DEFINE2(timerfd_create, int, clockid, int, flags)
{
	int ufd;
	struct timerfd_ctx *ctx;
	struct file *file;

	/* Check the TFD_* constants for consistency.  */
	BUILD_BUG_ON(TFD_CLOEXEC != O_CLOEXEC);
	BUILD_BUG_ON(TFD_NONBLOCK != O_NONBLOCK);

	if ((flags & ~TFD_CREATE_FLAGS) ||
	    (clockid != CLOCK_MONOTONIC &&
	     clockid != CLOCK_REALTIME &&
	     clockid != CLOCK_REALTIME_ALARM &&
	     clockid != CLOCK_BOOTTIME &&
	     clockid != CLOCK_BOOTTIME_ALARM))
		return -EINVAL;

	if ((clockid == CLOCK_REALTIME_ALARM ||
	     clockid == CLOCK_BOOTTIME_ALARM) &&
	    !capable(CAP_WAKE_ALARM))
		return -EPERM;

	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
	if (!ctx)
		return -ENOMEM;

	init_waitqueue_head(&ctx->wqh);
	spin_lock_init(&ctx->cancel_lock);
	ctx->clockid = clockid;

	if (isalarm(ctx))
		alarm_init(&ctx->t.alarm,
			   ctx->clockid == CLOCK_REALTIME_ALARM ?
			   ALARM_REALTIME : ALARM_BOOTTIME,
			   timerfd_alarmproc);
	else
		hrtimer_setup(&ctx->t.tmr, timerfd_tmrproc, clockid,
			      HRTIMER_MODE_ABS);

	ctx->moffs = ktime_mono_to_real(0);

	ufd = get_unused_fd_flags(flags & TFD_SHARED_FCNTL_FLAGS);
	if (ufd < 0) {
		kfree(ctx);
		return ufd;
	}

	file = anon_inode_getfile_fmode("[timerfd]", &timerfd_fops, ctx,
					O_RDWR | (flags & TFD_SHARED_FCNTL_FLAGS),
					FMODE_NOWAIT);
	if (IS_ERR(file)) {
		put_unused_fd(ufd);
		kfree(ctx);
		return PTR_ERR(file);
	}
	fd_install(ufd, file);
	return ufd;
}

static int do_timerfd_settime(int ufd, int flags,
			      const struct itimerspec64 *new,
			      struct itimerspec64 *old)
{
	struct timerfd_ctx *ctx;
	int ret;

	if ((flags & ~TFD_SETTIME_FLAGS) ||
	    !itimerspec64_valid(new))
		return -EINVAL;

	CLASS(fd, f)(ufd);
	if (fd_empty(f))
		return -EBADF;

	if (fd_file(f)->f_op != &timerfd_fops)
		return -EINVAL;

	ctx = fd_file(f)->private_data;

	if (isalarm(ctx) && !capable(CAP_WAKE_ALARM))
		return -EPERM;

	timerfd_setup_cancel(ctx, flags);

	/*
	 * We need to stop the existing timer before reprogramming
	 * it to the new values.
	 */
	for (;;) {
		spin_lock_irq(&ctx->wqh.lock);

		if (isalarm(ctx)) {
			if (alarm_try_to_cancel(&ctx->t.alarm) >= 0)
				break;
		} else {
			if (hrtimer_try_to_cancel(&ctx->t.tmr) >= 0)
				break;
		}
		spin_unlock_irq(&ctx->wqh.lock);

		if (isalarm(ctx))
			hrtimer_cancel_wait_running(&ctx->t.alarm.timer);
		else
			hrtimer_cancel_wait_running(&ctx->t.tmr);
	}

	/*
	 * If the timer is expired and it's periodic, we need to advance it
	 * because the caller may want to know the previous expiration time.
	 * We do not update "ticks" and "expired" since the timer will be
	 * re-programmed again in the following timerfd_setup() call.
	 */
	if (ctx->expired && ctx->tintv) {
		if (isalarm(ctx))
			alarm_forward_now(&ctx->t.alarm, ctx->tintv);
		else
			hrtimer_forward_now(&ctx->t.tmr, ctx->tintv);
	}

	old->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx));
	old->it_interval = ktime_to_timespec64(ctx->tintv);

	/*
	 * Re-program the timer to the new value ...
	 */
	ret = timerfd_setup(ctx, flags, new);

	spin_unlock_irq(&ctx->wqh.lock);
	return ret;
}

static int do_timerfd_gettime(int ufd, struct itimerspec64 *t)
{
	struct timerfd_ctx *ctx;

	CLASS(fd, f)(ufd);
	if (fd_empty(f))
		return -EBADF;

	if (fd_file(f)->f_op != &timerfd_fops)
		return -EINVAL;

	ctx = fd_file(f)->private_data;

	spin_lock_irq(&ctx->wqh.lock);
	if (ctx->expired && ctx->tintv) {
		ctx->expired = 0;

		if (isalarm(ctx)) {
			ctx->ticks +=
				alarm_forward_now(
					&ctx->t.alarm, ctx->tintv) - 1;
			alarm_restart(&ctx->t.alarm);
		} else {
			ctx->ticks +=
				hrtimer_forward_now(&ctx->t.tmr, ctx->tintv)
				- 1;
			hrtimer_restart(&ctx->t.tmr);
		}
	}
	t->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx));
	t->it_interval = ktime_to_timespec64(ctx->tintv);
	spin_unlock_irq(&ctx->wqh.lock);
	return 0;
}

SYSCALL_DEFINE4(timerfd_settime, int, ufd, int, flags,
		const struct __kernel_itimerspec __user *, utmr,
		struct __kernel_itimerspec __user *, otmr)
{
	struct itimerspec64 new, old;
	int ret;

	if (get_itimerspec64(&new, utmr))
		return -EFAULT;
	ret = do_timerfd_settime(ufd, flags, &new, &old);
	if (ret)
		return ret;
	if (otmr && put_itimerspec64(&old, otmr))
		return -EFAULT;

	return ret;
}

SYSCALL_DEFINE2(timerfd_gettime, int, ufd, struct __kernel_itimerspec __user *, otmr)
{
	struct itimerspec64 kotmr;
	int ret = do_timerfd_gettime(ufd, &kotmr);
	if (ret)
		return ret;
	return put_itimerspec64(&kotmr, otmr) ? -EFAULT : 0;
}

#ifdef CONFIG_COMPAT_32BIT_TIME
SYSCALL_DEFINE4(timerfd_settime32, int, ufd, int, flags,
		const struct old_itimerspec32 __user *, utmr,
		struct old_itimerspec32 __user *, otmr)
{
	struct itimerspec64 new, old;
	int ret;

	if (get_old_itimerspec32(&new, utmr))
		return -EFAULT;
	ret = do_timerfd_settime(ufd, flags, &new, &old);
	if (ret)
		return ret;
	if (otmr && put_old_itimerspec32(&old, otmr))
		return -EFAULT;
	return ret;
}

SYSCALL_DEFINE2(timerfd_gettime32, int, ufd,
		struct old_itimerspec32 __user *, otmr)
{
	struct itimerspec64 kotmr;
	int ret = do_timerfd_gettime(ufd, &kotmr);
	if (ret)
		return ret;
	return put_old_itimerspec32(&kotmr, otmr) ? -EFAULT : 0;
}
#endif
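/*
 * Usage sketch (an editor's illustration, not part of the kernel sources):
 * the syscalls defined above back the timerfd_create(2)/timerfd_settime(2)
 * userspace API. A minimal standalone example of a 500ms periodic timer;
 * each read() returns the number of expirations since the last read as a
 * u64 (served by timerfd_read_iter() above), and fails with ECANCELED if
 * the timer was armed with TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET
 * against a realtime clock and that clock was set.
 */
#include <stdint.h>
#include <stdio.h>
#include <sys/timerfd.h>
#include <unistd.h>

int main(void)
{
	struct itimerspec its = {
		.it_value    = { .tv_sec = 0, .tv_nsec = 500 * 1000 * 1000 },
		.it_interval = { .tv_sec = 0, .tv_nsec = 500 * 1000 * 1000 },
	};
	int fd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);

	if (fd < 0 || timerfd_settime(fd, 0, &its, NULL) < 0)
		return 1;

	for (int i = 0; i < 5; i++) {
		uint64_t expirations;

		/* Blocks in the wait queue until the hrtimer fires. */
		if (read(fd, &expirations, sizeof(expirations)) !=
		    sizeof(expirations))
			return 1;
		printf("tick, %llu expiration(s)\n",
		       (unsigned long long)expirations);
	}
	close(fd);
	return 0;
}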
// SPDX-License-Identifier: GPL-2.0
/*
 * linux/fs/ext4/xattr.c
 *
 * Copyright (C) 2001-2003 Andreas Gruenbacher, <agruen@suse.de>
 *
 * Fix by Harrison Xing <harrison@mountainviewdata.com>.
 * Ext4 code with a lot of help from Eric Jarman <ejarman@acm.org>.
 * Extended attributes for symlinks and special files added per
 *  suggestion of Luka Renko <luka.renko@hermes.si>.
 * xattr consolidation Copyright (c) 2004 James Morris <jmorris@redhat.com>,
 *  Red Hat Inc.
 * ea-in-inode support by Alex Tomas <alex@clusterfs.com> aka bzzz
 *  and Andreas Gruenbacher <agruen@suse.de>.
 */

/*
 * Extended attributes are stored directly in inodes (on file systems with
 * inodes bigger than 128 bytes) and on additional disk blocks. The i_file_acl
 * field contains the block number if an inode uses an additional block. All
 * attributes must fit in the inode and one additional block. Blocks that
 * contain the identical set of attributes may be shared among several inodes.
 * Identical blocks are detected by keeping a cache of blocks that have
 * recently been accessed.
 *
 * The attributes in inodes and on blocks have a different header; the entries
 * are stored in the same format:
 *
 *   +------------------+
 *   | header           |
 *   | entry 1          | |
 *   | entry 2          | | growing downwards
 *   | entry 3          | v
 *   | four null bytes  |
 *   | . . .            |
 *   | value 1          | ^
 *   | value 3          | | growing upwards
 *   | value 2          | |
 *   +------------------+
 *
 * The header is followed by multiple entry descriptors. In disk blocks, the
 * entry descriptors are kept sorted. In inodes, they are unsorted. The
 * attribute values are aligned to the end of the block in no specific order.
 *
 * Locking strategy
 * ----------------
 * EXT4_I(inode)->i_file_acl is protected by EXT4_I(inode)->xattr_sem.
 * EA blocks are only changed if they are exclusive to an inode, so
 * holding xattr_sem also means that nothing but the EA block's reference
 * count can change. Multiple writers to the same block are synchronized
 * by the buffer lock.
 */
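/*
 * For reference, the on-disk shapes behind the diagram above, closely
 * following the definitions in fs/ext4/xattr.h (shown here only as an
 * illustration; the authoritative definitions live in that header):
 *
 *	struct ext4_xattr_header {	// block header; h_magic is
 *		__le32	h_magic;	// EXT4_XATTR_MAGIC
 *		__le32	h_refcount;	// inodes sharing this block
 *		__le32	h_blocks;	// always 1 (checked below)
 *		__le32	h_hash;		// hash of all attributes
 *		__le32	h_checksum;	// metadata checksum
 *		__u32	h_reserved[3];
 *	};
 *
 *	struct ext4_xattr_entry {
 *		__u8	e_name_len;	// length of name
 *		__u8	e_name_index;	// see ext4_xattr_handler_map below
 *		__le16	e_value_offs;	// offset of value from value start
 *		__le32	e_value_inum;	// EA inode, or 0 for in-place value
 *		__le32	e_value_size;	// value length in bytes
 *		__le32	e_hash;		// hash of name and value
 *		char	e_name[];	// name, no trailing NUL
 *	};
 *
 * The "four null bytes" in the diagram are the terminating entry tested by
 * IS_LAST_ENTRY(); e_value_offs is relative to the start of the value area
 * (the block data, or the in-inode xattr area), which is why check_xattrs()
 * below validates each offset against both the names and the end bound.
 */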
#include <linux/init.h>
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/mbcache.h>
#include <linux/quotaops.h>
#include <linux/iversion.h>
#include "ext4_jbd2.h"
#include "ext4.h"
#include "xattr.h"
#include "acl.h"

#ifdef EXT4_XATTR_DEBUG
# define ea_idebug(inode, fmt, ...)					\
	printk(KERN_DEBUG "inode %s:%lu: " fmt "\n",			\
	       inode->i_sb->s_id, inode->i_ino, ##__VA_ARGS__)
# define ea_bdebug(bh, fmt, ...)					\
	printk(KERN_DEBUG "block %pg:%lu: " fmt "\n",			\
	       bh->b_bdev, (unsigned long)bh->b_blocknr, ##__VA_ARGS__)
#else
# define ea_idebug(inode, fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
# define ea_bdebug(bh, fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
#endif

static void ext4_xattr_block_cache_insert(struct mb_cache *,
					  struct buffer_head *);
static struct buffer_head *
ext4_xattr_block_cache_find(struct inode *, struct ext4_xattr_header *,
			    struct mb_cache_entry **);
static __le32 ext4_xattr_hash_entry(char *name, size_t name_len, __le32 *value,
				    size_t value_count);
static __le32 ext4_xattr_hash_entry_signed(char *name, size_t name_len,
					   __le32 *value, size_t value_count);
static void ext4_xattr_rehash(struct ext4_xattr_header *);

static const struct xattr_handler * const ext4_xattr_handler_map[] = {
	[EXT4_XATTR_INDEX_USER]		     = &ext4_xattr_user_handler,
#ifdef CONFIG_EXT4_FS_POSIX_ACL
	[EXT4_XATTR_INDEX_POSIX_ACL_ACCESS]  = &nop_posix_acl_access,
	[EXT4_XATTR_INDEX_POSIX_ACL_DEFAULT] = &nop_posix_acl_default,
#endif
	[EXT4_XATTR_INDEX_TRUSTED]	     = &ext4_xattr_trusted_handler,
#ifdef CONFIG_EXT4_FS_SECURITY
	[EXT4_XATTR_INDEX_SECURITY]	     = &ext4_xattr_security_handler,
#endif
	[EXT4_XATTR_INDEX_HURD]		     = &ext4_xattr_hurd_handler,
};

const struct xattr_handler * const ext4_xattr_handlers[] = {
	&ext4_xattr_user_handler,
	&ext4_xattr_trusted_handler,
#ifdef CONFIG_EXT4_FS_SECURITY
	&ext4_xattr_security_handler,
#endif
	&ext4_xattr_hurd_handler,
	NULL
};

#define EA_BLOCK_CACHE(inode)	(((struct ext4_sb_info *) \
				inode->i_sb->s_fs_info)->s_ea_block_cache)

#define EA_INODE_CACHE(inode)	(((struct ext4_sb_info *) \
				inode->i_sb->s_fs_info)->s_ea_inode_cache)

static int
ext4_expand_inode_array(struct ext4_xattr_inode_array **ea_inode_array,
			struct inode *inode);

#ifdef CONFIG_LOCKDEP
void ext4_xattr_inode_set_class(struct inode *ea_inode)
{
	struct ext4_inode_info *ei = EXT4_I(ea_inode);

	lockdep_set_subclass(&ea_inode->i_rwsem, 1);
	(void) ei;	/* shut up clang warning if !CONFIG_LOCKDEP */
	lockdep_set_subclass(&ei->i_data_sem, I_DATA_SEM_EA);
}
#endif

static __le32 ext4_xattr_block_csum(struct inode *inode,
				    sector_t block_nr,
				    struct ext4_xattr_header *hdr)
{
	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
	__u32 csum;
	__le64 dsk_block_nr = cpu_to_le64(block_nr);
	__u32 dummy_csum = 0;
	int offset = offsetof(struct ext4_xattr_header, h_checksum);

	csum = ext4_chksum(sbi, sbi->s_csum_seed, (__u8 *)&dsk_block_nr,
			   sizeof(dsk_block_nr));
	csum = ext4_chksum(sbi, csum, (__u8 *)hdr, offset);
	csum = ext4_chksum(sbi, csum, (__u8 *)&dummy_csum, sizeof(dummy_csum));
	offset += sizeof(dummy_csum);
	csum = ext4_chksum(sbi, csum, (__u8 *)hdr + offset,
			   EXT4_BLOCK_SIZE(inode->i_sb) - offset);

	return cpu_to_le32(csum);
}

static int ext4_xattr_block_csum_verify(struct inode *inode,
					struct buffer_head *bh)
{
	struct ext4_xattr_header *hdr = BHDR(bh);
	int ret = 1;

	if (ext4_has_feature_metadata_csum(inode->i_sb)) {
		lock_buffer(bh);
		ret = (hdr->h_checksum == ext4_xattr_block_csum(inode,
							bh->b_blocknr, hdr));
		unlock_buffer(bh);
	}
	return ret;
}

static void ext4_xattr_block_csum_set(struct inode *inode,
				      struct buffer_head *bh)
{
	if (ext4_has_feature_metadata_csum(inode->i_sb))
		BHDR(bh)->h_checksum = ext4_xattr_block_csum(inode,
						bh->b_blocknr, BHDR(bh));
}

static inline const char *ext4_xattr_prefix(int name_index,
					    struct dentry *dentry)
{
	const struct xattr_handler *handler = NULL;

	if (name_index > 0 && name_index < ARRAY_SIZE(ext4_xattr_handler_map))
		handler = ext4_xattr_handler_map[name_index];

	if (!xattr_handler_can_list(handler, dentry))
		return NULL;

	return xattr_prefix(handler);
}

static int
check_xattrs(struct inode *inode, struct buffer_head *bh,
	     struct ext4_xattr_entry *entry, void *end, void *value_start,
	     const char *function, unsigned int line)
{
	struct ext4_xattr_entry *e = entry;
	int err = -EFSCORRUPTED;
	char *err_str;

	if (bh) {
		if (BHDR(bh)->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC) ||
		    BHDR(bh)->h_blocks != cpu_to_le32(1)) {
			err_str = "invalid header";
			goto errout;
		}
		if (buffer_verified(bh))
			return 0;
		if (!ext4_xattr_block_csum_verify(inode, bh)) {
			err = -EFSBADCRC;
			err_str = "invalid checksum";
			goto errout;
		}
	} else {
		struct ext4_xattr_ibody_header *header = value_start;

		header -= 1;
		if (end - (void *)header < sizeof(*header) + sizeof(u32)) {
			err_str = "in-inode xattr block too small";
			goto errout;
		}
		if (header->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC)) {
			err_str = "bad magic number in in-inode xattr";
			goto errout;
		}
	}

	/* Find the end of the names list */
	while (!IS_LAST_ENTRY(e)) {
		struct ext4_xattr_entry *next = EXT4_XATTR_NEXT(e);

		if ((void *)next >= end) {
			err_str = "e_name out of bounds";
			goto errout;
		}
		if (strnlen(e->e_name, e->e_name_len) != e->e_name_len) {
			err_str = "bad e_name length";
			goto errout;
		}
		e = next;
	}

	/* Check the values */
	while (!IS_LAST_ENTRY(entry)) {
		u32 size = le32_to_cpu(entry->e_value_size);
		unsigned long ea_ino = le32_to_cpu(entry->e_value_inum);

		if (!ext4_has_feature_ea_inode(inode->i_sb) && ea_ino) {
			err_str = "ea_inode specified without ea_inode feature enabled";
			goto errout;
		}
		if (ea_ino && ((ea_ino == EXT4_ROOT_INO) ||
			       !ext4_valid_inum(inode->i_sb, ea_ino))) {
			err_str = "invalid ea_ino";
			goto errout;
		}
		if (size > EXT4_XATTR_SIZE_MAX) {
			err_str = "e_value size too large";
			goto errout;
		}

		if (size != 0 && entry->e_value_inum == 0) {
			u16 offs = le16_to_cpu(entry->e_value_offs);
			void *value;

			/*
			 * The value cannot overlap the names, and the value
			 * with padding cannot extend beyond 'end'.  Check both
			 * the padded and unpadded sizes, since the size may
			 * overflow to 0 when adding padding.
			 */
			if (offs > end - value_start) {
				err_str = "e_value out of bounds";
				goto errout;
			}
			value = value_start + offs;
			if (value < (void *)e + sizeof(u32) ||
			    size > end - value ||
			    EXT4_XATTR_SIZE(size) > end - value) {
				err_str = "overlapping e_value ";
				goto errout;
			}
		}
		entry = EXT4_XATTR_NEXT(entry);
	}
	if (bh)
		set_buffer_verified(bh);
	return 0;

errout:
	if (bh)
		__ext4_error_inode(inode, function, line, 0, -err,
				   "corrupted xattr block %llu: %s",
				   (unsigned long long) bh->b_blocknr,
				   err_str);
	else
		__ext4_error_inode(inode, function, line, 0, -err,
				   "corrupted in-inode xattr: %s", err_str);
	return err;
}

static inline int
__ext4_xattr_check_block(struct inode *inode, struct buffer_head *bh,
			 const char *function, unsigned int line)
{
	return check_xattrs(inode, bh, BFIRST(bh), bh->b_data + bh->b_size,
			    bh->b_data, function, line);
}

#define ext4_xattr_check_block(inode, bh) \
	__ext4_xattr_check_block((inode), (bh), __func__, __LINE__)

int
__xattr_check_inode(struct inode *inode, struct ext4_xattr_ibody_header *header,
		    void *end, const char *function, unsigned int line)
{
	return check_xattrs(inode, NULL, IFIRST(header), end, IFIRST(header),
			    function, line);
}

static int
xattr_find_entry(struct inode *inode, struct ext4_xattr_entry **pentry,
		 void *end, int name_index, const char *name, int sorted)
{
	struct ext4_xattr_entry *entry, *next;
	size_t name_len;
	int cmp = 1;

	if (name == NULL)
		return -EINVAL;
	name_len = strlen(name);
	for (entry = *pentry; !IS_LAST_ENTRY(entry); entry = next) {
		next = EXT4_XATTR_NEXT(entry);
		if ((void *) next >= end) {
			EXT4_ERROR_INODE(inode, "corrupted xattr entries");
			return -EFSCORRUPTED;
		}
		cmp = name_index - entry->e_name_index;
		if (!cmp)
			cmp = name_len - entry->e_name_len;
		if (!cmp)
			cmp = memcmp(name, entry->e_name, name_len);
		if (cmp <= 0 && (sorted || cmp == 0))
			break;
	}
	*pentry = entry;
	return cmp ? -ENODATA : 0;
}

static u32
ext4_xattr_inode_hash(struct ext4_sb_info *sbi, const void *buffer, size_t size)
{
	return ext4_chksum(sbi, sbi->s_csum_seed, buffer, size);
}

static u64 ext4_xattr_inode_get_ref(struct inode *ea_inode)
{
	return ((u64) inode_get_ctime_sec(ea_inode) << 32) |
		(u32) inode_peek_iversion_raw(ea_inode);
}

static void ext4_xattr_inode_set_ref(struct inode *ea_inode, u64 ref_count)
{
	inode_set_ctime(ea_inode, (u32)(ref_count >> 32), 0);
	inode_set_iversion_raw(ea_inode, ref_count & 0xffffffff);
}

static u32 ext4_xattr_inode_get_hash(struct inode *ea_inode)
{
	return (u32) inode_get_atime_sec(ea_inode);
}

static void ext4_xattr_inode_set_hash(struct inode *ea_inode, u32 hash)
{
	inode_set_atime(ea_inode, hash, 0);
}

/*
 * Read the EA value from an inode.
 */
static int ext4_xattr_inode_read(struct inode *ea_inode, void *buf, size_t size)
{
	int blocksize = 1 << ea_inode->i_blkbits;
	int bh_count = (size + blocksize - 1) >> ea_inode->i_blkbits;
	int tail_size = (size % blocksize) ?: blocksize;
	struct buffer_head *bhs_inline[8];
	struct buffer_head **bhs = bhs_inline;
	int i, ret;

	if (bh_count > ARRAY_SIZE(bhs_inline)) {
		bhs = kmalloc_array(bh_count, sizeof(*bhs), GFP_NOFS);
		if (!bhs)
			return -ENOMEM;
	}

	ret = ext4_bread_batch(ea_inode, 0 /* block */, bh_count,
			       true /* wait */, bhs);
	if (ret)
		goto free_bhs;

	for (i = 0; i < bh_count; i++) {
		/* There shouldn't be any holes in ea_inode. */
		if (!bhs[i]) {
			ret = -EFSCORRUPTED;
			goto put_bhs;
		}
		memcpy((char *)buf + blocksize * i, bhs[i]->b_data,
		       i < bh_count - 1 ? blocksize : tail_size);
	}
	ret = 0;
put_bhs:
	for (i = 0; i < bh_count; i++)
		brelse(bhs[i]);
free_bhs:
	if (bhs != bhs_inline)
		kfree(bhs);
	return ret;
}

#define EXT4_XATTR_INODE_GET_PARENT(inode) ((__u32)(inode_get_mtime_sec(inode)))

static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino,
				 u32 ea_inode_hash, struct inode **ea_inode)
{
	struct inode *inode;
	int err;

	/*
	 * We have to check for this corruption early as otherwise
	 * iget_locked() could wait indefinitely for the state of our
	 * parent inode.
	 */
	if (parent->i_ino == ea_ino) {
		ext4_error(parent->i_sb,
			   "Parent and EA inode have the same ino %lu", ea_ino);
		return -EFSCORRUPTED;
	}

	inode = ext4_iget(parent->i_sb, ea_ino, EXT4_IGET_EA_INODE);
	if (IS_ERR(inode)) {
		err = PTR_ERR(inode);
		ext4_error(parent->i_sb,
			   "error while reading EA inode %lu err=%d", ea_ino,
			   err);
		return err;
	}
	ext4_xattr_inode_set_class(inode);

	/*
	 * Check whether this is an old Lustre-style xattr inode. Lustre
	 * implementation does not have hash validation, rather it has a
	 * backpointer from ea_inode to the parent inode.
	 */
	if (ea_inode_hash != ext4_xattr_inode_get_hash(inode) &&
	    EXT4_XATTR_INODE_GET_PARENT(inode) == parent->i_ino &&
	    inode->i_generation == parent->i_generation) {
		ext4_set_inode_state(inode, EXT4_STATE_LUSTRE_EA_INODE);
		ext4_xattr_inode_set_ref(inode, 1);
	} else {
		inode_lock_nested(inode, I_MUTEX_XATTR);
		inode->i_flags |= S_NOQUOTA;
		inode_unlock(inode);
	}

	*ea_inode = inode;
	return 0;
}

/* Remove entry from mbcache when EA inode is getting evicted */
void ext4_evict_ea_inode(struct inode *inode)
{
	struct mb_cache_entry *oe;

	if (!EA_INODE_CACHE(inode))
		return;
	/* Wait for entry to get unused so that we can remove it */
	while ((oe = mb_cache_entry_delete_or_get(EA_INODE_CACHE(inode),
			ext4_xattr_inode_get_hash(inode), inode->i_ino))) {
		mb_cache_entry_wait_unused(oe);
		mb_cache_entry_put(EA_INODE_CACHE(inode), oe);
	}
}

static int
ext4_xattr_inode_verify_hashes(struct inode *ea_inode,
			       struct ext4_xattr_entry *entry, void *buffer,
			       size_t size)
{
	u32 hash;

	/* Verify stored hash matches calculated hash. */
	hash = ext4_xattr_inode_hash(EXT4_SB(ea_inode->i_sb), buffer, size);
	if (hash != ext4_xattr_inode_get_hash(ea_inode))
		return -EFSCORRUPTED;

	if (entry) {
		__le32 e_hash, tmp_data;

		/* Verify entry hash. */
		tmp_data = cpu_to_le32(hash);
		e_hash = ext4_xattr_hash_entry(entry->e_name, entry->e_name_len,
					       &tmp_data, 1);
		/* All good? */
		if (e_hash == entry->e_hash)
			return 0;

		/*
		 * Not good. Maybe the entry hash was calculated
		 * using the buggy signed char version?
		 */
		e_hash = ext4_xattr_hash_entry_signed(entry->e_name,
						      entry->e_name_len,
						      &tmp_data, 1);
		/* Still no match - bad */
		if (e_hash != entry->e_hash)
			return -EFSCORRUPTED;

		/* Let people know about old hash */
		pr_warn_once("ext4: filesystem with signed xattr name hash");
	}
	return 0;
}

/*
 * Read xattr value from the EA inode.
 */
static int
ext4_xattr_inode_get(struct inode *inode, struct ext4_xattr_entry *entry,
		     void *buffer, size_t size)
{
	struct mb_cache *ea_inode_cache = EA_INODE_CACHE(inode);
	struct inode *ea_inode;
	int err;

	err = ext4_xattr_inode_iget(inode, le32_to_cpu(entry->e_value_inum),
				    le32_to_cpu(entry->e_hash), &ea_inode);
	if (err) {
		ea_inode = NULL;
		goto out;
	}

	if (i_size_read(ea_inode) != size) {
		ext4_warning_inode(ea_inode,
				   "ea_inode file size=%llu entry size=%zu",
				   i_size_read(ea_inode), size);
		err = -EFSCORRUPTED;
		goto out;
	}

	err = ext4_xattr_inode_read(ea_inode, buffer, size);
	if (err)
		goto out;

	if (!ext4_test_inode_state(ea_inode, EXT4_STATE_LUSTRE_EA_INODE)) {
		err = ext4_xattr_inode_verify_hashes(ea_inode, entry, buffer,
						     size);
		if (err) {
			ext4_warning_inode(ea_inode,
					   "EA inode hash validation failed");
			goto out;
		}

		if (ea_inode_cache)
			mb_cache_entry_create(ea_inode_cache, GFP_NOFS,
					ext4_xattr_inode_get_hash(ea_inode),
					ea_inode->i_ino, true /* reusable */);
	}
out:
	iput(ea_inode);
	return err;
}

static int
ext4_xattr_block_get(struct inode *inode, int name_index, const char *name,
		     void *buffer, size_t buffer_size)
{
	struct buffer_head *bh = NULL;
	struct ext4_xattr_entry *entry;
	size_t size;
	void *end;
	int error;
	struct mb_cache *ea_block_cache = EA_BLOCK_CACHE(inode);

	ea_idebug(inode, "name=%d.%s, buffer=%p, buffer_size=%ld",
		  name_index, name, buffer, (long)buffer_size);

	if (!EXT4_I(inode)->i_file_acl)
		return -ENODATA;
	ea_idebug(inode, "reading block %llu",
		  (unsigned long long)EXT4_I(inode)->i_file_acl);
	bh = ext4_sb_bread(inode->i_sb, EXT4_I(inode)->i_file_acl, REQ_PRIO);
	if (IS_ERR(bh))
		return PTR_ERR(bh);
	ea_bdebug(bh, "b_count=%d, refcount=%d",
		  atomic_read(&(bh->b_count)),
		  le32_to_cpu(BHDR(bh)->h_refcount));
	error = ext4_xattr_check_block(inode, bh);
	if (error)
		goto cleanup;
	ext4_xattr_block_cache_insert(ea_block_cache, bh);
	entry = BFIRST(bh);
	end = bh->b_data + bh->b_size;
	error = xattr_find_entry(inode, &entry, end, name_index, name, 1);
	if (error)
		goto cleanup;
	size = le32_to_cpu(entry->e_value_size);
	error = -ERANGE;
	if (unlikely(size > EXT4_XATTR_SIZE_MAX))
		goto cleanup;
	if (buffer) {
		if (size > buffer_size)
			goto cleanup;
		if (entry->e_value_inum) {
			error = ext4_xattr_inode_get(inode, entry, buffer,
						     size);
			if (error)
				goto cleanup;
		} else {
			u16 offset = le16_to_cpu(entry->e_value_offs);
			void *p = bh->b_data + offset;

			if (unlikely(p + size > end))
				goto cleanup;
			memcpy(buffer, p, size);
		}
	}
	error = size;

cleanup:
	brelse(bh);
	return error;
}

int
ext4_xattr_ibody_get(struct inode *inode, int name_index, const char *name,
		     void *buffer, size_t buffer_size)
{
	struct ext4_xattr_ibody_header *header;
	struct ext4_xattr_entry *entry;
	struct ext4_inode *raw_inode;
	struct ext4_iloc iloc;
	size_t size;
	void *end;
	int error;

	if (!ext4_test_inode_state(inode, EXT4_STATE_XATTR))
		return -ENODATA;
	error = ext4_get_inode_loc(inode, &iloc);
	if (error)
		return error;
	raw_inode = ext4_raw_inode(&iloc);
	header = IHDR(inode, raw_inode);
	end = ITAIL(inode, raw_inode);
	entry = IFIRST(header);
	error = xattr_find_entry(inode, &entry, end, name_index, name, 0);
	if (error)
		goto cleanup;
	size = le32_to_cpu(entry->e_value_size);
	error = -ERANGE;
	if (unlikely(size > EXT4_XATTR_SIZE_MAX))
		goto cleanup;
	if (buffer) {
		if (size > buffer_size)
			goto cleanup;
		if (entry->e_value_inum) {
			error = ext4_xattr_inode_get(inode, entry, buffer,
						     size);
			if (error)
				goto cleanup;
		} else {
			u16 offset = le16_to_cpu(entry->e_value_offs);
			void *p = (void *)IFIRST(header) + offset;

			if (unlikely(p + size > end))
				goto cleanup;
			memcpy(buffer, p, size);
		}
	}
	error = size;

cleanup:
	brelse(iloc.bh);
	return error;
}

/*
 * ext4_xattr_get()
 *
 * Copy an extended attribute into the buffer
 * provided, or compute the buffer size required.
 * Buffer is NULL to compute the size of the buffer required.
 *
 * Returns a negative error number on failure, or the number of bytes
 * used / required on success.
 */
int
ext4_xattr_get(struct inode *inode, int name_index, const char *name,
	       void *buffer, size_t buffer_size)
{
	int error;

	if (unlikely(ext4_forced_shutdown(inode->i_sb)))
		return -EIO;

	if (strlen(name) > 255)
		return -ERANGE;

	down_read(&EXT4_I(inode)->xattr_sem);
	error = ext4_xattr_ibody_get(inode, name_index, name, buffer,
				     buffer_size);
	if (error == -ENODATA)
		error = ext4_xattr_block_get(inode, name_index, name, buffer,
					     buffer_size);
	up_read(&EXT4_I(inode)->xattr_sem);
	return error;
}
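/*
 * Usage sketch (an editor's illustration, not part of the kernel sources):
 * ext4_xattr_get() above is what ultimately serves getxattr(2) on ext4,
 * including its size-query convention (NULL buffer returns the required
 * size). A minimal standalone userspace example; the path and the
 * "user.comment" attribute name are assumptions for illustration only:
 */
#include <stdio.h>
#include <sys/xattr.h>

int main(void)
{
	char value[256];
	/* Size query first (NULL buffer), then the actual copy. */
	ssize_t size = getxattr("/tmp/somefile", "user.comment", NULL, 0);

	if (size < 0 || (size_t)size > sizeof(value))
		return 1;
	if (getxattr("/tmp/somefile", "user.comment", value, size) < 0)
		return 1;
	printf("user.comment = %.*s\n", (int)size, value);
	return 0;
}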
static int
ext4_xattr_list_entries(struct dentry *dentry, struct ext4_xattr_entry *entry,
			char *buffer, size_t buffer_size)
{
	size_t rest = buffer_size;

	for (; !IS_LAST_ENTRY(entry); entry = EXT4_XATTR_NEXT(entry)) {
		const char *prefix;

		prefix = ext4_xattr_prefix(entry->e_name_index, dentry);
		if (prefix) {
			size_t prefix_len = strlen(prefix);
			size_t size = prefix_len + entry->e_name_len + 1;

			if (buffer) {
				if (size > rest)
					return -ERANGE;
				memcpy(buffer, prefix, prefix_len);
				buffer += prefix_len;
				memcpy(buffer, entry->e_name, entry->e_name_len);
				buffer += entry->e_name_len;
				*buffer++ = 0;
			}
			rest -= size;
		}
	}
	return buffer_size - rest;	/* total size */
}

static int
ext4_xattr_block_list(struct dentry *dentry, char *buffer, size_t buffer_size)
{
	struct inode *inode = d_inode(dentry);
	struct buffer_head *bh = NULL;
	int error;

	ea_idebug(inode, "buffer=%p, buffer_size=%ld",
		  buffer, (long)buffer_size);

	if (!EXT4_I(inode)->i_file_acl)
		return 0;
	ea_idebug(inode, "reading block %llu",
		  (unsigned long long)EXT4_I(inode)->i_file_acl);
	bh = ext4_sb_bread(inode->i_sb, EXT4_I(inode)->i_file_acl, REQ_PRIO);
	if (IS_ERR(bh))
		return PTR_ERR(bh);
	ea_bdebug(bh, "b_count=%d, refcount=%d",
		  atomic_read(&(bh->b_count)),
		  le32_to_cpu(BHDR(bh)->h_refcount));
	error = ext4_xattr_check_block(inode, bh);
	if (error)
		goto cleanup;
	ext4_xattr_block_cache_insert(EA_BLOCK_CACHE(inode), bh);
	error = ext4_xattr_list_entries(dentry, BFIRST(bh), buffer,
					buffer_size);
cleanup:
	brelse(bh);
	return error;
}

static int
ext4_xattr_ibody_list(struct dentry *dentry, char *buffer, size_t buffer_size)
{
	struct inode *inode = d_inode(dentry);
	struct ext4_xattr_ibody_header *header;
	struct ext4_inode *raw_inode;
	struct ext4_iloc iloc;
	int error;

	if (!ext4_test_inode_state(inode, EXT4_STATE_XATTR))
		return 0;
	error = ext4_get_inode_loc(inode, &iloc);
	if (error)
		return error;
	raw_inode = ext4_raw_inode(&iloc);
	header = IHDR(inode, raw_inode);
	error = ext4_xattr_list_entries(dentry, IFIRST(header),
					buffer, buffer_size);
	brelse(iloc.bh);
	return error;
}

/*
 * Inode operation listxattr()
 *
 * d_inode(dentry)->i_rwsem: don't care
 *
 * Copy a list of attribute names into the buffer
 * provided, or compute the buffer size required.
 * Buffer is NULL to compute the size of the buffer required.
 *
 * Returns a negative error number on failure, or the number of bytes
 * used / required on success.
 */
ssize_t
ext4_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size)
{
	int ret, ret2;

	down_read(&EXT4_I(d_inode(dentry))->xattr_sem);
	ret = ret2 = ext4_xattr_ibody_list(dentry, buffer, buffer_size);
	if (ret < 0)
		goto errout;
	if (buffer) {
		buffer += ret;
		buffer_size -= ret;
	}
	ret = ext4_xattr_block_list(dentry, buffer, buffer_size);
	if (ret < 0)
		goto errout;
	ret += ret2;
errout:
	up_read(&EXT4_I(d_inode(dentry))->xattr_sem);
	return ret;
}

/*
 * If the EXT4_FEATURE_COMPAT_EXT_ATTR feature of this file system is
 * not set, set it.
 */
static void ext4_xattr_update_super_block(handle_t *handle,
					  struct super_block *sb)
{
	if (ext4_has_feature_xattr(sb))
		return;

	BUFFER_TRACE(EXT4_SB(sb)->s_sbh, "get_write_access");
	if (ext4_journal_get_write_access(handle, sb, EXT4_SB(sb)->s_sbh,
					  EXT4_JTR_NONE) == 0) {
		lock_buffer(EXT4_SB(sb)->s_sbh);
		ext4_set_feature_xattr(sb);
		ext4_superblock_csum_set(sb);
		unlock_buffer(EXT4_SB(sb)->s_sbh);
		ext4_handle_dirty_metadata(handle, NULL, EXT4_SB(sb)->s_sbh);
	}
}

int ext4_get_inode_usage(struct inode *inode, qsize_t *usage)
{
	struct ext4_iloc iloc = { .bh = NULL };
	struct buffer_head *bh = NULL;
	struct ext4_inode *raw_inode;
	struct ext4_xattr_ibody_header *header;
	struct ext4_xattr_entry *entry;
	qsize_t ea_inode_refs = 0;
	int ret;

	lockdep_assert_held_read(&EXT4_I(inode)->xattr_sem);

	if (ext4_test_inode_state(inode, EXT4_STATE_XATTR)) {
		ret = ext4_get_inode_loc(inode, &iloc);
		if (ret)
			goto out;
		raw_inode = ext4_raw_inode(&iloc);
		header = IHDR(inode, raw_inode);
		for (entry = IFIRST(header); !IS_LAST_ENTRY(entry);
		     entry = EXT4_XATTR_NEXT(entry))
			if (entry->e_value_inum)
				ea_inode_refs++;
	}

	if (EXT4_I(inode)->i_file_acl) {
		bh = ext4_sb_bread(inode->i_sb, EXT4_I(inode)->i_file_acl,
				   REQ_PRIO);
		if (IS_ERR(bh)) {
			ret = PTR_ERR(bh);
			bh = NULL;
			goto out;
		}

		ret = ext4_xattr_check_block(inode, bh);
		if (ret)
			goto out;

		for (entry = BFIRST(bh); !IS_LAST_ENTRY(entry);
		     entry = EXT4_XATTR_NEXT(entry))
			if (entry->e_value_inum)
				ea_inode_refs++;
	}
	*usage = ea_inode_refs + 1;
	ret = 0;
out:
	brelse(iloc.bh);
	brelse(bh);
	return ret;
}

static inline size_t round_up_cluster(struct inode *inode, size_t length)
{
	struct super_block *sb = inode->i_sb;
	size_t cluster_size = 1 << (EXT4_SB(sb)->s_cluster_bits +
				    inode->i_blkbits);
	size_t mask = ~(cluster_size - 1);

	return (length + cluster_size - 1) & mask;
}

static int ext4_xattr_inode_alloc_quota(struct inode *inode, size_t len)
{
	int err;

	err = dquot_alloc_inode(inode);
	if (err)
		return err;
	err = dquot_alloc_space_nodirty(inode, round_up_cluster(inode, len));
	if (err)
		dquot_free_inode(inode);
	return err;
}

static void ext4_xattr_inode_free_quota(struct inode *parent,
					struct inode *ea_inode,
					size_t len)
{
	if (ea_inode &&
	    ext4_test_inode_state(ea_inode, EXT4_STATE_LUSTRE_EA_INODE))
		return;
	dquot_free_space_nodirty(parent, round_up_cluster(parent, len));
	dquot_free_inode(parent);
}

int __ext4_xattr_set_credits(struct super_block *sb, struct inode *inode,
			     struct buffer_head *block_bh, size_t value_len,
			     bool is_create)
{
	int credits;
	int blocks;

	/*
	 * 1) Owner inode update
	 * 2) Ref count update on old xattr block
	 * 3) new xattr block
	 * 4) block bitmap update for new xattr block
	 * 5) group descriptor for new xattr block
	 * 6) block bitmap update for old xattr block
	 * 7) group descriptor for old block
	 *
	 * 6 & 7 can happen if we have two racing threads T_a and T_b
	 * which are each trying to set an xattr on inodes I_a and I_b
	 * which were both initially sharing an xattr block.
	 */
	credits = 7;

	/* Quota updates. */
	credits += EXT4_MAXQUOTAS_TRANS_BLOCKS(sb);

	/*
	 * In case of inline data, we may push out the data to a block,
	 * so we need to reserve credits for this eventuality
	 */
	if (inode && ext4_has_inline_data(inode))
		credits += ext4_writepage_trans_blocks(inode) + 1;

	/* We are done if ea_inode feature is not enabled. */
	if (!ext4_has_feature_ea_inode(sb))
		return credits;

	/* New ea_inode, inode map, block bitmap, group descriptor. */
	credits += 4;

	/* Data blocks. */
	blocks = (value_len + sb->s_blocksize - 1) >> sb->s_blocksize_bits;

	/* Indirection block or one level of extent tree. */
	blocks += 1;

	/* Block bitmap and group descriptor updates for each block. */
	credits += blocks * 2;

	/* Blocks themselves. */
	credits += blocks;

	if (!is_create) {
		/* Dereference ea_inode holding old xattr value.
		 * Old ea_inode, inode map, block bitmap, group descriptor.
		 */
		credits += 4;

		/* Data blocks for old ea_inode. */
		blocks = XATTR_SIZE_MAX >> sb->s_blocksize_bits;

		/* Indirection block or one level of extent tree for old
		 * ea_inode.
		 */
		blocks += 1;

		/* Block bitmap and group descriptor updates for each block. */
		credits += blocks * 2;
	}

	/* We may need to clone the existing xattr block in which case we need
	 * to increment ref counts for existing ea_inodes referenced by it.
	 */
	if (block_bh) {
		struct ext4_xattr_entry *entry = BFIRST(block_bh);

		for (; !IS_LAST_ENTRY(entry); entry = EXT4_XATTR_NEXT(entry))
			if (entry->e_value_inum)
				/* Ref count update on ea_inode. */
				credits += 1;
	}
	return credits;
}
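/*
 * One worked instance of the credit arithmetic above (an editor's
 * illustration, assuming a 4096-byte block size, the ea_inode feature
 * enabled, no inline data, no xattr block to clone, and is_create == true):
 * for value_len = 5000, blocks = DIV_ROUND_UP(5000, 4096) + 1 = 3, so
 *
 *	credits = 7				base metadata updates
 *		+ EXT4_MAXQUOTAS_TRANS_BLOCKS(sb)  quota updates
 *		+ 4				new ea_inode/map/bitmap/gd
 *		+ 3 * 2				bitmap + gd per data block
 *		+ 3				the data blocks themselves
 *		= 20 + EXT4_MAXQUOTAS_TRANS_BLOCKS(sb)
 */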
	credits += EXT4_MAXQUOTAS_TRANS_BLOCKS(sb);

	/*
	 * In case of inline data, we may push out the data to a block,
	 * so we need to reserve credits for this eventuality
	 */
	if (inode && ext4_has_inline_data(inode))
		credits += ext4_writepage_trans_blocks(inode) + 1;

	/* We are done if ea_inode feature is not enabled. */
	if (!ext4_has_feature_ea_inode(sb))
		return credits;

	/* New ea_inode, inode map, block bitmap, group descriptor. */
	credits += 4;

	/* Data blocks. */
	blocks = (value_len + sb->s_blocksize - 1) >> sb->s_blocksize_bits;

	/* Indirection block or one level of extent tree. */
	blocks += 1;

	/* Block bitmap and group descriptor updates for each block. */
	credits += blocks * 2;

	/* Blocks themselves. */
	credits += blocks;

	if (!is_create) {
		/* Dereference ea_inode holding old xattr value.
		 * Old ea_inode, inode map, block bitmap, group descriptor.
		 */
		credits += 4;

		/* Data blocks for old ea_inode. */
		blocks = XATTR_SIZE_MAX >> sb->s_blocksize_bits;

		/* Indirection block or one level of extent tree for old
		 * ea_inode.
		 */
		blocks += 1;

		/* Block bitmap and group descriptor updates for each block. */
		credits += blocks * 2;
	}

	/* We may need to clone the existing xattr block in which case we need
	 * to increment ref counts for existing ea_inodes referenced by it.
	 */
	if (block_bh) {
		struct ext4_xattr_entry *entry = BFIRST(block_bh);

		for (; !IS_LAST_ENTRY(entry); entry = EXT4_XATTR_NEXT(entry))
			if (entry->e_value_inum)
				/* Ref count update on ea_inode. */
				credits += 1;
	}
	return credits;
}

static int ext4_xattr_inode_update_ref(handle_t *handle,
				       struct inode *ea_inode,
				       int ref_change)
{
	struct ext4_iloc iloc;
	s64 ref_count;
	int ret;

	inode_lock_nested(ea_inode, I_MUTEX_XATTR);

	ret = ext4_reserve_inode_write(handle, ea_inode, &iloc);
	if (ret)
		goto out;

	ref_count = ext4_xattr_inode_get_ref(ea_inode);
	ref_count += ref_change;
	ext4_xattr_inode_set_ref(ea_inode, ref_count);

	if (ref_change > 0) {
		WARN_ONCE(ref_count <= 0, "EA inode %lu ref_count=%lld",
			  ea_inode->i_ino, ref_count);

		if (ref_count == 1) {
			WARN_ONCE(ea_inode->i_nlink, "EA inode %lu i_nlink=%u",
				  ea_inode->i_ino, ea_inode->i_nlink);

			set_nlink(ea_inode, 1);
			ext4_orphan_del(handle, ea_inode);
		}
	} else {
		WARN_ONCE(ref_count < 0, "EA inode %lu ref_count=%lld",
			  ea_inode->i_ino, ref_count);

		if (ref_count == 0) {
			WARN_ONCE(ea_inode->i_nlink != 1,
				  "EA inode %lu i_nlink=%u",
				  ea_inode->i_ino, ea_inode->i_nlink);

			clear_nlink(ea_inode);
			ext4_orphan_add(handle, ea_inode);
		}
	}

	ret = ext4_mark_iloc_dirty(handle, ea_inode, &iloc);
	if (ret)
		ext4_warning_inode(ea_inode,
				   "ext4_mark_iloc_dirty() failed ret=%d", ret);
out:
	inode_unlock(ea_inode);
	return ret;
}

static int ext4_xattr_inode_inc_ref(handle_t *handle, struct inode *ea_inode)
{
	return ext4_xattr_inode_update_ref(handle, ea_inode, 1);
}

static int ext4_xattr_inode_dec_ref(handle_t *handle, struct inode *ea_inode)
{
	return ext4_xattr_inode_update_ref(handle, ea_inode, -1);
}

static int ext4_xattr_inode_inc_ref_all(handle_t *handle, struct inode *parent,
					struct ext4_xattr_entry *first)
{
	struct inode *ea_inode;
	struct ext4_xattr_entry *entry;
	struct ext4_xattr_entry *failed_entry;
	unsigned int ea_ino;
	int err, saved_err;

	for (entry = first; !IS_LAST_ENTRY(entry);
	     entry = EXT4_XATTR_NEXT(entry)) {
		if (!entry->e_value_inum)
			continue;
		ea_ino = le32_to_cpu(entry->e_value_inum);
		err = ext4_xattr_inode_iget(parent, ea_ino,
					    le32_to_cpu(entry->e_hash),
					    &ea_inode);
		if (err)
			goto cleanup;
		err = ext4_xattr_inode_inc_ref(handle, ea_inode);
		if (err) {
			ext4_warning_inode(ea_inode, "inc ref error %d", err);
			iput(ea_inode);
			goto cleanup;
		}
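		/* On-disk refcount taken; drop the iget() reference. */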
		iput(ea_inode);
	}
	return 0;

cleanup:
	saved_err = err;
	failed_entry = entry;

	for (entry = first; entry != failed_entry;
	     entry = EXT4_XATTR_NEXT(entry)) {
		if (!entry->e_value_inum)
			continue;
		ea_ino = le32_to_cpu(entry->e_value_inum);
		err = ext4_xattr_inode_iget(parent, ea_ino,
					    le32_to_cpu(entry->e_hash),
					    &ea_inode);
		if (err) {
			ext4_warning(parent->i_sb,
				     "cleanup ea_ino %u iget error %d", ea_ino,
				     err);
			continue;
		}
		err = ext4_xattr_inode_dec_ref(handle, ea_inode);
		if (err)
			ext4_warning_inode(ea_inode,
					   "cleanup dec ref error %d", err);
		iput(ea_inode);
	}
	return saved_err;
}

static int ext4_xattr_restart_fn(handle_t *handle, struct inode *inode,
			struct buffer_head *bh, bool block_csum, bool dirty)
{
	int error;

	if (bh && dirty) {
		if (block_csum)
			ext4_xattr_block_csum_set(inode, bh);
		error = ext4_handle_dirty_metadata(handle, NULL, bh);
		if (error) {
			ext4_warning(inode->i_sb, "Handle metadata (error %d)",
				     error);
			return error;
		}
	}
	return 0;
}

static void
ext4_xattr_inode_dec_ref_all(handle_t *handle, struct inode *parent,
			     struct buffer_head *bh,
			     struct ext4_xattr_entry *first, bool block_csum,
			     struct ext4_xattr_inode_array **ea_inode_array,
			     int extra_credits, bool skip_quota)
{
	struct inode *ea_inode;
	struct ext4_xattr_entry *entry;
	struct ext4_iloc iloc;
	bool dirty = false;
	unsigned int ea_ino;
	int err;
	int credits;
	void *end;

	if (block_csum)
		end = (void *)bh->b_data + bh->b_size;
	else {
		ext4_get_inode_loc(parent, &iloc);
		end = (void *)ext4_raw_inode(&iloc) +
		      EXT4_SB(parent->i_sb)->s_inode_size;
	}

	/* One credit for dec ref on ea_inode, one for orphan list addition, */
	credits = 2 + extra_credits;

	for (entry = first; (void *)entry < end && !IS_LAST_ENTRY(entry);
	     entry = EXT4_XATTR_NEXT(entry)) {
		if (!entry->e_value_inum)
			continue;
		ea_ino = le32_to_cpu(entry->e_value_inum);
		err = ext4_xattr_inode_iget(parent, ea_ino,
					    le32_to_cpu(entry->e_hash),
					    &ea_inode);
		if (err)
			continue;

		err = ext4_expand_inode_array(ea_inode_array, ea_inode);
		if (err) {
			ext4_warning_inode(ea_inode,
					   "Expand inode array err=%d", err);
			iput(ea_inode);
			continue;
		}

		err = ext4_journal_ensure_credits_fn(handle, credits, credits,
			ext4_free_metadata_revoke_credits(parent->i_sb, 1),
			ext4_xattr_restart_fn(handle, parent, bh, block_csum,
					      dirty));
		if (err < 0) {
			ext4_warning_inode(ea_inode, "Ensure credits err=%d",
					   err);
			continue;
		}
		if (err > 0) {
			err = ext4_journal_get_write_access(handle,
					parent->i_sb, bh, EXT4_JTR_NONE);
			if (err) {
				ext4_warning_inode(ea_inode,
						"Re-get write access err=%d",
						err);
				continue;
			}
		}

		err = ext4_xattr_inode_dec_ref(handle, ea_inode);
		if (err) {
			ext4_warning_inode(ea_inode, "ea_inode dec ref err=%d",
					   err);
			continue;
		}

		if (!skip_quota)
			ext4_xattr_inode_free_quota(parent, ea_inode,
					      le32_to_cpu(entry->e_value_size));

		/*
		 * Forget about ea_inode within the same transaction that
		 * decrements the ref count. This avoids duplicate decrements
		 * in case the rest of the work spills over to subsequent
		 * transactions.
		 */
		entry->e_value_inum = 0;
		entry->e_value_size = 0;

		dirty = true;
	}

	if (dirty) {
		/*
		 * Note that we are deliberately skipping csum calculation for
		 * the final update because we do not expect any journal
		 * restarts until xattr block is freed.
		 */
		err = ext4_handle_dirty_metadata(handle, NULL, bh);
		if (err)
			ext4_warning_inode(parent,
					   "handle dirty metadata err=%d", err);
	}
}

/*
 * Release the xattr block BH: If the reference count is > 1, decrement it;
 * otherwise free the block.
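 *
 * The reference count lives in the block header on disk, since a single
 * xattr block may be shared by several inodes; the mbcache handling
 * below guards against another inode acquiring a reference to the same
 * block while it is being freed.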
 */
static void
ext4_xattr_release_block(handle_t *handle, struct inode *inode,
			 struct buffer_head *bh,
			 struct ext4_xattr_inode_array **ea_inode_array,
			 int extra_credits)
{
	struct mb_cache *ea_block_cache = EA_BLOCK_CACHE(inode);
	u32 hash, ref;
	int error = 0;

	BUFFER_TRACE(bh, "get_write_access");
	error = ext4_journal_get_write_access(handle, inode->i_sb, bh,
					      EXT4_JTR_NONE);
	if (error)
		goto out;

retry_ref:
	lock_buffer(bh);
	hash = le32_to_cpu(BHDR(bh)->h_hash);
	ref = le32_to_cpu(BHDR(bh)->h_refcount);
	if (ref == 1) {
		ea_bdebug(bh, "refcount now=0; freeing");
		/*
		 * This must happen under buffer lock for
		 * ext4_xattr_block_set() to reliably detect freed block
		 */
		if (ea_block_cache) {
			struct mb_cache_entry *oe;

			oe = mb_cache_entry_delete_or_get(ea_block_cache, hash,
							  bh->b_blocknr);
			if (oe) {
				unlock_buffer(bh);
				mb_cache_entry_wait_unused(oe);
				mb_cache_entry_put(ea_block_cache, oe);
				goto retry_ref;
			}
		}
		get_bh(bh);
		unlock_buffer(bh);

		if (ext4_has_feature_ea_inode(inode->i_sb))
			ext4_xattr_inode_dec_ref_all(handle, inode, bh,
						     BFIRST(bh),
						     true /* block_csum */,
						     ea_inode_array,
						     extra_credits,
						     true /* skip_quota */);
		ext4_free_blocks(handle, inode, bh, 0, 1,
				 EXT4_FREE_BLOCKS_METADATA |
				 EXT4_FREE_BLOCKS_FORGET);
	} else {
		ref--;
		BHDR(bh)->h_refcount = cpu_to_le32(ref);
		if (ref == EXT4_XATTR_REFCOUNT_MAX - 1) {
			struct mb_cache_entry *ce;

			if (ea_block_cache) {
				ce = mb_cache_entry_get(ea_block_cache, hash,
							bh->b_blocknr);
				if (ce) {
					set_bit(MBE_REUSABLE_B, &ce->e_flags);
					mb_cache_entry_put(ea_block_cache, ce);
				}
			}
		}

		ext4_xattr_block_csum_set(inode, bh);
		/*
		 * Beware of this ugliness: Releasing of xattr block references
		 * from different inodes can race and so we have to protect
		 * from a race where someone else frees the block (and releases
		 * its journal_head) before we are done dirtying the buffer. In
		 * nojournal mode this race is harmless and we actually cannot
		 * call ext4_handle_dirty_metadata() with locked buffer as
		 * that function can call sync_dirty_buffer() so for that case
		 * we handle the dirtying after unlocking the buffer.
		 */
		if (ext4_handle_valid(handle))
			error = ext4_handle_dirty_metadata(handle, inode, bh);
		unlock_buffer(bh);
		if (!ext4_handle_valid(handle))
			error = ext4_handle_dirty_metadata(handle, inode, bh);
		if (IS_SYNC(inode))
			ext4_handle_sync(handle);
		dquot_free_block(inode, EXT4_C2B(EXT4_SB(inode->i_sb), 1));
		ea_bdebug(bh, "refcount now=%d; releasing",
			  le32_to_cpu(BHDR(bh)->h_refcount));
	}
out:
	ext4_std_error(inode->i_sb, error);
	return;
}

/*
 * Find the available free space for EAs. This also returns the total number of
 * bytes used by EA entries.
 */
static size_t ext4_xattr_free_space(struct ext4_xattr_entry *last,
				    size_t *min_offs, void *base, int *total)
{
	for (; !IS_LAST_ENTRY(last); last = EXT4_XATTR_NEXT(last)) {
		if (!last->e_value_inum && last->e_value_size) {
			size_t offs = le16_to_cpu(last->e_value_offs);

			if (offs < *min_offs)
				*min_offs = offs;
		}
		if (total)
			*total += EXT4_XATTR_LEN(last->e_name_len);
	}
	return (*min_offs - ((void *)last - base) - sizeof(__u32));
}

/*
 * Write the value of the EA in an inode.
 */
static int ext4_xattr_inode_write(handle_t *handle, struct inode *ea_inode,
				  const void *buf, int bufsize)
{
	struct buffer_head *bh = NULL;
	unsigned long block = 0;
	int blocksize = ea_inode->i_sb->s_blocksize;
	int max_blocks = (bufsize + blocksize - 1) >> ea_inode->i_blkbits;
	int csize, wsize = 0;
	int ret = 0, ret2 = 0;
	int retries = 0;

retry:
	while (ret >= 0 && ret < max_blocks) {
		struct ext4_map_blocks map;
		map.m_lblk = block += ret;
		map.m_len = max_blocks -= ret;

		ret = ext4_map_blocks(handle, ea_inode, &map,
				      EXT4_GET_BLOCKS_CREATE);
		if (ret <= 0) {
			ext4_mark_inode_dirty(handle, ea_inode);
			if (ret == -ENOSPC &&
			    ext4_should_retry_alloc(ea_inode->i_sb,
						    &retries)) {
				ret = 0;
				goto retry;
			}
			break;
		}
	}

	if (ret < 0)
		return ret;

	block = 0;
	while (wsize < bufsize) {
		brelse(bh);
		csize = (bufsize - wsize) > blocksize ? blocksize :
							bufsize - wsize;
		bh = ext4_getblk(handle, ea_inode, block, 0);
		if (IS_ERR(bh))
			return PTR_ERR(bh);
		if (!bh) {
			WARN_ON_ONCE(1);
			EXT4_ERROR_INODE(ea_inode,
					 "ext4_getblk() return bh = NULL");
			return -EFSCORRUPTED;
		}
		ret = ext4_journal_get_write_access(handle, ea_inode->i_sb, bh,
						    EXT4_JTR_NONE);
		if (ret)
			goto out;

		memcpy(bh->b_data, buf, csize);
		/*
		 * Zero out block tail to avoid writing uninitialized memory
		 * to disk.
		 */
		if (csize < blocksize)
			memset(bh->b_data + csize, 0, blocksize - csize);
		set_buffer_uptodate(bh);
		ext4_handle_dirty_metadata(handle, ea_inode, bh);

		buf += csize;
		wsize += csize;
		block += 1;
	}

	inode_lock(ea_inode);
	i_size_write(ea_inode, wsize);
	ext4_update_i_disksize(ea_inode, wsize);
	inode_unlock(ea_inode);

	ret2 = ext4_mark_inode_dirty(handle, ea_inode);
	if (unlikely(ret2 && !ret))
		ret = ret2;

out:
	brelse(bh);
	return ret;
}

/*
 * Create an inode to store the value of a large EA.
 */
static struct inode *ext4_xattr_inode_create(handle_t *handle,
					     struct inode *inode, u32 hash)
{
	struct inode *ea_inode = NULL;
	uid_t owner[2] = { i_uid_read(inode), i_gid_read(inode) };
	int err;

	if (inode->i_sb->s_root == NULL) {
		ext4_warning(inode->i_sb,
			     "refuse to create EA inode when umounting");
		WARN_ON(1);
		return ERR_PTR(-EINVAL);
	}

	/*
	 * Let the next inode be the goal, so we try and allocate the EA inode
	 * in the same group, or nearby one.
	 */
	ea_inode = ext4_new_inode(handle, inode->i_sb->s_root->d_inode,
				  S_IFREG | 0600, NULL, inode->i_ino + 1,
				  owner, EXT4_EA_INODE_FL);
	if (!IS_ERR(ea_inode)) {
		ea_inode->i_op = &ext4_file_inode_operations;
		ea_inode->i_fop = &ext4_file_operations;
		ext4_set_aops(ea_inode);
		ext4_xattr_inode_set_class(ea_inode);
		unlock_new_inode(ea_inode);
		ext4_xattr_inode_set_ref(ea_inode, 1);
		ext4_xattr_inode_set_hash(ea_inode, hash);
		err = ext4_mark_inode_dirty(handle, ea_inode);
		if (!err)
			err = ext4_inode_attach_jinode(ea_inode);
		if (err) {
			if (ext4_xattr_inode_dec_ref(handle, ea_inode))
				ext4_warning_inode(ea_inode,
					"cleanup dec ref error %d", err);
			iput(ea_inode);
			return ERR_PTR(err);
		}

		/*
		 * Xattr inodes are shared therefore quota charging is
		 * performed at a higher level.
		 */
		dquot_free_inode(ea_inode);
		dquot_drop(ea_inode);
		inode_lock(ea_inode);
		ea_inode->i_flags |= S_NOQUOTA;
		inode_unlock(ea_inode);
	}

	return ea_inode;
}

static struct inode *
ext4_xattr_inode_cache_find(struct inode *inode, const void *value,
			    size_t value_len, u32 hash)
{
	struct inode *ea_inode;
	struct mb_cache_entry *ce;
	struct mb_cache *ea_inode_cache = EA_INODE_CACHE(inode);
	void *ea_data;

	if (!ea_inode_cache)
		return NULL;

	ce = mb_cache_entry_find_first(ea_inode_cache, hash);
	if (!ce)
		return NULL;

	WARN_ON_ONCE(ext4_handle_valid(journal_current_handle()) &&
		     !(current->flags & PF_MEMALLOC_NOFS));

	ea_data = kvmalloc(value_len, GFP_KERNEL);
	if (!ea_data) {
		mb_cache_entry_put(ea_inode_cache, ce);
		return NULL;
	}

	while (ce) {
		ea_inode = ext4_iget(inode->i_sb, ce->e_value,
				     EXT4_IGET_EA_INODE);
		if (IS_ERR(ea_inode))
			goto next_entry;
		ext4_xattr_inode_set_class(ea_inode);
		if (i_size_read(ea_inode) == value_len &&
		    !ext4_xattr_inode_read(ea_inode, ea_data, value_len) &&
		    !ext4_xattr_inode_verify_hashes(ea_inode, NULL, ea_data,
						    value_len) &&
		    !memcmp(value, ea_data, value_len)) {
			mb_cache_entry_touch(ea_inode_cache, ce);
			mb_cache_entry_put(ea_inode_cache, ce);
			kvfree(ea_data);
			return ea_inode;
		}
		iput(ea_inode);
next_entry:
		ce = mb_cache_entry_find_next(ea_inode_cache, ce);
	}
	kvfree(ea_data);
	return NULL;
}

/*
 * Add value of the EA in an inode.
 */
static struct inode *ext4_xattr_inode_lookup_create(handle_t *handle,
		struct inode *inode, const void *value, size_t value_len)
{
	struct inode *ea_inode;
	u32 hash;
	int err;

	/* Account inode & space to quota even if sharing... */
	err = ext4_xattr_inode_alloc_quota(inode, value_len);
	if (err)
		return ERR_PTR(err);

	hash = ext4_xattr_inode_hash(EXT4_SB(inode->i_sb), value, value_len);
	ea_inode = ext4_xattr_inode_cache_find(inode, value, value_len, hash);
	if (ea_inode) {
		err = ext4_xattr_inode_inc_ref(handle, ea_inode);
		if (err)
			goto out_err;
		return ea_inode;
	}

	/* Create an inode for the EA value */
	ea_inode = ext4_xattr_inode_create(handle, inode, hash);
	if (IS_ERR(ea_inode)) {
		ext4_xattr_inode_free_quota(inode, NULL, value_len);
		return ea_inode;
	}

	err = ext4_xattr_inode_write(handle, ea_inode, value, value_len);
	if (err) {
		if (ext4_xattr_inode_dec_ref(handle, ea_inode))
			ext4_warning_inode(ea_inode,
					   "cleanup dec ref error %d", err);
		goto out_err;
	}

	if (EA_INODE_CACHE(inode))
		mb_cache_entry_create(EA_INODE_CACHE(inode), GFP_NOFS, hash,
				      ea_inode->i_ino, true /* reusable */);
	return ea_inode;

out_err:
	iput(ea_inode);
	ext4_xattr_inode_free_quota(inode, NULL, value_len);
	return ERR_PTR(err);
}

/*
 * Reserve min(block_size/8, 1024) bytes for xattr entries/names if ea_inode
 * feature is enabled.
 */
#define EXT4_XATTR_BLOCK_RESERVE(inode)	min(i_blocksize(inode)/8, 1024U)

static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
				struct ext4_xattr_search *s,
				handle_t *handle, struct inode *inode,
				struct inode *new_ea_inode,
				bool is_block)
{
	struct ext4_xattr_entry *last, *next;
	struct ext4_xattr_entry *here = s->here;
	size_t min_offs = s->end - s->base, name_len = strlen(i->name);
	int in_inode = i->in_inode;
	struct inode *old_ea_inode = NULL;
	size_t old_size, new_size;
	int ret;

	/* Space used by old and new values. */
	old_size = (!s->not_found && !here->e_value_inum) ?
			EXT4_XATTR_SIZE(le32_to_cpu(here->e_value_size)) : 0;
	new_size = (i->value && !in_inode) ? EXT4_XATTR_SIZE(i->value_len) : 0;

	/*
	 * Optimization for the simple case when old and new values have the
	 * same padded sizes. Not applicable if external inodes are involved.
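	 * For example, replacing an in-block 5-byte value with a 7-byte
	 * one takes this path: EXT4_XATTR_SIZE() pads both out to the
	 * same 8-byte slot, so the value can be overwritten in place.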
	 */
	if (new_size && new_size == old_size) {
		size_t offs = le16_to_cpu(here->e_value_offs);
		void *val = s->base + offs;

		here->e_value_size = cpu_to_le32(i->value_len);
		if (i->value == EXT4_ZERO_XATTR_VALUE) {
			memset(val, 0, new_size);
		} else {
			memcpy(val, i->value, i->value_len);
			/* Clear padding bytes. */
			memset(val + i->value_len, 0,
			       new_size - i->value_len);
		}
		goto update_hash;
	}

	/* Compute min_offs and last. */
	last = s->first;
	for (; !IS_LAST_ENTRY(last); last = next) {
		next = EXT4_XATTR_NEXT(last);
		if ((void *)next >= s->end) {
			EXT4_ERROR_INODE(inode, "corrupted xattr entries");
			ret = -EFSCORRUPTED;
			goto out;
		}
		if (!last->e_value_inum && last->e_value_size) {
			size_t offs = le16_to_cpu(last->e_value_offs);

			if (offs < min_offs)
				min_offs = offs;
		}
	}

	/* Check whether we have enough space. */
	if (i->value) {
		size_t free;

		free = min_offs - ((void *)last - s->base) - sizeof(__u32);
		if (!s->not_found)
			free += EXT4_XATTR_LEN(name_len) + old_size;

		if (free < EXT4_XATTR_LEN(name_len) + new_size) {
			ret = -ENOSPC;
			goto out;
		}

		/*
		 * If storing the value in an external inode is an option,
		 * reserve space for xattr entries/names in the external
		 * attribute block so that a long value does not occupy the
		 * whole space and prevent further entries being added.
		 */
		if (ext4_has_feature_ea_inode(inode->i_sb) &&
		    new_size && is_block &&
		    (min_offs + old_size - new_size) <
					EXT4_XATTR_BLOCK_RESERVE(inode)) {
			ret = -ENOSPC;
			goto out;
		}
	}

	/*
	 * Getting access to old and new ea inodes is subject to failures.
	 * Finish that work before doing any modifications to the xattr data.
	 */
	if (!s->not_found && here->e_value_inum) {
		ret = ext4_xattr_inode_iget(inode,
					    le32_to_cpu(here->e_value_inum),
					    le32_to_cpu(here->e_hash),
					    &old_ea_inode);
		if (ret) {
			old_ea_inode = NULL;
			goto out;
		}

		/* We are ready to release ref count on the old_ea_inode. */
		ret = ext4_xattr_inode_dec_ref(handle, old_ea_inode);
		if (ret)
			goto out;

		ext4_xattr_inode_free_quota(inode, old_ea_inode,
					    le32_to_cpu(here->e_value_size));
	}

	/* No failures allowed past this point. */

	if (!s->not_found && here->e_value_size && !here->e_value_inum) {
		/* Remove the old value. */
		void *first_val = s->base + min_offs;
		size_t offs = le16_to_cpu(here->e_value_offs);
		void *val = s->base + offs;

		memmove(first_val + old_size, first_val, val - first_val);
		memset(first_val, 0, old_size);
		min_offs += old_size;

		/* Adjust all value offsets. */
		last = s->first;
		while (!IS_LAST_ENTRY(last)) {
			size_t o = le16_to_cpu(last->e_value_offs);

			if (!last->e_value_inum &&
			    last->e_value_size && o < offs)
				last->e_value_offs = cpu_to_le16(o + old_size);
			last = EXT4_XATTR_NEXT(last);
		}
	}

	if (!i->value) {
		/* Remove old name. */
		size_t size = EXT4_XATTR_LEN(name_len);

		last = ENTRY((void *)last - size);
		memmove(here, (void *)here + size,
			(void *)last - (void *)here + sizeof(__u32));
		memset(last, 0, size);

		/*
		 * Update i_inline_off - moved ibody region might contain
		 * system.data attribute. Handling a failure here won't
		 * cause other complications for setting an xattr.
		 */
		if (!is_block && ext4_has_inline_data(inode)) {
			ret = ext4_find_inline_data_nolock(inode);
			if (ret) {
				ext4_warning_inode(inode,
					"unable to update i_inline_off");
				goto out;
			}
		}
	} else if (s->not_found) {
		/* Insert new name. */
		size_t size = EXT4_XATTR_LEN(name_len);
		size_t rest = (void *)last - (void *)here + sizeof(__u32);

		memmove((void *)here + size, here, rest);
		memset(here, 0, size);
		here->e_name_index = i->name_index;
		here->e_name_len = name_len;
		memcpy(here->e_name, i->name, name_len);
	} else {
		/* This is an update, reset value info.
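		 * The name stays where it is; only the old value's
		 * bookkeeping is cleared so the insert path below can
		 * record the new location.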
		 */
		here->e_value_inum = 0;
		here->e_value_offs = 0;
		here->e_value_size = 0;
	}

	if (i->value) {
		/* Insert new value. */
		if (in_inode) {
			here->e_value_inum = cpu_to_le32(new_ea_inode->i_ino);
		} else if (i->value_len) {
			void *val = s->base + min_offs - new_size;

			here->e_value_offs = cpu_to_le16(min_offs - new_size);
			if (i->value == EXT4_ZERO_XATTR_VALUE) {
				memset(val, 0, new_size);
			} else {
				memcpy(val, i->value, i->value_len);
				/* Clear padding bytes. */
				memset(val + i->value_len, 0,
				       new_size - i->value_len);
			}
		}
		here->e_value_size = cpu_to_le32(i->value_len);
	}

update_hash:
	if (i->value) {
		__le32 hash = 0;

		/* Entry hash calculation. */
		if (in_inode) {
			__le32 crc32c_hash;

			/*
			 * Feed crc32c hash instead of the raw value for entry
			 * hash calculation. This is to avoid walking
			 * potentially long value buffer again.
			 */
			crc32c_hash = cpu_to_le32(
				       ext4_xattr_inode_get_hash(new_ea_inode));
			hash = ext4_xattr_hash_entry(here->e_name,
						     here->e_name_len,
						     &crc32c_hash, 1);
		} else if (is_block) {
			__le32 *value = s->base + le16_to_cpu(
							here->e_value_offs);

			hash = ext4_xattr_hash_entry(here->e_name,
						     here->e_name_len, value,
						     new_size >> 2);
		}
		here->e_hash = hash;
	}

	if (is_block)
		ext4_xattr_rehash((struct ext4_xattr_header *)s->base);

	ret = 0;
out:
	iput(old_ea_inode);
	return ret;
}

struct ext4_xattr_block_find {
	struct ext4_xattr_search s;
	struct buffer_head *bh;
};

static int
ext4_xattr_block_find(struct inode *inode, struct ext4_xattr_info *i,
		      struct ext4_xattr_block_find *bs)
{
	struct super_block *sb = inode->i_sb;
	int error;

	ea_idebug(inode, "name=%d.%s, value=%p, value_len=%ld",
		  i->name_index, i->name, i->value, (long)i->value_len);

	if (EXT4_I(inode)->i_file_acl) {
		/* The inode already has an extended attribute block. */
		bs->bh = ext4_sb_bread(sb, EXT4_I(inode)->i_file_acl,
				       REQ_PRIO);
		if (IS_ERR(bs->bh)) {
			error = PTR_ERR(bs->bh);
			bs->bh = NULL;
			return error;
		}
		ea_bdebug(bs->bh, "b_count=%d, refcount=%d",
			  atomic_read(&(bs->bh->b_count)),
			  le32_to_cpu(BHDR(bs->bh)->h_refcount));
		error = ext4_xattr_check_block(inode, bs->bh);
		if (error)
			return error;
		/* Find the named attribute.
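		 * On return, the search cursor bs->s spans the whole block
		 * (base/first/end) and bs->s.here points at the matching
		 * entry, or at the sorted insertion point when the lookup
		 * returns -ENODATA.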
		 */
		bs->s.base = BHDR(bs->bh);
		bs->s.first = BFIRST(bs->bh);
		bs->s.end = bs->bh->b_data + bs->bh->b_size;
		bs->s.here = bs->s.first;
		error = xattr_find_entry(inode, &bs->s.here, bs->s.end,
					 i->name_index, i->name, 1);
		if (error && error != -ENODATA)
			return error;
		bs->s.not_found = error;
	}
	return 0;
}

static int
ext4_xattr_block_set(handle_t *handle, struct inode *inode,
		     struct ext4_xattr_info *i,
		     struct ext4_xattr_block_find *bs)
{
	struct super_block *sb = inode->i_sb;
	struct buffer_head *new_bh = NULL;
	struct ext4_xattr_search s_copy = bs->s;
	struct ext4_xattr_search *s = &s_copy;
	struct mb_cache_entry *ce = NULL;
	int error = 0;
	struct mb_cache *ea_block_cache = EA_BLOCK_CACHE(inode);
	struct inode *ea_inode = NULL, *tmp_inode;
	size_t old_ea_inode_quota = 0;
	unsigned int ea_ino;

#define header(x) ((struct ext4_xattr_header *)(x))

	/* If we need EA inode, prepare it before locking the buffer */
	if (i->value && i->in_inode) {
		WARN_ON_ONCE(!i->value_len);

		ea_inode = ext4_xattr_inode_lookup_create(handle, inode,
					i->value, i->value_len);
		if (IS_ERR(ea_inode)) {
			error = PTR_ERR(ea_inode);
			ea_inode = NULL;
			goto cleanup;
		}
	}

	if (s->base) {
		int offset = (char *)s->here - bs->bh->b_data;

		BUFFER_TRACE(bs->bh, "get_write_access");
		error = ext4_journal_get_write_access(handle, sb, bs->bh,
						      EXT4_JTR_NONE);
		if (error)
			goto cleanup;
		lock_buffer(bs->bh);

		if (header(s->base)->h_refcount == cpu_to_le32(1)) {
			__u32 hash = le32_to_cpu(BHDR(bs->bh)->h_hash);

			/*
			 * This must happen under buffer lock for
			 * ext4_xattr_block_set() to reliably detect modified
			 * block
			 */
			if (ea_block_cache) {
				struct mb_cache_entry *oe;

				oe = mb_cache_entry_delete_or_get(
					ea_block_cache, hash,
					bs->bh->b_blocknr);
				if (oe) {
					/*
					 * Xattr block is getting reused. Leave
					 * it alone.
					 */
					mb_cache_entry_put(ea_block_cache, oe);
					goto clone_block;
				}
			}
			ea_bdebug(bs->bh, "modifying in-place");
			error = ext4_xattr_set_entry(i, s, handle, inode,
						     ea_inode,
						     true /* is_block */);
			ext4_xattr_block_csum_set(inode, bs->bh);
			unlock_buffer(bs->bh);
			if (error == -EFSCORRUPTED)
				goto bad_block;
			if (!error)
				error = ext4_handle_dirty_metadata(handle,
								   inode,
								   bs->bh);
			if (error)
				goto cleanup;
			goto inserted;
		}

clone_block:
		unlock_buffer(bs->bh);
		ea_bdebug(bs->bh, "cloning");
		s->base = kmemdup(BHDR(bs->bh), bs->bh->b_size, GFP_NOFS);
		error = -ENOMEM;
		if (s->base == NULL)
			goto cleanup;
		s->first = ENTRY(header(s->base)+1);
		header(s->base)->h_refcount = cpu_to_le32(1);
		s->here = ENTRY(s->base + offset);
		s->end = s->base + bs->bh->b_size;

		/*
		 * If existing entry points to an xattr inode, we need
		 * to prevent ext4_xattr_set_entry() from decrementing
		 * ref count on it because the reference belongs to the
		 * original block. In this case, make the entry look
		 * like it has an empty value.
		 */
		if (!s->not_found && s->here->e_value_inum) {
			ea_ino = le32_to_cpu(s->here->e_value_inum);
			error = ext4_xattr_inode_iget(inode, ea_ino,
				      le32_to_cpu(s->here->e_hash),
				      &tmp_inode);
			if (error)
				goto cleanup;

			if (!ext4_test_inode_state(tmp_inode,
					EXT4_STATE_LUSTRE_EA_INODE)) {
				/*
				 * Defer quota free call for previous
				 * inode until success is guaranteed.
				 */
				old_ea_inode_quota = le32_to_cpu(
						s->here->e_value_size);
			}
			iput(tmp_inode);

			s->here->e_value_inum = 0;
			s->here->e_value_size = 0;
		}
	} else {
		/* Allocate a buffer where we construct the new block.
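		 * The block only reaches disk further down, either by
		 * matching an identical cached block or by allocating a
		 * fresh one.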
		 */
		s->base = kzalloc(sb->s_blocksize, GFP_NOFS);
		error = -ENOMEM;
		if (s->base == NULL)
			goto cleanup;
		header(s->base)->h_magic = cpu_to_le32(EXT4_XATTR_MAGIC);
		header(s->base)->h_blocks = cpu_to_le32(1);
		header(s->base)->h_refcount = cpu_to_le32(1);
		s->first = ENTRY(header(s->base)+1);
		s->here = ENTRY(header(s->base)+1);
		s->end = s->base + sb->s_blocksize;
	}

	error = ext4_xattr_set_entry(i, s, handle, inode, ea_inode,
				     true /* is_block */);
	if (error == -EFSCORRUPTED)
		goto bad_block;
	if (error)
		goto cleanup;

inserted:
	if (!IS_LAST_ENTRY(s->first)) {
		new_bh = ext4_xattr_block_cache_find(inode, header(s->base),
						     &ce);
		if (IS_ERR(new_bh)) {
			error = PTR_ERR(new_bh);
			new_bh = NULL;
			goto cleanup;
		}

		if (new_bh) {
			/* We found an identical block in the cache. */
			if (new_bh == bs->bh)
				ea_bdebug(new_bh, "keeping");
			else {
				u32 ref;

#ifdef EXT4_XATTR_DEBUG
				WARN_ON_ONCE(dquot_initialize_needed(inode));
#endif
				/* The old block is released after updating
				   the inode. */
				error = dquot_alloc_block(inode,
						EXT4_C2B(EXT4_SB(sb), 1));
				if (error)
					goto cleanup;
				BUFFER_TRACE(new_bh, "get_write_access");
				error = ext4_journal_get_write_access(
						handle, sb, new_bh,
						EXT4_JTR_NONE);
				if (error)
					goto cleanup_dquot;
				lock_buffer(new_bh);
				/*
				 * We have to be careful about races with
				 * adding references to xattr block. Once we
				 * hold buffer lock xattr block's state is
				 * stable so we can check the additional
				 * reference fits.
				 */
				ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
				if (ref > EXT4_XATTR_REFCOUNT_MAX) {
					/*
					 * Undo everything and check mbcache
					 * again.
					 */
					unlock_buffer(new_bh);
					dquot_free_block(inode,
							 EXT4_C2B(EXT4_SB(sb),
								  1));
					brelse(new_bh);
					mb_cache_entry_put(ea_block_cache, ce);
					ce = NULL;
					new_bh = NULL;
					goto inserted;
				}
				BHDR(new_bh)->h_refcount = cpu_to_le32(ref);
				if (ref == EXT4_XATTR_REFCOUNT_MAX)
					clear_bit(MBE_REUSABLE_B,
						  &ce->e_flags);
				ea_bdebug(new_bh, "reusing; refcount now=%d",
					  ref);
				ext4_xattr_block_csum_set(inode, new_bh);
				unlock_buffer(new_bh);
				error = ext4_handle_dirty_metadata(handle,
								   inode,
								   new_bh);
				if (error)
					goto cleanup_dquot;
			}
			mb_cache_entry_touch(ea_block_cache, ce);
			mb_cache_entry_put(ea_block_cache, ce);
			ce = NULL;
		} else if (bs->bh && s->base == bs->bh->b_data) {
			/* We were modifying this block in-place. */
			ea_bdebug(bs->bh, "keeping this block");
			ext4_xattr_block_cache_insert(ea_block_cache, bs->bh);
			new_bh = bs->bh;
			get_bh(new_bh);
		} else {
			/* We need to allocate a new block */
			ext4_fsblk_t goal, block;

#ifdef EXT4_XATTR_DEBUG
			WARN_ON_ONCE(dquot_initialize_needed(inode));
#endif
			goal = ext4_group_first_block_no(sb,
						EXT4_I(inode)->i_block_group);
			block = ext4_new_meta_blocks(handle, inode, goal, 0,
						     NULL, &error);
			if (error)
				goto cleanup;

			ea_idebug(inode, "creating block %llu",
				  (unsigned long long)block);

			new_bh = sb_getblk(sb, block);
			if (unlikely(!new_bh)) {
				error = -ENOMEM;
getblk_failed:
				ext4_free_blocks(handle, inode, NULL, block, 1,
						 EXT4_FREE_BLOCKS_METADATA);
				goto cleanup;
			}
			error = ext4_xattr_inode_inc_ref_all(handle, inode,
						      ENTRY(header(s->base)+1));
			if (error)
				goto getblk_failed;
			if (ea_inode) {
				/* Drop the extra ref on ea_inode.
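				 * ext4_xattr_inode_inc_ref_all() above
				 * counted the new entry too, so the
				 * reference taken at lookup/create time is
				 * now double-counted.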
				 */
				error = ext4_xattr_inode_dec_ref(handle,
								 ea_inode);
				if (error)
					ext4_warning_inode(ea_inode,
							   "dec ref error=%d",
							   error);
				iput(ea_inode);
				ea_inode = NULL;
			}

			lock_buffer(new_bh);
			error = ext4_journal_get_create_access(handle, sb,
							new_bh, EXT4_JTR_NONE);
			if (error) {
				unlock_buffer(new_bh);
				error = -EIO;
				goto getblk_failed;
			}
			memcpy(new_bh->b_data, s->base, new_bh->b_size);
			ext4_xattr_block_csum_set(inode, new_bh);
			set_buffer_uptodate(new_bh);
			unlock_buffer(new_bh);
			ext4_xattr_block_cache_insert(ea_block_cache, new_bh);
			error = ext4_handle_dirty_metadata(handle, inode,
							   new_bh);
			if (error)
				goto cleanup;
		}
	}

	if (old_ea_inode_quota)
		ext4_xattr_inode_free_quota(inode, NULL, old_ea_inode_quota);

	/* Update the inode. */
	EXT4_I(inode)->i_file_acl = new_bh ? new_bh->b_blocknr : 0;

	/* Drop the previous xattr block. */
	if (bs->bh && bs->bh != new_bh) {
		struct ext4_xattr_inode_array *ea_inode_array = NULL;

		ext4_xattr_release_block(handle, inode, bs->bh,
					 &ea_inode_array,
					 0 /* extra_credits */);
		ext4_xattr_inode_array_free(ea_inode_array);
	}
	error = 0;

cleanup:
	if (ea_inode) {
		if (error) {
			int error2;

			error2 = ext4_xattr_inode_dec_ref(handle, ea_inode);
			if (error2)
				ext4_warning_inode(ea_inode,
						   "dec ref error=%d", error2);
			ext4_xattr_inode_free_quota(inode, ea_inode,
						    i_size_read(ea_inode));
		}
		iput(ea_inode);
	}
	if (ce)
		mb_cache_entry_put(ea_block_cache, ce);
	brelse(new_bh);
	if (!(bs->bh && s->base == bs->bh->b_data))
		kfree(s->base);

	return error;

cleanup_dquot:
	dquot_free_block(inode, EXT4_C2B(EXT4_SB(sb), 1));
	goto cleanup;

bad_block:
	EXT4_ERROR_INODE(inode, "bad block %llu",
			 EXT4_I(inode)->i_file_acl);
	goto cleanup;