105 107 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 | // SPDX-License-Identifier: GPL-2.0-only /* PIPAPO: PIle PAcket POlicies: set for arbitrary concatenations of ranges * * Copyright (c) 2019-2020 Red Hat GmbH * * Author: Stefano Brivio <sbrivio@redhat.com> */ /** * DOC: Theory of Operation * * * Problem * ------- * * Match packet bytes against entries composed of ranged or non-ranged packet * field specifiers, mapping them to arbitrary references. For example: * * :: * * --- fields ---> * | [net],[port],[net]... => [reference] * entries [net],[port],[net]... => [reference] * | [net],[port],[net]... => [reference] * V ... * * where [net] fields can be IP ranges or netmasks, and [port] fields are port * ranges. Arbitrary packet fields can be matched. * * * Algorithm Overview * ------------------ * * This algorithm is loosely inspired by [Ligatti 2010], and fundamentally * relies on the consideration that every contiguous range in a space of b bits * can be converted into b * 2 netmasks, from Theorem 3 in [Rottenstreich 2010], * as also illustrated in Section 9 of [Kogan 2014]. * * Classification against a number of entries, that require matching given bits * of a packet field, is performed by grouping those bits in sets of arbitrary * size, and classifying packet bits one group at a time. * * Example: * to match the source port (16 bits) of a packet, we can divide those 16 bits * in 4 groups of 4 bits each. Given the entry: * 0000 0001 0101 1001 * and a packet with source port: * 0000 0001 1010 1001 * first and second groups match, but the third doesn't. We conclude that the * packet doesn't match the given entry. * * Translate the set to a sequence of lookup tables, one per field. Each table * has two dimensions: bit groups to be matched for a single packet field, and * all the possible values of said groups (buckets). Input entries are * represented as one or more rules, depending on the number of composing * netmasks for the given field specifier, and a group match is indicated as a * set bit, with number corresponding to the rule index, in all the buckets * whose value matches the entry for a given group. * * Rules are mapped between fields through an array of x, n pairs, with each * item mapping a matched rule to one or more rules. The position of the pair in * the array indicates the matched rule to be mapped to the next field, x * indicates the first rule index in the next field, and n the amount of * next-field rules the current rule maps to. * * The mapping array for the last field maps to the desired references. * * To match, we perform table lookups using the values of grouped packet bits, * and use a sequence of bitwise operations to progressively evaluate rule * matching. * * A stand-alone, reference implementation, also including notes about possible * future optimisations, is available at: * https://pipapo.lameexcu.se/ * * Insertion * --------- * * - For each packet field: * * - divide the b packet bits we want to classify into groups of size t, * obtaining ceil(b / t) groups * * Example: match on destination IP address, with t = 4: 32 bits, 8 groups * of 4 bits each * * - allocate a lookup table with one column ("bucket") for each possible * value of a group, and with one row for each group * * Example: 8 groups, 2^4 buckets: * * :: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 * 1 * 2 * 3 * 4 * 5 * 6 * 7 * * - map the bits we want to classify for the current field, for a given * entry, to a single rule for non-ranged and netmask set items, and to one * or multiple rules for ranges. Ranges are expanded to composing netmasks * by pipapo_expand(). * * Example: 2 entries, 10.0.0.5:1024 and 192.168.1.0-192.168.2.1:2048 * - rule #0: 10.0.0.5 * - rule #1: 192.168.1.0/24 * - rule #2: 192.168.2.0/31 * * - insert references to the rules in the lookup table, selecting buckets * according to bit values of a rule in the given group. This is done by * pipapo_insert(). * * Example: given: * - rule #0: 10.0.0.5 mapping to buckets * < 0 10 0 0 0 0 0 5 > * - rule #1: 192.168.1.0/24 mapping to buckets * < 12 0 10 8 0 1 < 0..15 > < 0..15 > > * - rule #2: 192.168.2.0/31 mapping to buckets * < 12 0 10 8 0 2 0 < 0..1 > > * * these bits are set in the lookup table: * * :: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 0 1,2 * 1 1,2 0 * 2 0 1,2 * 3 0 1,2 * 4 0,1,2 * 5 0 1 2 * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 * * - if this is not the last field in the set, fill a mapping array that maps * rules from the lookup table to rules belonging to the same entry in * the next lookup table, done by pipapo_map(). * * Note that as rules map to contiguous ranges of rules, given how netmask * expansion and insertion is performed, &union nft_pipapo_map_bucket stores * this information as pairs of first rule index, rule count. * * Example: 2 entries, 10.0.0.5:1024 and 192.168.1.0-192.168.2.1:2048, * given lookup table #0 for field 0 (see example above): * * :: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 0 1,2 * 1 1,2 0 * 2 0 1,2 * 3 0 1,2 * 4 0,1,2 * 5 0 1 2 * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 * * and lookup table #1 for field 1 with: * - rule #0: 1024 mapping to buckets * < 0 0 4 0 > * - rule #1: 2048 mapping to buckets * < 0 0 5 0 > * * :: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 0,1 * 1 0,1 * 2 0 1 * 3 0,1 * * we need to map rules for 10.0.0.5 in lookup table #0 (rule #0) to 1024 * in lookup table #1 (rule #0) and rules for 192.168.1.0-192.168.2.1 * (rules #1, #2) to 2048 in lookup table #2 (rule #1): * * :: * * rule indices in current field: 0 1 2 * map to rules in next field: 0 1 1 * * - if this is the last field in the set, fill a mapping array that maps * rules from the last lookup table to element pointers, also done by * pipapo_map(). * * Note that, in this implementation, we have two elements (start, end) for * each entry. The pointer to the end element is stored in this array, and * the pointer to the start element is linked from it. * * Example: entry 10.0.0.5:1024 has a corresponding &struct nft_pipapo_elem * pointer, 0x66, and element for 192.168.1.0-192.168.2.1:2048 is at 0x42. * From the rules of lookup table #1 as mapped above: * * :: * * rule indices in last field: 0 1 * map to elements: 0x66 0x42 * * * Matching * -------- * * We use a result bitmap, with the size of a single lookup table bucket, to * represent the matching state that applies at every algorithm step. This is * done by pipapo_lookup(). * * - For each packet field: * * - start with an all-ones result bitmap (res_map in pipapo_lookup()) * * - perform a lookup into the table corresponding to the current field, * for each group, and at every group, AND the current result bitmap with * the value from the lookup table bucket * * :: * * Example: 192.168.1.5 < 12 0 10 8 0 1 0 5 >, with lookup table from * insertion examples. * Lookup table buckets are at least 3 bits wide, we'll assume 8 bits for * convenience in this example. Initial result bitmap is 0xff, the steps * below show the value of the result bitmap after each group is processed: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 0 1,2 * result bitmap is now: 0xff & 0x6 [bucket 12] = 0x6 * * 1 1,2 0 * result bitmap is now: 0x6 & 0x6 [bucket 0] = 0x6 * * 2 0 1,2 * result bitmap is now: 0x6 & 0x6 [bucket 10] = 0x6 * * 3 0 1,2 * result bitmap is now: 0x6 & 0x6 [bucket 8] = 0x6 * * 4 0,1,2 * result bitmap is now: 0x6 & 0x7 [bucket 0] = 0x6 * * 5 0 1 2 * result bitmap is now: 0x6 & 0x2 [bucket 1] = 0x2 * * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 * result bitmap is now: 0x2 & 0x7 [bucket 0] = 0x2 * * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 * final result bitmap for this field is: 0x2 & 0x3 [bucket 5] = 0x2 * * - at the next field, start with a new, all-zeroes result bitmap. For each * bit set in the previous result bitmap, fill the new result bitmap * (fill_map in pipapo_lookup()) with the rule indices from the * corresponding buckets of the mapping field for this field, done by * pipapo_refill() * * Example: with mapping table from insertion examples, with the current * result bitmap from the previous example, 0x02: * * :: * * rule indices in current field: 0 1 2 * map to rules in next field: 0 1 1 * * the new result bitmap will be 0x02: rule 1 was set, and rule 1 will be * set. * * We can now extend this example to cover the second iteration of the step * above (lookup and AND bitmap): assuming the port field is * 2048 < 0 0 5 0 >, with starting result bitmap 0x2, and lookup table * for "port" field from pre-computation example: * * :: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 0,1 * 1 0,1 * 2 0 1 * 3 0,1 * * operations are: 0x2 & 0x3 [bucket 0] & 0x3 [bucket 0] & 0x2 [bucket 5] * & 0x3 [bucket 0], resulting bitmap is 0x2. * * - if this is the last field in the set, look up the value from the mapping * array corresponding to the final result bitmap * * Example: 0x2 resulting bitmap from 192.168.1.5:2048, mapping array for * last field from insertion example: * * :: * * rule indices in last field: 0 1 * map to elements: 0x66 0x42 * * the matching element is at 0x42. * * * References * ---------- * * [Ligatti 2010] * A Packet-classification Algorithm for Arbitrary Bitmask Rules, with * Automatic Time-space Tradeoffs * Jay Ligatti, Josh Kuhn, and Chris Gage. * Proceedings of the IEEE International Conference on Computer * Communication Networks (ICCCN), August 2010. * https://www.cse.usf.edu/~ligatti/papers/grouper-conf.pdf * * [Rottenstreich 2010] * Worst-Case TCAM Rule Expansion * Ori Rottenstreich and Isaac Keslassy. * 2010 Proceedings IEEE INFOCOM, San Diego, CA, 2010. * http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.4592&rep=rep1&type=pdf * * [Kogan 2014] * SAX-PAC (Scalable And eXpressive PAcket Classification) * Kirill Kogan, Sergey Nikolenko, Ori Rottenstreich, William Culhane, * and Patrick Eugster. * Proceedings of the 2014 ACM conference on SIGCOMM, August 2014. * https://www.sigcomm.org/sites/default/files/ccr/papers/2014/August/2619239-2626294.pdf */ #include <linux/kernel.h> #include <linux/init.h> #include <linux/module.h> #include <linux/netlink.h> #include <linux/netfilter.h> #include <linux/netfilter/nf_tables.h> #include <net/netfilter/nf_tables_core.h> #include <uapi/linux/netfilter/nf_tables.h> #include <linux/bitmap.h> #include <linux/bitops.h> #include "nft_set_pipapo_avx2.h" #include "nft_set_pipapo.h" /** * pipapo_refill() - For each set bit, set bits from selected mapping table item * @map: Bitmap to be scanned for set bits * @len: Length of bitmap in longs * @rules: Number of rules in field * @dst: Destination bitmap * @mt: Mapping table containing bit set specifiers * @match_only: Find a single bit and return, don't fill * * Iteration over set bits with __builtin_ctzl(): Daniel Lemire, public domain. * * For each bit set in map, select the bucket from mapping table with index * corresponding to the position of the bit set. Use start bit and amount of * bits specified in bucket to fill region in dst. * * Return: -1 on no match, bit position on 'match_only', 0 otherwise. */ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules, unsigned long *dst, const union nft_pipapo_map_bucket *mt, bool match_only) { unsigned long bitset; unsigned int k; int ret = -1; for (k = 0; k < len; k++) { bitset = map[k]; while (bitset) { unsigned long t = bitset & -bitset; int r = __builtin_ctzl(bitset); int i = k * BITS_PER_LONG + r; if (unlikely(i >= rules)) { map[k] = 0; return -1; } if (match_only) { bitmap_clear(map, i, 1); return i; } ret = 0; bitmap_set(dst, mt[i].to, mt[i].n); bitset ^= t; } map[k] = 0; } return ret; } /** * nft_pipapo_lookup() - Lookup function * @net: Network namespace * @set: nftables API set representation * @key: nftables API element representation containing key data * @ext: nftables API extension pointer, filled with matching reference * * For more details, see DOC: Theory of Operation. * * Return: true on match, false otherwise. */ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, const u32 *key, const struct nft_set_ext **ext) { struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_scratch *scratch; unsigned long *res_map, *fill_map; u8 genmask = nft_genmask_cur(net); const struct nft_pipapo_match *m; const struct nft_pipapo_field *f; const u8 *rp = (const u8 *)key; bool map_index; int i; local_bh_disable(); m = rcu_dereference(priv->match); if (unlikely(!m || !*raw_cpu_ptr(m->scratch))) goto out; scratch = *raw_cpu_ptr(m->scratch); map_index = scratch->map_index; res_map = scratch->map + (map_index ? m->bsize_max : 0); fill_map = scratch->map + (map_index ? 0 : m->bsize_max); memset(res_map, 0xff, m->bsize_max * sizeof(*res_map)); nft_pipapo_for_each_field(f, i, m) { bool last = i == m->field_count - 1; int b; /* For each bit group: select lookup table bucket depending on * packet bytes value, then AND bucket value */ if (likely(f->bb == 8)) pipapo_and_field_buckets_8bit(f, res_map, rp); else pipapo_and_field_buckets_4bit(f, res_map, rp); NFT_PIPAPO_GROUP_BITS_ARE_8_OR_4; rp += f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f); /* Now populate the bitmap for the next field, unless this is * the last field, in which case return the matched 'ext' * pointer if any. * * Now res_map contains the matching bitmap, and fill_map is the * bitmap for the next field. */ next_match: b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt, last); if (b < 0) { scratch->map_index = map_index; local_bh_enable(); return false; } if (last) { *ext = &f->mt[b].e->ext; if (unlikely(nft_set_elem_expired(*ext) || !nft_set_elem_active(*ext, genmask))) goto next_match; /* Last field: we're just returning the key without * filling the initial bitmap for the next field, so the * current inactive bitmap is clean and can be reused as * *next* bitmap (not initial) for the next packet. */ scratch->map_index = map_index; local_bh_enable(); return true; } /* Swap bitmap indices: res_map is the initial bitmap for the * next field, and fill_map is guaranteed to be all-zeroes at * this point. */ map_index = !map_index; swap(res_map, fill_map); rp += NFT_PIPAPO_GROUPS_PADDING(f); } out: local_bh_enable(); return false; } /** * pipapo_get() - Get matching element reference given key data * @net: Network namespace * @set: nftables API set representation * @data: Key data to be matched against existing elements * @genmask: If set, check that element is active in given genmask * @tstamp: timestamp to check for expired elements * @gfp: the type of memory to allocate (see kmalloc). * * This is essentially the same as the lookup function, except that it matches * key data against the uncommitted copy and doesn't use preallocated maps for * bitmap results. * * Return: pointer to &struct nft_pipapo_elem on match, error pointer otherwise. */ static struct nft_pipapo_elem *pipapo_get(const struct net *net, const struct nft_set *set, const u8 *data, u8 genmask, u64 tstamp, gfp_t gfp) { struct nft_pipapo_elem *ret = ERR_PTR(-ENOENT); struct nft_pipapo *priv = nft_set_priv(set); unsigned long *res_map, *fill_map = NULL; const struct nft_pipapo_match *m; const struct nft_pipapo_field *f; int i; m = priv->clone; if (m->bsize_max == 0) return ret; res_map = kmalloc_array(m->bsize_max, sizeof(*res_map), gfp); if (!res_map) { ret = ERR_PTR(-ENOMEM); goto out; } fill_map = kcalloc(m->bsize_max, sizeof(*res_map), gfp); if (!fill_map) { ret = ERR_PTR(-ENOMEM); goto out; } memset(res_map, 0xff, m->bsize_max * sizeof(*res_map)); nft_pipapo_for_each_field(f, i, m) { bool last = i == m->field_count - 1; int b; /* For each bit group: select lookup table bucket depending on * packet bytes value, then AND bucket value */ if (f->bb == 8) pipapo_and_field_buckets_8bit(f, res_map, data); else if (f->bb == 4) pipapo_and_field_buckets_4bit(f, res_map, data); else BUG(); data += f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f); /* Now populate the bitmap for the next field, unless this is * the last field, in which case return the matched 'ext' * pointer if any. * * Now res_map contains the matching bitmap, and fill_map is the * bitmap for the next field. */ next_match: b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt, last); if (b < 0) goto out; if (last) { if (__nft_set_elem_expired(&f->mt[b].e->ext, tstamp)) goto next_match; if ((genmask && !nft_set_elem_active(&f->mt[b].e->ext, genmask))) goto next_match; ret = f->mt[b].e; goto out; } data += NFT_PIPAPO_GROUPS_PADDING(f); /* Swap bitmap indices: fill_map will be the initial bitmap for * the next field (i.e. the new res_map), and res_map is * guaranteed to be all-zeroes at this point, ready to be filled * according to the next mapping table. */ swap(res_map, fill_map); } out: kfree(fill_map); kfree(res_map); return ret; } /** * nft_pipapo_get() - Get matching element reference given key data * @net: Network namespace * @set: nftables API set representation * @elem: nftables API element representation containing key data * @flags: Unused */ static struct nft_elem_priv * nft_pipapo_get(const struct net *net, const struct nft_set *set, const struct nft_set_elem *elem, unsigned int flags) { struct nft_pipapo_elem *e; e = pipapo_get(net, set, (const u8 *)elem->key.val.data, nft_genmask_cur(net), get_jiffies_64(), GFP_ATOMIC); if (IS_ERR(e)) return ERR_CAST(e); return &e->priv; } /** * pipapo_realloc_mt() - Reallocate mapping table if needed upon resize * @f: Field containing mapping table * @old_rules: Amount of existing mapped rules * @rules: Amount of new rules to map * * Return: 0 on success, negative error code on failure. */ static int pipapo_realloc_mt(struct nft_pipapo_field *f, unsigned int old_rules, unsigned int rules) { union nft_pipapo_map_bucket *new_mt = NULL, *old_mt = f->mt; const unsigned int extra = PAGE_SIZE / sizeof(*new_mt); unsigned int rules_alloc = rules; might_sleep(); if (unlikely(rules == 0)) goto out_free; /* growing and enough space left, no action needed */ if (rules > old_rules && f->rules_alloc > rules) return 0; /* downsize and extra slack has not grown too large */ if (rules < old_rules) { unsigned int remove = f->rules_alloc - rules; if (remove < (2u * extra)) return 0; } /* If set needs more than one page of memory for rules then * allocate another extra page to avoid frequent reallocation. */ if (rules > extra && check_add_overflow(rules, extra, &rules_alloc)) return -EOVERFLOW; new_mt = kvmalloc_array(rules_alloc, sizeof(*new_mt), GFP_KERNEL); if (!new_mt) return -ENOMEM; if (old_mt) memcpy(new_mt, old_mt, min(old_rules, rules) * sizeof(*new_mt)); if (rules > old_rules) { memset(new_mt + old_rules, 0, (rules - old_rules) * sizeof(*new_mt)); } out_free: f->rules_alloc = rules_alloc; f->mt = new_mt; kvfree(old_mt); return 0; } /** * pipapo_resize() - Resize lookup or mapping table, or both * @f: Field containing lookup and mapping tables * @old_rules: Previous amount of rules in field * @rules: New amount of rules * * Increase, decrease or maintain tables size depending on new amount of rules, * and copy data over. In case the new size is smaller, throw away data for * highest-numbered rules. * * Return: 0 on success, -ENOMEM on allocation failure. */ static int pipapo_resize(struct nft_pipapo_field *f, unsigned int old_rules, unsigned int rules) { long *new_lt = NULL, *new_p, *old_lt = f->lt, *old_p; unsigned int new_bucket_size, copy; int group, bucket, err; if (rules >= NFT_PIPAPO_RULE0_MAX) return -ENOSPC; new_bucket_size = DIV_ROUND_UP(rules, BITS_PER_LONG); #ifdef NFT_PIPAPO_ALIGN new_bucket_size = roundup(new_bucket_size, NFT_PIPAPO_ALIGN / sizeof(*new_lt)); #endif if (new_bucket_size == f->bsize) goto mt; if (new_bucket_size > f->bsize) copy = f->bsize; else copy = new_bucket_size; new_lt = kvzalloc(f->groups * NFT_PIPAPO_BUCKETS(f->bb) * new_bucket_size * sizeof(*new_lt) + NFT_PIPAPO_ALIGN_HEADROOM, GFP_KERNEL); if (!new_lt) return -ENOMEM; new_p = NFT_PIPAPO_LT_ALIGN(new_lt); old_p = NFT_PIPAPO_LT_ALIGN(old_lt); for (group = 0; group < f->groups; group++) { for (bucket = 0; bucket < NFT_PIPAPO_BUCKETS(f->bb); bucket++) { memcpy(new_p, old_p, copy * sizeof(*new_p)); new_p += copy; old_p += copy; if (new_bucket_size > f->bsize) new_p += new_bucket_size - f->bsize; else old_p += f->bsize - new_bucket_size; } } mt: err = pipapo_realloc_mt(f, old_rules, rules); if (err) { kvfree(new_lt); return err; } if (new_lt) { f->bsize = new_bucket_size; f->lt = new_lt; kvfree(old_lt); } return 0; } /** * pipapo_bucket_set() - Set rule bit in bucket given group and group value * @f: Field containing lookup table * @rule: Rule index * @group: Group index * @v: Value of bit group */ static void pipapo_bucket_set(struct nft_pipapo_field *f, int rule, int group, int v) { unsigned long *pos; pos = NFT_PIPAPO_LT_ALIGN(f->lt); pos += f->bsize * NFT_PIPAPO_BUCKETS(f->bb) * group; pos += f->bsize * v; __set_bit(rule, pos); } /** * pipapo_lt_4b_to_8b() - Switch lookup table group width from 4 bits to 8 bits * @old_groups: Number of current groups * @bsize: Size of one bucket, in longs * @old_lt: Pointer to the current lookup table * @new_lt: Pointer to the new, pre-allocated lookup table * * Each bucket with index b in the new lookup table, belonging to group g, is * filled with the bit intersection between: * - bucket with index given by the upper 4 bits of b, from group g, and * - bucket with index given by the lower 4 bits of b, from group g + 1 * * That is, given buckets from the new lookup table N(x, y) and the old lookup * table O(x, y), with x bucket index, and y group index: * * N(b, g) := O(b / 16, g) & O(b % 16, g + 1) * * This ensures equivalence of the matching results on lookup. Two examples in * pictures: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ... 254 255 * 0 ^ * 1 | ^ * ... ( & ) | * / \ | * / \ .-( & )-. * / bucket \ | | * group 0 / 1 2 3 \ 4 5 6 7 8 9 10 11 12 13 |14 15 | * 0 / \ | | * 1 \ | | * 2 | --' * 3 '- * ... */ static void pipapo_lt_4b_to_8b(int old_groups, int bsize, unsigned long *old_lt, unsigned long *new_lt) { int g, b, i; for (g = 0; g < old_groups / 2; g++) { int src_g0 = g * 2, src_g1 = g * 2 + 1; for (b = 0; b < NFT_PIPAPO_BUCKETS(8); b++) { int src_b0 = b / NFT_PIPAPO_BUCKETS(4); int src_b1 = b % NFT_PIPAPO_BUCKETS(4); int src_i0 = src_g0 * NFT_PIPAPO_BUCKETS(4) + src_b0; int src_i1 = src_g1 * NFT_PIPAPO_BUCKETS(4) + src_b1; for (i = 0; i < bsize; i++) { *new_lt = old_lt[src_i0 * bsize + i] & old_lt[src_i1 * bsize + i]; new_lt++; } } } } /** * pipapo_lt_8b_to_4b() - Switch lookup table group width from 8 bits to 4 bits * @old_groups: Number of current groups * @bsize: Size of one bucket, in longs * @old_lt: Pointer to the current lookup table * @new_lt: Pointer to the new, pre-allocated lookup table * * Each bucket with index b in the new lookup table, belonging to group g, is * filled with the bit union of: * - all the buckets with index such that the upper four bits of the lower byte * equal b, from group g, with g odd * - all the buckets with index such that the lower four bits equal b, from * group g, with g even * * That is, given buckets from the new lookup table N(x, y) and the old lookup * table O(x, y), with x bucket index, and y group index: * * - with g odd: N(b, g) := U(O(x, g) for each x : x = (b & 0xf0) >> 4) * - with g even: N(b, g) := U(O(x, g) for each x : x = b & 0x0f) * * where U() denotes the arbitrary union operation (binary OR of n terms). This * ensures equivalence of the matching results on lookup. */ static void pipapo_lt_8b_to_4b(int old_groups, int bsize, unsigned long *old_lt, unsigned long *new_lt) { int g, b, bsrc, i; memset(new_lt, 0, old_groups * 2 * NFT_PIPAPO_BUCKETS(4) * bsize * sizeof(unsigned long)); for (g = 0; g < old_groups * 2; g += 2) { int src_g = g / 2; for (b = 0; b < NFT_PIPAPO_BUCKETS(4); b++) { for (bsrc = NFT_PIPAPO_BUCKETS(8) * src_g; bsrc < NFT_PIPAPO_BUCKETS(8) * (src_g + 1); bsrc++) { if (((bsrc & 0xf0) >> 4) != b) continue; for (i = 0; i < bsize; i++) new_lt[i] |= old_lt[bsrc * bsize + i]; } new_lt += bsize; } for (b = 0; b < NFT_PIPAPO_BUCKETS(4); b++) { for (bsrc = NFT_PIPAPO_BUCKETS(8) * src_g; bsrc < NFT_PIPAPO_BUCKETS(8) * (src_g + 1); bsrc++) { if ((bsrc & 0x0f) != b) continue; for (i = 0; i < bsize; i++) new_lt[i] |= old_lt[bsrc * bsize + i]; } new_lt += bsize; } } } /** * pipapo_lt_bits_adjust() - Adjust group size for lookup table if needed * @f: Field containing lookup table */ static void pipapo_lt_bits_adjust(struct nft_pipapo_field *f) { unsigned int groups, bb; unsigned long *new_lt; size_t lt_size; lt_size = f->groups * NFT_PIPAPO_BUCKETS(f->bb) * f->bsize * sizeof(*f->lt); if (f->bb == NFT_PIPAPO_GROUP_BITS_SMALL_SET && lt_size > NFT_PIPAPO_LT_SIZE_HIGH) { groups = f->groups * 2; bb = NFT_PIPAPO_GROUP_BITS_LARGE_SET; lt_size = groups * NFT_PIPAPO_BUCKETS(bb) * f->bsize * sizeof(*f->lt); } else if (f->bb == NFT_PIPAPO_GROUP_BITS_LARGE_SET && lt_size < NFT_PIPAPO_LT_SIZE_LOW) { groups = f->groups / 2; bb = NFT_PIPAPO_GROUP_BITS_SMALL_SET; lt_size = groups * NFT_PIPAPO_BUCKETS(bb) * f->bsize * sizeof(*f->lt); /* Don't increase group width if the resulting lookup table size * would exceed the upper size threshold for a "small" set. */ if (lt_size > NFT_PIPAPO_LT_SIZE_HIGH) return; } else { return; } new_lt = kvzalloc(lt_size + NFT_PIPAPO_ALIGN_HEADROOM, GFP_KERNEL); if (!new_lt) return; NFT_PIPAPO_GROUP_BITS_ARE_8_OR_4; if (f->bb == 4 && bb == 8) { pipapo_lt_4b_to_8b(f->groups, f->bsize, NFT_PIPAPO_LT_ALIGN(f->lt), NFT_PIPAPO_LT_ALIGN(new_lt)); } else if (f->bb == 8 && bb == 4) { pipapo_lt_8b_to_4b(f->groups, f->bsize, NFT_PIPAPO_LT_ALIGN(f->lt), NFT_PIPAPO_LT_ALIGN(new_lt)); } else { BUG(); } f->groups = groups; f->bb = bb; kvfree(f->lt); f->lt = new_lt; } /** * pipapo_insert() - Insert new rule in field given input key and mask length * @f: Field containing lookup table * @k: Input key for classification, without nftables padding * @mask_bits: Length of mask; matches field length for non-ranged entry * * Insert a new rule reference in lookup buckets corresponding to k and * mask_bits. * * Return: 1 on success (one rule inserted), negative error code on failure. */ static int pipapo_insert(struct nft_pipapo_field *f, const uint8_t *k, int mask_bits) { unsigned int rule = f->rules, group, ret, bit_offset = 0; ret = pipapo_resize(f, f->rules, f->rules + 1); if (ret) return ret; f->rules++; for (group = 0; group < f->groups; group++) { int i, v; u8 mask; v = k[group / (BITS_PER_BYTE / f->bb)]; v &= GENMASK(BITS_PER_BYTE - bit_offset - 1, 0); v >>= (BITS_PER_BYTE - bit_offset) - f->bb; bit_offset += f->bb; bit_offset %= BITS_PER_BYTE; if (mask_bits >= (group + 1) * f->bb) { /* Not masked */ pipapo_bucket_set(f, rule, group, v); } else if (mask_bits <= group * f->bb) { /* Completely masked */ for (i = 0; i < NFT_PIPAPO_BUCKETS(f->bb); i++) pipapo_bucket_set(f, rule, group, i); } else { /* The mask limit falls on this group */ mask = GENMASK(f->bb - 1, 0); mask >>= mask_bits - group * f->bb; for (i = 0; i < NFT_PIPAPO_BUCKETS(f->bb); i++) { if ((i & ~mask) == (v & ~mask)) pipapo_bucket_set(f, rule, group, i); } } } pipapo_lt_bits_adjust(f); return 1; } /** * pipapo_step_diff() - Check if setting @step bit in netmask would change it * @base: Mask we are expanding * @step: Step bit for given expansion step * @len: Total length of mask space (set and unset bits), bytes * * Convenience function for mask expansion. * * Return: true if step bit changes mask (i.e. isn't set), false otherwise. */ static bool pipapo_step_diff(u8 *base, int step, int len) { /* Network order, byte-addressed */ #ifdef __BIG_ENDIAN__ return !(BIT(step % BITS_PER_BYTE) & base[step / BITS_PER_BYTE]); #else return !(BIT(step % BITS_PER_BYTE) & base[len - 1 - step / BITS_PER_BYTE]); #endif } /** * pipapo_step_after_end() - Check if mask exceeds range end with given step * @base: Mask we are expanding * @end: End of range * @step: Step bit for given expansion step, highest bit to be set * @len: Total length of mask space (set and unset bits), bytes * * Convenience function for mask expansion. * * Return: true if mask exceeds range setting step bits, false otherwise. */ static bool pipapo_step_after_end(const u8 *base, const u8 *end, int step, int len) { u8 tmp[NFT_PIPAPO_MAX_BYTES]; int i; memcpy(tmp, base, len); /* Network order, byte-addressed */ for (i = 0; i <= step; i++) #ifdef __BIG_ENDIAN__ tmp[i / BITS_PER_BYTE] |= BIT(i % BITS_PER_BYTE); #else tmp[len - 1 - i / BITS_PER_BYTE] |= BIT(i % BITS_PER_BYTE); #endif return memcmp(tmp, end, len) > 0; } /** * pipapo_base_sum() - Sum step bit to given len-sized netmask base with carry * @base: Netmask base * @step: Step bit to sum * @len: Netmask length, bytes */ static void pipapo_base_sum(u8 *base, int step, int len) { bool carry = false; int i; /* Network order, byte-addressed */ #ifdef __BIG_ENDIAN__ for (i = step / BITS_PER_BYTE; i < len; i++) { #else for (i = len - 1 - step / BITS_PER_BYTE; i >= 0; i--) { #endif if (carry) base[i]++; else base[i] += 1 << (step % BITS_PER_BYTE); if (base[i]) break; carry = true; } } /** * pipapo_expand() - Expand to composing netmasks, insert into lookup table * @f: Field containing lookup table * @start: Start of range * @end: End of range * @len: Length of value in bits * * Expand range to composing netmasks and insert corresponding rule references * in lookup buckets. * * Return: number of inserted rules on success, negative error code on failure. */ static int pipapo_expand(struct nft_pipapo_field *f, const u8 *start, const u8 *end, int len) { int step, masks = 0, bytes = DIV_ROUND_UP(len, BITS_PER_BYTE); u8 base[NFT_PIPAPO_MAX_BYTES]; memcpy(base, start, bytes); while (memcmp(base, end, bytes) <= 0) { int err; step = 0; while (pipapo_step_diff(base, step, bytes)) { if (pipapo_step_after_end(base, end, step, bytes)) break; step++; if (step >= len) { if (!masks) { err = pipapo_insert(f, base, 0); if (err < 0) return err; masks = 1; } goto out; } } err = pipapo_insert(f, base, len - step); if (err < 0) return err; masks++; pipapo_base_sum(base, step, bytes); } out: return masks; } /** * pipapo_map() - Insert rules in mapping tables, mapping them between fields * @m: Matching data, including mapping table * @map: Table of rule maps: array of first rule and amount of rules * in next field a given rule maps to, for each field * @e: For last field, nft_set_ext pointer matching rules map to */ static void pipapo_map(struct nft_pipapo_match *m, union nft_pipapo_map_bucket map[NFT_PIPAPO_MAX_FIELDS], struct nft_pipapo_elem *e) { struct nft_pipapo_field *f; int i, j; for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) { for (j = 0; j < map[i].n; j++) { f->mt[map[i].to + j].to = map[i + 1].to; f->mt[map[i].to + j].n = map[i + 1].n; } } /* Last field: map to ext instead of mapping to next field */ for (j = 0; j < map[i].n; j++) f->mt[map[i].to + j].e = e; } /** * pipapo_free_scratch() - Free per-CPU map at original (not aligned) address * @m: Matching data * @cpu: CPU number */ static void pipapo_free_scratch(const struct nft_pipapo_match *m, unsigned int cpu) { struct nft_pipapo_scratch *s; void *mem; s = *per_cpu_ptr(m->scratch, cpu); if (!s) return; mem = s; mem -= s->align_off; kfree(mem); } /** * pipapo_realloc_scratch() - Reallocate scratch maps for partial match results * @clone: Copy of matching data with pending insertions and deletions * @bsize_max: Maximum bucket size, scratch maps cover two buckets * * Return: 0 on success, -ENOMEM on failure. */ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, unsigned long bsize_max) { int i; for_each_possible_cpu(i) { struct nft_pipapo_scratch *scratch; #ifdef NFT_PIPAPO_ALIGN void *scratch_aligned; u32 align_off; #endif scratch = kzalloc_node(struct_size(scratch, map, bsize_max * 2) + NFT_PIPAPO_ALIGN_HEADROOM, GFP_KERNEL, cpu_to_node(i)); if (!scratch) { /* On failure, there's no need to undo previous * allocations: this means that some scratch maps have * a bigger allocated size now (this is only called on * insertion), but the extra space won't be used by any * CPU as new elements are not inserted and m->bsize_max * is not updated. */ return -ENOMEM; } pipapo_free_scratch(clone, i); #ifdef NFT_PIPAPO_ALIGN /* Align &scratch->map (not the struct itself): the extra * %NFT_PIPAPO_ALIGN_HEADROOM bytes passed to kzalloc_node() * above guarantee we can waste up to those bytes in order * to align the map field regardless of its offset within * the struct. */ BUILD_BUG_ON(offsetof(struct nft_pipapo_scratch, map) > NFT_PIPAPO_ALIGN_HEADROOM); scratch_aligned = NFT_PIPAPO_LT_ALIGN(&scratch->map); scratch_aligned -= offsetof(struct nft_pipapo_scratch, map); align_off = scratch_aligned - (void *)scratch; scratch = scratch_aligned; scratch->align_off = align_off; #endif *per_cpu_ptr(clone->scratch, i) = scratch; } return 0; } /** * nft_pipapo_insert() - Validate and insert ranged elements * @net: Network namespace * @set: nftables API set representation * @elem: nftables API element representation containing key data * @elem_priv: Filled with pointer to &struct nft_set_ext in inserted element * * Return: 0 on success, error pointer on failure. */ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set, const struct nft_set_elem *elem, struct nft_elem_priv **elem_priv) { const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv); union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS]; const u8 *start = (const u8 *)elem->key.val.data, *end; struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *m = priv->clone; u8 genmask = nft_genmask_next(net); struct nft_pipapo_elem *e, *dup; u64 tstamp = nft_net_tstamp(net); struct nft_pipapo_field *f; const u8 *start_p, *end_p; int i, bsize_max, err = 0; if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) end = (const u8 *)nft_set_ext_key_end(ext)->data; else end = start; dup = pipapo_get(net, set, start, genmask, tstamp, GFP_KERNEL); if (!IS_ERR(dup)) { /* Check if we already have the same exact entry */ const struct nft_data *dup_key, *dup_end; dup_key = nft_set_ext_key(&dup->ext); if (nft_set_ext_exists(&dup->ext, NFT_SET_EXT_KEY_END)) dup_end = nft_set_ext_key_end(&dup->ext); else dup_end = dup_key; if (!memcmp(start, dup_key->data, sizeof(*dup_key->data)) && !memcmp(end, dup_end->data, sizeof(*dup_end->data))) { *elem_priv = &dup->priv; return -EEXIST; } return -ENOTEMPTY; } if (PTR_ERR(dup) == -ENOENT) { /* Look for partially overlapping entries */ dup = pipapo_get(net, set, end, nft_genmask_next(net), tstamp, GFP_KERNEL); } if (PTR_ERR(dup) != -ENOENT) { if (IS_ERR(dup)) return PTR_ERR(dup); *elem_priv = &dup->priv; return -ENOTEMPTY; } /* Validate */ start_p = start; end_p = end; /* some helpers return -1, or 0 >= for valid rule pos, * so we cannot support more than INT_MAX rules at this time. */ BUILD_BUG_ON(NFT_PIPAPO_RULE0_MAX > INT_MAX); nft_pipapo_for_each_field(f, i, m) { if (f->rules >= NFT_PIPAPO_RULE0_MAX) return -ENOSPC; if (memcmp(start_p, end_p, f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f)) > 0) return -EINVAL; start_p += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); end_p += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); } /* Insert */ priv->dirty = true; bsize_max = m->bsize_max; nft_pipapo_for_each_field(f, i, m) { int ret; rulemap[i].to = f->rules; ret = memcmp(start, end, f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f)); if (!ret) ret = pipapo_insert(f, start, f->groups * f->bb); else ret = pipapo_expand(f, start, end, f->groups * f->bb); if (ret < 0) return ret; if (f->bsize > bsize_max) bsize_max = f->bsize; rulemap[i].n = ret; start += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); end += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); } if (!*get_cpu_ptr(m->scratch) || bsize_max > m->bsize_max) { put_cpu_ptr(m->scratch); err = pipapo_realloc_scratch(m, bsize_max); if (err) return err; m->bsize_max = bsize_max; } else { put_cpu_ptr(m->scratch); } e = nft_elem_priv_cast(elem->priv); *elem_priv = &e->priv; pipapo_map(m, rulemap, e); return 0; } /** * pipapo_clone() - Clone matching data to create new working copy * @old: Existing matching data * * Return: copy of matching data passed as 'old', error pointer on failure */ static struct nft_pipapo_match *pipapo_clone(struct nft_pipapo_match *old) { struct nft_pipapo_field *dst, *src; struct nft_pipapo_match *new; int i; new = kmalloc(struct_size(new, f, old->field_count), GFP_KERNEL); if (!new) return ERR_PTR(-ENOMEM); new->field_count = old->field_count; new->bsize_max = old->bsize_max; new->scratch = alloc_percpu(*new->scratch); if (!new->scratch) goto out_scratch; for_each_possible_cpu(i) *per_cpu_ptr(new->scratch, i) = NULL; if (pipapo_realloc_scratch(new, old->bsize_max)) goto out_scratch_realloc; rcu_head_init(&new->rcu); src = old->f; dst = new->f; for (i = 0; i < old->field_count; i++) { unsigned long *new_lt; memcpy(dst, src, offsetof(struct nft_pipapo_field, lt)); new_lt = kvzalloc(src->groups * NFT_PIPAPO_BUCKETS(src->bb) * src->bsize * sizeof(*dst->lt) + NFT_PIPAPO_ALIGN_HEADROOM, GFP_KERNEL); if (!new_lt) goto out_lt; dst->lt = new_lt; memcpy(NFT_PIPAPO_LT_ALIGN(new_lt), NFT_PIPAPO_LT_ALIGN(src->lt), src->bsize * sizeof(*dst->lt) * src->groups * NFT_PIPAPO_BUCKETS(src->bb)); if (src->rules > 0) { dst->mt = kvmalloc_array(src->rules_alloc, sizeof(*src->mt), GFP_KERNEL); if (!dst->mt) goto out_mt; memcpy(dst->mt, src->mt, src->rules * sizeof(*src->mt)); } else { dst->mt = NULL; dst->rules_alloc = 0; } src++; dst++; } return new; out_mt: kvfree(dst->lt); out_lt: for (dst--; i > 0; i--) { kvfree(dst->mt); kvfree(dst->lt); dst--; } out_scratch_realloc: for_each_possible_cpu(i) pipapo_free_scratch(new, i); out_scratch: free_percpu(new->scratch); kfree(new); return ERR_PTR(-ENOMEM); } /** * pipapo_rules_same_key() - Get number of rules originated from the same entry * @f: Field containing mapping table * @first: Index of first rule in set of rules mapping to same entry * * Using the fact that all rules in a field that originated from the same entry * will map to the same set of rules in the next field, or to the same element * reference, return the cardinality of the set of rules that originated from * the same entry as the rule with index @first, @first rule included. * * In pictures: * rules * field #0 0 1 2 3 4 * map to: 0 1 2-4 2-4 5-9 * . . ....... . ... * | | | | \ \ * | | | | \ \ * | | | | \ \ * ' ' ' ' ' \ * in field #1 0 1 2 3 4 5 ... * * if this is called for rule 2 on field #0, it will return 3, as also rules 2 * and 3 in field 0 map to the same set of rules (2, 3, 4) in the next field. * * For the last field in a set, we can rely on associated entries to map to the * same element references. * * Return: Number of rules that originated from the same entry as @first. */ static unsigned int pipapo_rules_same_key(struct nft_pipapo_field *f, unsigned int first) { struct nft_pipapo_elem *e = NULL; /* Keep gcc happy */ unsigned int r; for (r = first; r < f->rules; r++) { if (r != first && e != f->mt[r].e) return r - first; e = f->mt[r].e; } if (r != first) return r - first; return 0; } /** * pipapo_unmap() - Remove rules from mapping tables, renumber remaining ones * @mt: Mapping array * @rules: Original amount of rules in mapping table * @start: First rule index to be removed * @n: Amount of rules to be removed * @to_offset: First rule index, in next field, this group of rules maps to * @is_last: If this is the last field, delete reference from mapping array * * This is used to unmap rules from the mapping table for a single field, * maintaining consistency and compactness for the existing ones. * * In pictures: let's assume that we want to delete rules 2 and 3 from the * following mapping array: * * rules * 0 1 2 3 4 * map to: 4-10 4-10 11-15 11-15 16-18 * * the result will be: * * rules * 0 1 2 * map to: 4-10 4-10 11-13 * * for fields before the last one. In case this is the mapping table for the * last field in a set, and rules map to pointers to &struct nft_pipapo_elem: * * rules * 0 1 2 3 4 * element pointers: 0x42 0x42 0x33 0x33 0x44 * * the result will be: * * rules * 0 1 2 * element pointers: 0x42 0x42 0x44 */ static void pipapo_unmap(union nft_pipapo_map_bucket *mt, unsigned int rules, unsigned int start, unsigned int n, unsigned int to_offset, bool is_last) { int i; memmove(mt + start, mt + start + n, (rules - start - n) * sizeof(*mt)); memset(mt + rules - n, 0, n * sizeof(*mt)); if (is_last) return; for (i = start; i < rules - n; i++) mt[i].to -= to_offset; } /** * pipapo_drop() - Delete entry from lookup and mapping tables, given rule map * @m: Matching data * @rulemap: Table of rule maps, arrays of first rule and amount of rules * in next field a given entry maps to, for each field * * For each rule in lookup table buckets mapping to this set of rules, drop * all bits set in lookup table mapping. In pictures, assuming we want to drop * rules 0 and 1 from this lookup table: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 0 1,2 * 1 1,2 0 * 2 0 1,2 * 3 0 1,2 * 4 0,1,2 * 5 0 1 2 * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 * * rule 2 becomes rule 0, and the result will be: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 0 * 1 0 * 2 0 * 3 0 * 4 0 * 5 0 * 6 0 * 7 0 0 * * once this is done, call unmap() to drop all the corresponding rule references * from mapping tables. */ static void pipapo_drop(struct nft_pipapo_match *m, union nft_pipapo_map_bucket rulemap[]) { struct nft_pipapo_field *f; int i; nft_pipapo_for_each_field(f, i, m) { int g; for (g = 0; g < f->groups; g++) { unsigned long *pos; int b; pos = NFT_PIPAPO_LT_ALIGN(f->lt) + g * NFT_PIPAPO_BUCKETS(f->bb) * f->bsize; for (b = 0; b < NFT_PIPAPO_BUCKETS(f->bb); b++) { bitmap_cut(pos, pos, rulemap[i].to, rulemap[i].n, f->bsize * BITS_PER_LONG); pos += f->bsize; } } pipapo_unmap(f->mt, f->rules, rulemap[i].to, rulemap[i].n, rulemap[i + 1].n, i == m->field_count - 1); if (pipapo_resize(f, f->rules, f->rules - rulemap[i].n)) { /* We can ignore this, a failure to shrink tables down * doesn't make tables invalid. */ ; } f->rules -= rulemap[i].n; pipapo_lt_bits_adjust(f); } } static void nft_pipapo_gc_deactivate(struct net *net, struct nft_set *set, struct nft_pipapo_elem *e) { nft_setelem_data_deactivate(net, set, &e->priv); } /** * pipapo_gc() - Drop expired entries from set, destroy start and end elements * @set: nftables API set representation * @m: Matching data */ static void pipapo_gc(struct nft_set *set, struct nft_pipapo_match *m) { struct nft_pipapo *priv = nft_set_priv(set); struct net *net = read_pnet(&set->net); unsigned int rules_f0, first_rule = 0; u64 tstamp = nft_net_tstamp(net); struct nft_pipapo_elem *e; struct nft_trans_gc *gc; gc = nft_trans_gc_alloc(set, 0, GFP_KERNEL); if (!gc) return; while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) { union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS]; const struct nft_pipapo_field *f; unsigned int i, start, rules_fx; start = first_rule; rules_fx = rules_f0; nft_pipapo_for_each_field(f, i, m) { rulemap[i].to = start; rulemap[i].n = rules_fx; if (i < m->field_count - 1) { rules_fx = f->mt[start].n; start = f->mt[start].to; } } /* Pick the last field, and its last index */ f--; i--; e = f->mt[rulemap[i].to].e; /* synchronous gc never fails, there is no need to set on * NFT_SET_ELEM_DEAD_BIT. */ if (__nft_set_elem_expired(&e->ext, tstamp)) { priv->dirty = true; gc = nft_trans_gc_queue_sync(gc, GFP_KERNEL); if (!gc) return; nft_pipapo_gc_deactivate(net, set, e); pipapo_drop(m, rulemap); nft_trans_gc_elem_add(gc, e); /* And check again current first rule, which is now the * first we haven't checked. */ } else { first_rule += rules_f0; } } gc = nft_trans_gc_catchall_sync(gc); if (gc) { nft_trans_gc_queue_sync_done(gc); priv->last_gc = jiffies; } } /** * pipapo_free_fields() - Free per-field tables contained in matching data * @m: Matching data */ static void pipapo_free_fields(struct nft_pipapo_match *m) { struct nft_pipapo_field *f; int i; nft_pipapo_for_each_field(f, i, m) { kvfree(f->lt); kvfree(f->mt); } } static void pipapo_free_match(struct nft_pipapo_match *m) { int i; for_each_possible_cpu(i) pipapo_free_scratch(m, i); free_percpu(m->scratch); pipapo_free_fields(m); kfree(m); } /** * pipapo_reclaim_match - RCU callback to free fields from old matching data * @rcu: RCU head */ static void pipapo_reclaim_match(struct rcu_head *rcu) { struct nft_pipapo_match *m; m = container_of(rcu, struct nft_pipapo_match, rcu); pipapo_free_match(m); } /** * nft_pipapo_commit() - Replace lookup data with current working copy * @set: nftables API set representation * * While at it, check if we should perform garbage collection on the working * copy before committing it for lookup, and don't replace the table if the * working copy doesn't have pending changes. * * We also need to create a new working copy for subsequent insertions and * deletions. */ static void nft_pipapo_commit(struct nft_set *set) { struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *new_clone, *old; if (time_after_eq(jiffies, priv->last_gc + nft_set_gc_interval(set))) pipapo_gc(set, priv->clone); if (!priv->dirty) return; new_clone = pipapo_clone(priv->clone); if (IS_ERR(new_clone)) return; priv->dirty = false; old = rcu_access_pointer(priv->match); rcu_assign_pointer(priv->match, priv->clone); if (old) call_rcu(&old->rcu, pipapo_reclaim_match); priv->clone = new_clone; } static bool nft_pipapo_transaction_mutex_held(const struct nft_set *set) { #ifdef CONFIG_PROVE_LOCKING const struct net *net = read_pnet(&set->net); return lockdep_is_held(&nft_pernet(net)->commit_mutex); #else return true; #endif } static void nft_pipapo_abort(const struct nft_set *set) { struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *new_clone, *m; if (!priv->dirty) return; m = rcu_dereference_protected(priv->match, nft_pipapo_transaction_mutex_held(set)); new_clone = pipapo_clone(m); if (IS_ERR(new_clone)) return; priv->dirty = false; pipapo_free_match(priv->clone); priv->clone = new_clone; } /** * nft_pipapo_activate() - Mark element reference as active given key, commit * @net: Network namespace * @set: nftables API set representation * @elem_priv: nftables API element representation containing key data * * On insertion, elements are added to a copy of the matching data currently * in use for lookups, and not directly inserted into current lookup data. Both * nft_pipapo_insert() and nft_pipapo_activate() are called once for each * element, hence we can't purpose either one as a real commit operation. */ static void nft_pipapo_activate(const struct net *net, const struct nft_set *set, struct nft_elem_priv *elem_priv) { struct nft_pipapo_elem *e = nft_elem_priv_cast(elem_priv); nft_set_elem_change_active(net, set, &e->ext); } /** * pipapo_deactivate() - Check that element is in set, mark as inactive * @net: Network namespace * @set: nftables API set representation * @data: Input key data * @ext: nftables API extension pointer, used to check for end element * * This is a convenience function that can be called from both * nft_pipapo_deactivate() and nft_pipapo_flush(), as they are in fact the same * operation. * * Return: deactivated element if found, NULL otherwise. */ static void *pipapo_deactivate(const struct net *net, const struct nft_set *set, const u8 *data, const struct nft_set_ext *ext) { struct nft_pipapo_elem *e; e = pipapo_get(net, set, data, nft_genmask_next(net), nft_net_tstamp(net), GFP_KERNEL); if (IS_ERR(e)) return NULL; nft_set_elem_change_active(net, set, &e->ext); return e; } /** * nft_pipapo_deactivate() - Call pipapo_deactivate() to make element inactive * @net: Network namespace * @set: nftables API set representation * @elem: nftables API element representation containing key data * * Return: deactivated element if found, NULL otherwise. */ static struct nft_elem_priv * nft_pipapo_deactivate(const struct net *net, const struct nft_set *set, const struct nft_set_elem *elem) { const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv); return pipapo_deactivate(net, set, (const u8 *)elem->key.val.data, ext); } /** * nft_pipapo_flush() - Call pipapo_deactivate() to make element inactive * @net: Network namespace * @set: nftables API set representation * @elem_priv: nftables API element representation containing key data * * This is functionally the same as nft_pipapo_deactivate(), with a slightly * different interface, and it's also called once for each element in a set * being flushed, so we can't implement, strictly speaking, a flush operation, * which would otherwise be as simple as allocating an empty copy of the * matching data. * * Note that we could in theory do that, mark the set as flushed, and ignore * subsequent calls, but we would leak all the elements after the first one, * because they wouldn't then be freed as result of API calls. * * Return: true if element was found and deactivated. */ static void nft_pipapo_flush(const struct net *net, const struct nft_set *set, struct nft_elem_priv *elem_priv) { struct nft_pipapo_elem *e = nft_elem_priv_cast(elem_priv); nft_set_elem_change_active(net, set, &e->ext); } /** * pipapo_get_boundaries() - Get byte interval for associated rules * @f: Field including lookup table * @first_rule: First rule (lowest index) * @rule_count: Number of associated rules * @left: Byte expression for left boundary (start of range) * @right: Byte expression for right boundary (end of range) * * Given the first rule and amount of rules that originated from the same entry, * build the original range associated with the entry, and calculate the length * of the originating netmask. * * In pictures: * * bucket * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 * 0 1,2 * 1 1,2 * 2 1,2 * 3 1,2 * 4 1,2 * 5 1 2 * 6 1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 * 7 1,2 1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 * * this is the lookup table corresponding to the IPv4 range * 192.168.1.0-192.168.2.1, which was expanded to the two composing netmasks, * rule #1: 192.168.1.0/24, and rule #2: 192.168.2.0/31. * * This function fills @left and @right with the byte values of the leftmost * and rightmost bucket indices for the lowest and highest rule indices, * respectively. If @first_rule is 1 and @rule_count is 2, we obtain, in * nibbles: * left: < 12, 0, 10, 8, 0, 1, 0, 0 > * right: < 12, 0, 10, 8, 0, 2, 2, 1 > * corresponding to bytes: * left: < 192, 168, 1, 0 > * right: < 192, 168, 2, 1 > * with mask length irrelevant here, unused on return, as the range is already * defined by its start and end points. The mask length is relevant for a single * ranged entry instead: if @first_rule is 1 and @rule_count is 1, we ignore * rule 2 above: @left becomes < 192, 168, 1, 0 >, @right becomes * < 192, 168, 1, 255 >, and the mask length, calculated from the distances * between leftmost and rightmost bucket indices for each group, would be 24. * * Return: mask length, in bits. */ static int pipapo_get_boundaries(struct nft_pipapo_field *f, int first_rule, int rule_count, u8 *left, u8 *right) { int g, mask_len = 0, bit_offset = 0; u8 *l = left, *r = right; for (g = 0; g < f->groups; g++) { int b, x0, x1; x0 = -1; x1 = -1; for (b = 0; b < NFT_PIPAPO_BUCKETS(f->bb); b++) { unsigned long *pos; pos = NFT_PIPAPO_LT_ALIGN(f->lt) + (g * NFT_PIPAPO_BUCKETS(f->bb) + b) * f->bsize; if (test_bit(first_rule, pos) && x0 == -1) x0 = b; if (test_bit(first_rule + rule_count - 1, pos)) x1 = b; } *l |= x0 << (BITS_PER_BYTE - f->bb - bit_offset); *r |= x1 << (BITS_PER_BYTE - f->bb - bit_offset); bit_offset += f->bb; if (bit_offset >= BITS_PER_BYTE) { bit_offset %= BITS_PER_BYTE; l++; r++; } if (x1 - x0 == 0) mask_len += 4; else if (x1 - x0 == 1) mask_len += 3; else if (x1 - x0 == 3) mask_len += 2; else if (x1 - x0 == 7) mask_len += 1; } return mask_len; } /** * pipapo_match_field() - Match rules against byte ranges * @f: Field including the lookup table * @first_rule: First of associated rules originating from same entry * @rule_count: Amount of associated rules * @start: Start of range to be matched * @end: End of range to be matched * * Return: true on match, false otherwise. */ static bool pipapo_match_field(struct nft_pipapo_field *f, int first_rule, int rule_count, const u8 *start, const u8 *end) { u8 right[NFT_PIPAPO_MAX_BYTES] = { 0 }; u8 left[NFT_PIPAPO_MAX_BYTES] = { 0 }; pipapo_get_boundaries(f, first_rule, rule_count, left, right); return !memcmp(start, left, f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f)) && !memcmp(end, right, f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f)); } /** * nft_pipapo_remove() - Remove element given key, commit * @net: Network namespace * @set: nftables API set representation * @elem_priv: nftables API element representation containing key data * * Similarly to nft_pipapo_activate(), this is used as commit operation by the * API, but it's called once per element in the pending transaction, so we can't * implement this as a single commit operation. Closest we can get is to remove * the matched element here, if any, and commit the updated matching data. */ static void nft_pipapo_remove(const struct net *net, const struct nft_set *set, struct nft_elem_priv *elem_priv) { struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *m = priv->clone; unsigned int rules_f0, first_rule = 0; struct nft_pipapo_elem *e; const u8 *data; e = nft_elem_priv_cast(elem_priv); data = (const u8 *)nft_set_ext_key(&e->ext); while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) { union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS]; const u8 *match_start, *match_end; struct nft_pipapo_field *f; int i, start, rules_fx; match_start = data; if (nft_set_ext_exists(&e->ext, NFT_SET_EXT_KEY_END)) match_end = (const u8 *)nft_set_ext_key_end(&e->ext)->data; else match_end = data; start = first_rule; rules_fx = rules_f0; nft_pipapo_for_each_field(f, i, m) { if (!pipapo_match_field(f, start, rules_fx, match_start, match_end)) break; rulemap[i].to = start; rulemap[i].n = rules_fx; rules_fx = f->mt[start].n; start = f->mt[start].to; match_start += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); match_end += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); } if (i == m->field_count) { priv->dirty = true; pipapo_drop(m, rulemap); return; } first_rule += rules_f0; } } /** * nft_pipapo_walk() - Walk over elements * @ctx: nftables API context * @set: nftables API set representation * @iter: Iterator * * As elements are referenced in the mapping array for the last field, directly * scan that array: there's no need to follow rule mappings from the first * field. */ static void nft_pipapo_walk(const struct nft_ctx *ctx, struct nft_set *set, struct nft_set_iter *iter) { struct nft_pipapo *priv = nft_set_priv(set); struct net *net = read_pnet(&set->net); const struct nft_pipapo_match *m; const struct nft_pipapo_field *f; unsigned int i, r; rcu_read_lock(); if (iter->genmask == nft_genmask_cur(net)) m = rcu_dereference(priv->match); else m = priv->clone; if (unlikely(!m)) goto out; for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) ; for (r = 0; r < f->rules; r++) { struct nft_pipapo_elem *e; if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e) continue; if (iter->count < iter->skip) goto cont; e = f->mt[r].e; if (!nft_set_elem_active(&e->ext, iter->genmask)) goto cont; iter->err = iter->fn(ctx, set, iter, &e->priv); if (iter->err < 0) goto out; cont: iter->count++; } out: rcu_read_unlock(); } /** * nft_pipapo_privsize() - Return the size of private data for the set * @nla: netlink attributes, ignored as size doesn't depend on them * @desc: Set description, ignored as size doesn't depend on it * * Return: size of private data for this set implementation, in bytes */ static u64 nft_pipapo_privsize(const struct nlattr * const nla[], const struct nft_set_desc *desc) { return sizeof(struct nft_pipapo); } /** * nft_pipapo_estimate() - Set size, space and lookup complexity * @desc: Set description, element count and field description used * @features: Flags: NFT_SET_INTERVAL needs to be there * @est: Storage for estimation data * * Return: true if set description is compatible, false otherwise */ static bool nft_pipapo_estimate(const struct nft_set_desc *desc, u32 features, struct nft_set_estimate *est) { if (!(features & NFT_SET_INTERVAL) || desc->field_count < NFT_PIPAPO_MIN_FIELDS) return false; est->size = pipapo_estimate_size(desc); if (!est->size) return false; est->lookup = NFT_SET_CLASS_O_LOG_N; est->space = NFT_SET_CLASS_O_N; return true; } /** * nft_pipapo_init() - Initialise data for a set instance * @set: nftables API set representation * @desc: Set description * @nla: netlink attributes * * Validate number and size of fields passed as NFTA_SET_DESC_CONCAT netlink * attributes, initialise internal set parameters, current instance of matching * data and a copy for subsequent insertions. * * Return: 0 on success, negative error code on failure. */ static int nft_pipapo_init(const struct nft_set *set, const struct nft_set_desc *desc, const struct nlattr * const nla[]) { struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *m; struct nft_pipapo_field *f; int err, i, field_count; BUILD_BUG_ON(offsetof(struct nft_pipapo_elem, priv) != 0); field_count = desc->field_count ? : 1; BUILD_BUG_ON(NFT_PIPAPO_MAX_FIELDS > 255); BUILD_BUG_ON(NFT_PIPAPO_MAX_FIELDS != NFT_REG32_COUNT); if (field_count > NFT_PIPAPO_MAX_FIELDS) return -EINVAL; m = kmalloc(struct_size(m, f, field_count), GFP_KERNEL); if (!m) return -ENOMEM; m->field_count = field_count; m->bsize_max = 0; m->scratch = alloc_percpu(struct nft_pipapo_scratch *); if (!m->scratch) { err = -ENOMEM; goto out_scratch; } for_each_possible_cpu(i) *per_cpu_ptr(m->scratch, i) = NULL; rcu_head_init(&m->rcu); nft_pipapo_for_each_field(f, i, m) { unsigned int len = desc->field_len[i] ? : set->klen; /* f->groups is u8 */ BUILD_BUG_ON((NFT_PIPAPO_MAX_BYTES * BITS_PER_BYTE / NFT_PIPAPO_GROUP_BITS_LARGE_SET) >= 256); f->bb = NFT_PIPAPO_GROUP_BITS_INIT; f->groups = len * NFT_PIPAPO_GROUPS_PER_BYTE(f); priv->width += round_up(len, sizeof(u32)); f->bsize = 0; f->rules = 0; f->rules_alloc = 0; f->lt = NULL; f->mt = NULL; } /* Create an initial clone of matching data for next insertion */ priv->clone = pipapo_clone(m); if (IS_ERR(priv->clone)) { err = PTR_ERR(priv->clone); goto out_free; } priv->dirty = false; rcu_assign_pointer(priv->match, m); return 0; out_free: free_percpu(m->scratch); out_scratch: kfree(m); return err; } /** * nft_set_pipapo_match_destroy() - Destroy elements from key mapping array * @ctx: context * @set: nftables API set representation * @m: matching data pointing to key mapping array */ static void nft_set_pipapo_match_destroy(const struct nft_ctx *ctx, const struct nft_set *set, struct nft_pipapo_match *m) { struct nft_pipapo_field *f; unsigned int i, r; for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) ; for (r = 0; r < f->rules; r++) { struct nft_pipapo_elem *e; if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e) continue; e = f->mt[r].e; nf_tables_set_elem_destroy(ctx, set, &e->priv); } } /** * nft_pipapo_destroy() - Free private data for set and all committed elements * @ctx: context * @set: nftables API set representation */ static void nft_pipapo_destroy(const struct nft_ctx *ctx, const struct nft_set *set) { struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *m; int cpu; m = rcu_dereference_protected(priv->match, true); if (m) { rcu_barrier(); nft_set_pipapo_match_destroy(ctx, set, m); for_each_possible_cpu(cpu) pipapo_free_scratch(m, cpu); free_percpu(m->scratch); pipapo_free_fields(m); kfree(m); priv->match = NULL; } if (priv->clone) { m = priv->clone; if (priv->dirty) nft_set_pipapo_match_destroy(ctx, set, m); for_each_possible_cpu(cpu) pipapo_free_scratch(priv->clone, cpu); free_percpu(priv->clone->scratch); pipapo_free_fields(priv->clone); kfree(priv->clone); priv->clone = NULL; } } /** * nft_pipapo_gc_init() - Initialise garbage collection * @set: nftables API set representation * * Instead of actually setting up a periodic work for garbage collection, as * this operation requires a swap of matching data with the working copy, we'll * do that opportunistically with other commit operations if the interval is * elapsed, so we just need to set the current jiffies timestamp here. */ static void nft_pipapo_gc_init(const struct nft_set *set) { struct nft_pipapo *priv = nft_set_priv(set); priv->last_gc = jiffies; } const struct nft_set_type nft_set_pipapo_type = { .features = NFT_SET_INTERVAL | NFT_SET_MAP | NFT_SET_OBJECT | NFT_SET_TIMEOUT, .ops = { .lookup = nft_pipapo_lookup, .insert = nft_pipapo_insert, .activate = nft_pipapo_activate, .deactivate = nft_pipapo_deactivate, .flush = nft_pipapo_flush, .remove = nft_pipapo_remove, .walk = nft_pipapo_walk, .get = nft_pipapo_get, .privsize = nft_pipapo_privsize, .estimate = nft_pipapo_estimate, .init = nft_pipapo_init, .destroy = nft_pipapo_destroy, .gc_init = nft_pipapo_gc_init, .commit = nft_pipapo_commit, .abort = nft_pipapo_abort, .elemsize = offsetof(struct nft_pipapo_elem, ext), }, }; #if defined(CONFIG_X86_64) && !defined(CONFIG_UML) const struct nft_set_type nft_set_pipapo_avx2_type = { .features = NFT_SET_INTERVAL | NFT_SET_MAP | NFT_SET_OBJECT | NFT_SET_TIMEOUT, .ops = { .lookup = nft_pipapo_avx2_lookup, .insert = nft_pipapo_insert, .activate = nft_pipapo_activate, .deactivate = nft_pipapo_deactivate, .flush = nft_pipapo_flush, .remove = nft_pipapo_remove, .walk = nft_pipapo_walk, .get = nft_pipapo_get, .privsize = nft_pipapo_privsize, .estimate = nft_pipapo_avx2_estimate, .init = nft_pipapo_init, .destroy = nft_pipapo_destroy, .gc_init = nft_pipapo_gc_init, .commit = nft_pipapo_commit, .abort = nft_pipapo_abort, .elemsize = offsetof(struct nft_pipapo_elem, ext), }, }; #endif |
35 2 2 13 15 12 9 9 6 1 2 3 5 4 2 5 3 3 3 3 2 1 3 3 3 3 3 3 2 3 3 3 5 5 5 1 5 5 4 3 4 9 13 13 13 11 11 10 15 15 5 5 5 1 1 1 1 1 1 1 1 1 1 2 2 3 3 1 1 1 1 1 1 1 1 1 2 1 1 1 1 20 18 2 1 1 17 1 1 2 1 1 13 17 1 3 3 5 5 2 3 2 3 3 2 2 2 11 3 8 11 10 2 2 2 2 5 4 4 4 2 2 2 2 1 6 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 | // SPDX-License-Identifier: GPL-2.0+ /* * Driver core for serial ports * * Based on drivers/char/serial.c, by Linus Torvalds, Theodore Ts'o. * * Copyright 1999 ARM Limited * Copyright (C) 2000-2001 Deep Blue Solutions Ltd. */ #include <linux/module.h> #include <linux/tty.h> #include <linux/tty_flip.h> #include <linux/slab.h> #include <linux/sched/signal.h> #include <linux/init.h> #include <linux/console.h> #include <linux/gpio/consumer.h> #include <linux/kernel.h> #include <linux/of.h> #include <linux/pm_runtime.h> #include <linux/proc_fs.h> #include <linux/seq_file.h> #include <linux/device.h> #include <linux/serial.h> /* for serial_state and serial_icounter_struct */ #include <linux/serial_core.h> #include <linux/sysrq.h> #include <linux/delay.h> #include <linux/mutex.h> #include <linux/math64.h> #include <linux/security.h> #include <linux/irq.h> #include <linux/uaccess.h> #include "serial_base.h" /* * This is used to lock changes in serial line configuration. */ static DEFINE_MUTEX(port_mutex); /* * lockdep: port->lock is initialized in two places, but we * want only one lock-class: */ static struct lock_class_key port_lock_key; #define HIGH_BITS_OFFSET ((sizeof(long)-sizeof(int))*8) /* * Max time with active RTS before/after data is sent. */ #define RS485_MAX_RTS_DELAY 100 /* msecs */ static void uart_change_pm(struct uart_state *state, enum uart_pm_state pm_state); static void uart_port_shutdown(struct tty_port *port); static int uart_dcd_enabled(struct uart_port *uport) { return !!(uport->status & UPSTAT_DCD_ENABLE); } static inline struct uart_port *uart_port_ref(struct uart_state *state) { if (atomic_add_unless(&state->refcount, 1, 0)) return state->uart_port; return NULL; } static inline void uart_port_deref(struct uart_port *uport) { if (atomic_dec_and_test(&uport->state->refcount)) wake_up(&uport->state->remove_wait); } #define uart_port_lock(state, flags) \ ({ \ struct uart_port *__uport = uart_port_ref(state); \ if (__uport) \ uart_port_lock_irqsave(__uport, &flags); \ __uport; \ }) #define uart_port_unlock(uport, flags) \ ({ \ struct uart_port *__uport = uport; \ if (__uport) { \ uart_port_unlock_irqrestore(__uport, flags); \ uart_port_deref(__uport); \ } \ }) static inline struct uart_port *uart_port_check(struct uart_state *state) { lockdep_assert_held(&state->port.mutex); return state->uart_port; } /** * uart_write_wakeup - schedule write processing * @port: port to be processed * * This routine is used by the interrupt handler to schedule processing in the * software interrupt portion of the driver. A driver is expected to call this * function when the number of characters in the transmit buffer have dropped * below a threshold. * * Locking: @port->lock should be held */ void uart_write_wakeup(struct uart_port *port) { struct uart_state *state = port->state; /* * This means you called this function _after_ the port was * closed. No cookie for you. */ BUG_ON(!state); tty_port_tty_wakeup(&state->port); } EXPORT_SYMBOL(uart_write_wakeup); static void uart_stop(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *port; unsigned long flags; port = uart_port_lock(state, flags); if (port) port->ops->stop_tx(port); uart_port_unlock(port, flags); } static void __uart_start(struct uart_state *state) { struct uart_port *port = state->uart_port; struct serial_port_device *port_dev; int err; if (!port || port->flags & UPF_DEAD || uart_tx_stopped(port)) return; port_dev = port->port_dev; /* Increment the runtime PM usage count for the active check below */ err = pm_runtime_get(&port_dev->dev); if (err < 0 && err != -EINPROGRESS) { pm_runtime_put_noidle(&port_dev->dev); return; } /* * Start TX if enabled, and kick runtime PM. If the device is not * enabled, serial_port_runtime_resume() calls start_tx() again * after enabling the device. */ if (pm_runtime_active(&port_dev->dev)) port->ops->start_tx(port); pm_runtime_mark_last_busy(&port_dev->dev); pm_runtime_put_autosuspend(&port_dev->dev); } static void uart_start(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *port; unsigned long flags; port = uart_port_lock(state, flags); __uart_start(state); uart_port_unlock(port, flags); } static void uart_update_mctrl(struct uart_port *port, unsigned int set, unsigned int clear) { unsigned long flags; unsigned int old; uart_port_lock_irqsave(port, &flags); old = port->mctrl; port->mctrl = (old & ~clear) | set; if (old != port->mctrl && !(port->rs485.flags & SER_RS485_ENABLED)) port->ops->set_mctrl(port, port->mctrl); uart_port_unlock_irqrestore(port, flags); } #define uart_set_mctrl(port, set) uart_update_mctrl(port, set, 0) #define uart_clear_mctrl(port, clear) uart_update_mctrl(port, 0, clear) static void uart_port_dtr_rts(struct uart_port *uport, bool active) { if (active) uart_set_mctrl(uport, TIOCM_DTR | TIOCM_RTS); else uart_clear_mctrl(uport, TIOCM_DTR | TIOCM_RTS); } /* Caller holds port mutex */ static void uart_change_line_settings(struct tty_struct *tty, struct uart_state *state, const struct ktermios *old_termios) { struct uart_port *uport = uart_port_check(state); struct ktermios *termios; bool old_hw_stopped; /* * If we have no tty, termios, or the port does not exist, * then we can't set the parameters for this port. */ if (!tty || uport->type == PORT_UNKNOWN) return; termios = &tty->termios; uport->ops->set_termios(uport, termios, old_termios); /* * Set modem status enables based on termios cflag */ uart_port_lock_irq(uport); if (termios->c_cflag & CRTSCTS) uport->status |= UPSTAT_CTS_ENABLE; else uport->status &= ~UPSTAT_CTS_ENABLE; if (termios->c_cflag & CLOCAL) uport->status &= ~UPSTAT_DCD_ENABLE; else uport->status |= UPSTAT_DCD_ENABLE; /* reset sw-assisted CTS flow control based on (possibly) new mode */ old_hw_stopped = uport->hw_stopped; uport->hw_stopped = uart_softcts_mode(uport) && !(uport->ops->get_mctrl(uport) & TIOCM_CTS); if (uport->hw_stopped != old_hw_stopped) { if (!old_hw_stopped) uport->ops->stop_tx(uport); else __uart_start(state); } uart_port_unlock_irq(uport); } /* * Startup the port. This will be called once per open. All calls * will be serialised by the per-port mutex. */ static int uart_port_startup(struct tty_struct *tty, struct uart_state *state, bool init_hw) { struct uart_port *uport = uart_port_check(state); unsigned long flags; unsigned long page; int retval = 0; if (uport->type == PORT_UNKNOWN) return 1; /* * Make sure the device is in D0 state. */ uart_change_pm(state, UART_PM_STATE_ON); /* * Initialise and allocate the transmit and temporary * buffer. */ page = get_zeroed_page(GFP_KERNEL); if (!page) return -ENOMEM; uart_port_lock(state, flags); if (!state->xmit.buf) { state->xmit.buf = (unsigned char *) page; uart_circ_clear(&state->xmit); uart_port_unlock(uport, flags); } else { uart_port_unlock(uport, flags); /* * Do not free() the page under the port lock, see * uart_shutdown(). */ free_page(page); } retval = uport->ops->startup(uport); if (retval == 0) { if (uart_console(uport) && uport->cons->cflag) { tty->termios.c_cflag = uport->cons->cflag; tty->termios.c_ispeed = uport->cons->ispeed; tty->termios.c_ospeed = uport->cons->ospeed; uport->cons->cflag = 0; uport->cons->ispeed = 0; uport->cons->ospeed = 0; } /* * Initialise the hardware port settings. */ uart_change_line_settings(tty, state, NULL); /* * Setup the RTS and DTR signals once the * port is open and ready to respond. */ if (init_hw && C_BAUD(tty)) uart_port_dtr_rts(uport, true); } /* * This is to allow setserial on this port. People may want to set * port/irq/type and then reconfigure the port properly if it failed * now. */ if (retval && capable(CAP_SYS_ADMIN)) return 1; return retval; } static int uart_startup(struct tty_struct *tty, struct uart_state *state, bool init_hw) { struct tty_port *port = &state->port; int retval; if (tty_port_initialized(port)) return 0; retval = uart_port_startup(tty, state, init_hw); if (retval) set_bit(TTY_IO_ERROR, &tty->flags); return retval; } /* * This routine will shutdown a serial port; interrupts are disabled, and * DTR is dropped if the hangup on close termio flag is on. Calls to * uart_shutdown are serialised by the per-port semaphore. * * uport == NULL if uart_port has already been removed */ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state) { struct uart_port *uport = uart_port_check(state); struct tty_port *port = &state->port; unsigned long flags; char *xmit_buf = NULL; /* * Set the TTY IO error marker */ if (tty) set_bit(TTY_IO_ERROR, &tty->flags); if (tty_port_initialized(port)) { tty_port_set_initialized(port, false); /* * Turn off DTR and RTS early. */ if (uport && uart_console(uport) && tty) { uport->cons->cflag = tty->termios.c_cflag; uport->cons->ispeed = tty->termios.c_ispeed; uport->cons->ospeed = tty->termios.c_ospeed; } if (!tty || C_HUPCL(tty)) uart_port_dtr_rts(uport, false); uart_port_shutdown(port); } /* * It's possible for shutdown to be called after suspend if we get * a DCD drop (hangup) at just the right time. Clear suspended bit so * we don't try to resume a port that has been shutdown. */ tty_port_set_suspended(port, false); /* * Do not free() the transmit buffer page under the port lock since * this can create various circular locking scenarios. For instance, * console driver may need to allocate/free a debug object, which * can endup in printk() recursion. */ uart_port_lock(state, flags); xmit_buf = state->xmit.buf; state->xmit.buf = NULL; uart_port_unlock(uport, flags); free_page((unsigned long)xmit_buf); } /** * uart_update_timeout - update per-port frame timing information * @port: uart_port structure describing the port * @cflag: termios cflag value * @baud: speed of the port * * Set the @port frame timing information from which the FIFO timeout value is * derived. The @cflag value should reflect the actual hardware settings as * number of bits, parity, stop bits and baud rate is taken into account here. * * Locking: caller is expected to take @port->lock */ void uart_update_timeout(struct uart_port *port, unsigned int cflag, unsigned int baud) { u64 temp = tty_get_frame_size(cflag); temp *= NSEC_PER_SEC; port->frame_time = (unsigned int)DIV64_U64_ROUND_UP(temp, baud); } EXPORT_SYMBOL(uart_update_timeout); /** * uart_get_baud_rate - return baud rate for a particular port * @port: uart_port structure describing the port in question. * @termios: desired termios settings * @old: old termios (or %NULL) * @min: minimum acceptable baud rate * @max: maximum acceptable baud rate * * Decode the termios structure into a numeric baud rate, taking account of the * magic 38400 baud rate (with spd_* flags), and mapping the %B0 rate to 9600 * baud. * * If the new baud rate is invalid, try the @old termios setting. If it's still * invalid, we try 9600 baud. If that is also invalid 0 is returned. * * The @termios structure is updated to reflect the baud rate we're actually * going to be using. Don't do this for the case where B0 is requested ("hang * up"). * * Locking: caller dependent */ unsigned int uart_get_baud_rate(struct uart_port *port, struct ktermios *termios, const struct ktermios *old, unsigned int min, unsigned int max) { unsigned int try; unsigned int baud; unsigned int altbaud; int hung_up = 0; upf_t flags = port->flags & UPF_SPD_MASK; switch (flags) { case UPF_SPD_HI: altbaud = 57600; break; case UPF_SPD_VHI: altbaud = 115200; break; case UPF_SPD_SHI: altbaud = 230400; break; case UPF_SPD_WARP: altbaud = 460800; break; default: altbaud = 38400; break; } for (try = 0; try < 2; try++) { baud = tty_termios_baud_rate(termios); /* * The spd_hi, spd_vhi, spd_shi, spd_warp kludge... * Die! Die! Die! */ if (try == 0 && baud == 38400) baud = altbaud; /* * Special case: B0 rate. */ if (baud == 0) { hung_up = 1; baud = 9600; } if (baud >= min && baud <= max) return baud; /* * Oops, the quotient was zero. Try again with * the old baud rate if possible. */ termios->c_cflag &= ~CBAUD; if (old) { baud = tty_termios_baud_rate(old); if (!hung_up) tty_termios_encode_baud_rate(termios, baud, baud); old = NULL; continue; } /* * As a last resort, if the range cannot be met then clip to * the nearest chip supported rate. */ if (!hung_up) { if (baud <= min) tty_termios_encode_baud_rate(termios, min + 1, min + 1); else tty_termios_encode_baud_rate(termios, max - 1, max - 1); } } return 0; } EXPORT_SYMBOL(uart_get_baud_rate); /** * uart_get_divisor - return uart clock divisor * @port: uart_port structure describing the port * @baud: desired baud rate * * Calculate the divisor (baud_base / baud) for the specified @baud, * appropriately rounded. * * If 38400 baud and custom divisor is selected, return the custom divisor * instead. * * Locking: caller dependent */ unsigned int uart_get_divisor(struct uart_port *port, unsigned int baud) { unsigned int quot; /* * Old custom speed handling. */ if (baud == 38400 && (port->flags & UPF_SPD_MASK) == UPF_SPD_CUST) quot = port->custom_divisor; else quot = DIV_ROUND_CLOSEST(port->uartclk, 16 * baud); return quot; } EXPORT_SYMBOL(uart_get_divisor); static int uart_put_char(struct tty_struct *tty, u8 c) { struct uart_state *state = tty->driver_data; struct uart_port *port; struct circ_buf *circ; unsigned long flags; int ret = 0; circ = &state->xmit; port = uart_port_lock(state, flags); if (!circ->buf) { uart_port_unlock(port, flags); return 0; } if (port && uart_circ_chars_free(circ) != 0) { circ->buf[circ->head] = c; circ->head = (circ->head + 1) & (UART_XMIT_SIZE - 1); ret = 1; } uart_port_unlock(port, flags); return ret; } static void uart_flush_chars(struct tty_struct *tty) { uart_start(tty); } static ssize_t uart_write(struct tty_struct *tty, const u8 *buf, size_t count) { struct uart_state *state = tty->driver_data; struct uart_port *port; struct circ_buf *circ; unsigned long flags; int c, ret = 0; /* * This means you called this function _after_ the port was * closed. No cookie for you. */ if (WARN_ON(!state)) return -EL3HLT; port = uart_port_lock(state, flags); circ = &state->xmit; if (!circ->buf) { uart_port_unlock(port, flags); return 0; } while (port) { c = CIRC_SPACE_TO_END(circ->head, circ->tail, UART_XMIT_SIZE); if (count < c) c = count; if (c <= 0) break; memcpy(circ->buf + circ->head, buf, c); circ->head = (circ->head + c) & (UART_XMIT_SIZE - 1); buf += c; count -= c; ret += c; } __uart_start(state); uart_port_unlock(port, flags); return ret; } static unsigned int uart_write_room(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *port; unsigned long flags; unsigned int ret; port = uart_port_lock(state, flags); ret = uart_circ_chars_free(&state->xmit); uart_port_unlock(port, flags); return ret; } static unsigned int uart_chars_in_buffer(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *port; unsigned long flags; unsigned int ret; port = uart_port_lock(state, flags); ret = uart_circ_chars_pending(&state->xmit); uart_port_unlock(port, flags); return ret; } static void uart_flush_buffer(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *port; unsigned long flags; /* * This means you called this function _after_ the port was * closed. No cookie for you. */ if (WARN_ON(!state)) return; pr_debug("uart_flush_buffer(%d) called\n", tty->index); port = uart_port_lock(state, flags); if (!port) return; uart_circ_clear(&state->xmit); if (port->ops->flush_buffer) port->ops->flush_buffer(port); uart_port_unlock(port, flags); tty_port_tty_wakeup(&state->port); } /* * This function performs low-level write of high-priority XON/XOFF * character and accounting for it. * * Requires uart_port to implement .serial_out(). */ void uart_xchar_out(struct uart_port *uport, int offset) { serial_port_out(uport, offset, uport->x_char); uport->icount.tx++; uport->x_char = 0; } EXPORT_SYMBOL_GPL(uart_xchar_out); /* * This function is used to send a high-priority XON/XOFF character to * the device */ static void uart_send_xchar(struct tty_struct *tty, u8 ch) { struct uart_state *state = tty->driver_data; struct uart_port *port; unsigned long flags; port = uart_port_ref(state); if (!port) return; if (port->ops->send_xchar) port->ops->send_xchar(port, ch); else { uart_port_lock_irqsave(port, &flags); port->x_char = ch; if (ch) port->ops->start_tx(port); uart_port_unlock_irqrestore(port, flags); } uart_port_deref(port); } static void uart_throttle(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; upstat_t mask = UPSTAT_SYNC_FIFO; struct uart_port *port; port = uart_port_ref(state); if (!port) return; if (I_IXOFF(tty)) mask |= UPSTAT_AUTOXOFF; if (C_CRTSCTS(tty)) mask |= UPSTAT_AUTORTS; if (port->status & mask) { port->ops->throttle(port); mask &= ~port->status; } if (mask & UPSTAT_AUTORTS) uart_clear_mctrl(port, TIOCM_RTS); if (mask & UPSTAT_AUTOXOFF) uart_send_xchar(tty, STOP_CHAR(tty)); uart_port_deref(port); } static void uart_unthrottle(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; upstat_t mask = UPSTAT_SYNC_FIFO; struct uart_port *port; port = uart_port_ref(state); if (!port) return; if (I_IXOFF(tty)) mask |= UPSTAT_AUTOXOFF; if (C_CRTSCTS(tty)) mask |= UPSTAT_AUTORTS; if (port->status & mask) { port->ops->unthrottle(port); mask &= ~port->status; } if (mask & UPSTAT_AUTORTS) uart_set_mctrl(port, TIOCM_RTS); if (mask & UPSTAT_AUTOXOFF) uart_send_xchar(tty, START_CHAR(tty)); uart_port_deref(port); } static int uart_get_info(struct tty_port *port, struct serial_struct *retinfo) { struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport; int ret = -ENODEV; /* Initialize structure in case we error out later to prevent any stack info leakage. */ *retinfo = (struct serial_struct){}; /* * Ensure the state we copy is consistent and no hardware changes * occur as we go */ mutex_lock(&port->mutex); uport = uart_port_check(state); if (!uport) goto out; retinfo->type = uport->type; retinfo->line = uport->line; retinfo->port = uport->iobase; if (HIGH_BITS_OFFSET) retinfo->port_high = (long) uport->iobase >> HIGH_BITS_OFFSET; retinfo->irq = uport->irq; retinfo->flags = (__force int)uport->flags; retinfo->xmit_fifo_size = uport->fifosize; retinfo->baud_base = uport->uartclk / 16; retinfo->close_delay = jiffies_to_msecs(port->close_delay) / 10; retinfo->closing_wait = port->closing_wait == ASYNC_CLOSING_WAIT_NONE ? ASYNC_CLOSING_WAIT_NONE : jiffies_to_msecs(port->closing_wait) / 10; retinfo->custom_divisor = uport->custom_divisor; retinfo->hub6 = uport->hub6; retinfo->io_type = uport->iotype; retinfo->iomem_reg_shift = uport->regshift; retinfo->iomem_base = (void *)(unsigned long)uport->mapbase; ret = 0; out: mutex_unlock(&port->mutex); return ret; } static int uart_get_info_user(struct tty_struct *tty, struct serial_struct *ss) { struct uart_state *state = tty->driver_data; struct tty_port *port = &state->port; return uart_get_info(port, ss) < 0 ? -EIO : 0; } static int uart_set_info(struct tty_struct *tty, struct tty_port *port, struct uart_state *state, struct serial_struct *new_info) { struct uart_port *uport = uart_port_check(state); unsigned long new_port; unsigned int change_irq, change_port, closing_wait; unsigned int old_custom_divisor, close_delay; upf_t old_flags, new_flags; int retval = 0; if (!uport) return -EIO; new_port = new_info->port; if (HIGH_BITS_OFFSET) new_port += (unsigned long) new_info->port_high << HIGH_BITS_OFFSET; new_info->irq = irq_canonicalize(new_info->irq); close_delay = msecs_to_jiffies(new_info->close_delay * 10); closing_wait = new_info->closing_wait == ASYNC_CLOSING_WAIT_NONE ? ASYNC_CLOSING_WAIT_NONE : msecs_to_jiffies(new_info->closing_wait * 10); change_irq = !(uport->flags & UPF_FIXED_PORT) && new_info->irq != uport->irq; /* * Since changing the 'type' of the port changes its resource * allocations, we should treat type changes the same as * IO port changes. */ change_port = !(uport->flags & UPF_FIXED_PORT) && (new_port != uport->iobase || (unsigned long)new_info->iomem_base != uport->mapbase || new_info->hub6 != uport->hub6 || new_info->io_type != uport->iotype || new_info->iomem_reg_shift != uport->regshift || new_info->type != uport->type); old_flags = uport->flags; new_flags = (__force upf_t)new_info->flags; old_custom_divisor = uport->custom_divisor; if (!capable(CAP_SYS_ADMIN)) { retval = -EPERM; if (change_irq || change_port || (new_info->baud_base != uport->uartclk / 16) || (close_delay != port->close_delay) || (closing_wait != port->closing_wait) || (new_info->xmit_fifo_size && new_info->xmit_fifo_size != uport->fifosize) || (((new_flags ^ old_flags) & ~UPF_USR_MASK) != 0)) goto exit; uport->flags = ((uport->flags & ~UPF_USR_MASK) | (new_flags & UPF_USR_MASK)); uport->custom_divisor = new_info->custom_divisor; goto check_and_exit; } if (change_irq || change_port) { retval = security_locked_down(LOCKDOWN_TIOCSSERIAL); if (retval) goto exit; } /* * Ask the low level driver to verify the settings. */ if (uport->ops->verify_port) retval = uport->ops->verify_port(uport, new_info); if ((new_info->irq >= nr_irqs) || (new_info->irq < 0) || (new_info->baud_base < 9600)) retval = -EINVAL; if (retval) goto exit; if (change_port || change_irq) { retval = -EBUSY; /* * Make sure that we are the sole user of this port. */ if (tty_port_users(port) > 1) goto exit; /* * We need to shutdown the serial port at the old * port/type/irq combination. */ uart_shutdown(tty, state); } if (change_port) { unsigned long old_iobase, old_mapbase; unsigned int old_type, old_iotype, old_hub6, old_shift; old_iobase = uport->iobase; old_mapbase = uport->mapbase; old_type = uport->type; old_hub6 = uport->hub6; old_iotype = uport->iotype; old_shift = uport->regshift; /* * Free and release old regions */ if (old_type != PORT_UNKNOWN && uport->ops->release_port) uport->ops->release_port(uport); uport->iobase = new_port; uport->type = new_info->type; uport->hub6 = new_info->hub6; uport->iotype = new_info->io_type; uport->regshift = new_info->iomem_reg_shift; uport->mapbase = (unsigned long)new_info->iomem_base; /* * Claim and map the new regions */ if (uport->type != PORT_UNKNOWN && uport->ops->request_port) { retval = uport->ops->request_port(uport); } else { /* Always success - Jean II */ retval = 0; } /* * If we fail to request resources for the * new port, try to restore the old settings. */ if (retval) { uport->iobase = old_iobase; uport->type = old_type; uport->hub6 = old_hub6; uport->iotype = old_iotype; uport->regshift = old_shift; uport->mapbase = old_mapbase; if (old_type != PORT_UNKNOWN) { retval = uport->ops->request_port(uport); /* * If we failed to restore the old settings, * we fail like this. */ if (retval) uport->type = PORT_UNKNOWN; /* * We failed anyway. */ retval = -EBUSY; } /* Added to return the correct error -Ram Gupta */ goto exit; } } if (change_irq) uport->irq = new_info->irq; if (!(uport->flags & UPF_FIXED_PORT)) uport->uartclk = new_info->baud_base * 16; uport->flags = (uport->flags & ~UPF_CHANGE_MASK) | (new_flags & UPF_CHANGE_MASK); uport->custom_divisor = new_info->custom_divisor; port->close_delay = close_delay; port->closing_wait = closing_wait; if (new_info->xmit_fifo_size) uport->fifosize = new_info->xmit_fifo_size; check_and_exit: retval = 0; if (uport->type == PORT_UNKNOWN) goto exit; if (tty_port_initialized(port)) { if (((old_flags ^ uport->flags) & UPF_SPD_MASK) || old_custom_divisor != uport->custom_divisor) { /* * If they're setting up a custom divisor or speed, * instead of clearing it, then bitch about it. */ if (uport->flags & UPF_SPD_MASK) { dev_notice_ratelimited(uport->dev, "%s sets custom speed on %s. This is deprecated.\n", current->comm, tty_name(port->tty)); } uart_change_line_settings(tty, state, NULL); } } else { retval = uart_startup(tty, state, true); if (retval == 0) tty_port_set_initialized(port, true); if (retval > 0) retval = 0; } exit: return retval; } static int uart_set_info_user(struct tty_struct *tty, struct serial_struct *ss) { struct uart_state *state = tty->driver_data; struct tty_port *port = &state->port; int retval; down_write(&tty->termios_rwsem); /* * This semaphore protects port->count. It is also * very useful to prevent opens. Also, take the * port configuration semaphore to make sure that a * module insertion/removal doesn't change anything * under us. */ mutex_lock(&port->mutex); retval = uart_set_info(tty, port, state, ss); mutex_unlock(&port->mutex); up_write(&tty->termios_rwsem); return retval; } /** * uart_get_lsr_info - get line status register info * @tty: tty associated with the UART * @state: UART being queried * @value: returned modem value */ static int uart_get_lsr_info(struct tty_struct *tty, struct uart_state *state, unsigned int __user *value) { struct uart_port *uport = uart_port_check(state); unsigned int result; result = uport->ops->tx_empty(uport); /* * If we're about to load something into the transmit * register, we'll pretend the transmitter isn't empty to * avoid a race condition (depending on when the transmit * interrupt happens). */ if (uport->x_char || ((uart_circ_chars_pending(&state->xmit) > 0) && !uart_tx_stopped(uport))) result &= ~TIOCSER_TEMT; return put_user(result, value); } static int uart_tiocmget(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct tty_port *port = &state->port; struct uart_port *uport; int result = -EIO; mutex_lock(&port->mutex); uport = uart_port_check(state); if (!uport) goto out; if (!tty_io_error(tty)) { uart_port_lock_irq(uport); result = uport->mctrl; result |= uport->ops->get_mctrl(uport); uart_port_unlock_irq(uport); } out: mutex_unlock(&port->mutex); return result; } static int uart_tiocmset(struct tty_struct *tty, unsigned int set, unsigned int clear) { struct uart_state *state = tty->driver_data; struct tty_port *port = &state->port; struct uart_port *uport; int ret = -EIO; mutex_lock(&port->mutex); uport = uart_port_check(state); if (!uport) goto out; if (!tty_io_error(tty)) { uart_update_mctrl(uport, set, clear); ret = 0; } out: mutex_unlock(&port->mutex); return ret; } static int uart_break_ctl(struct tty_struct *tty, int break_state) { struct uart_state *state = tty->driver_data; struct tty_port *port = &state->port; struct uart_port *uport; int ret = -EIO; mutex_lock(&port->mutex); uport = uart_port_check(state); if (!uport) goto out; if (uport->type != PORT_UNKNOWN && uport->ops->break_ctl) uport->ops->break_ctl(uport, break_state); ret = 0; out: mutex_unlock(&port->mutex); return ret; } static int uart_do_autoconfig(struct tty_struct *tty, struct uart_state *state) { struct tty_port *port = &state->port; struct uart_port *uport; int flags, ret; if (!capable(CAP_SYS_ADMIN)) return -EPERM; /* * Take the per-port semaphore. This prevents count from * changing, and hence any extra opens of the port while * we're auto-configuring. */ if (mutex_lock_interruptible(&port->mutex)) return -ERESTARTSYS; uport = uart_port_check(state); if (!uport) { ret = -EIO; goto out; } ret = -EBUSY; if (tty_port_users(port) == 1) { uart_shutdown(tty, state); /* * If we already have a port type configured, * we must release its resources. */ if (uport->type != PORT_UNKNOWN && uport->ops->release_port) uport->ops->release_port(uport); flags = UART_CONFIG_TYPE; if (uport->flags & UPF_AUTO_IRQ) flags |= UART_CONFIG_IRQ; /* * This will claim the ports resources if * a port is found. */ uport->ops->config_port(uport, flags); ret = uart_startup(tty, state, true); if (ret == 0) tty_port_set_initialized(port, true); if (ret > 0) ret = 0; } out: mutex_unlock(&port->mutex); return ret; } static void uart_enable_ms(struct uart_port *uport) { /* * Force modem status interrupts on */ if (uport->ops->enable_ms) uport->ops->enable_ms(uport); } /* * Wait for any of the 4 modem inputs (DCD,RI,DSR,CTS) to change * - mask passed in arg for lines of interest * (use |'ed TIOCM_RNG/DSR/CD/CTS for masking) * Caller should use TIOCGICOUNT to see which one it was * * FIXME: This wants extracting into a common all driver implementation * of TIOCMWAIT using tty_port. */ static int uart_wait_modem_status(struct uart_state *state, unsigned long arg) { struct uart_port *uport; struct tty_port *port = &state->port; DECLARE_WAITQUEUE(wait, current); struct uart_icount cprev, cnow; int ret; /* * note the counters on entry */ uport = uart_port_ref(state); if (!uport) return -EIO; uart_port_lock_irq(uport); memcpy(&cprev, &uport->icount, sizeof(struct uart_icount)); uart_enable_ms(uport); uart_port_unlock_irq(uport); add_wait_queue(&port->delta_msr_wait, &wait); for (;;) { uart_port_lock_irq(uport); memcpy(&cnow, &uport->icount, sizeof(struct uart_icount)); uart_port_unlock_irq(uport); set_current_state(TASK_INTERRUPTIBLE); if (((arg & TIOCM_RNG) && (cnow.rng != cprev.rng)) || ((arg & TIOCM_DSR) && (cnow.dsr != cprev.dsr)) || ((arg & TIOCM_CD) && (cnow.dcd != cprev.dcd)) || ((arg & TIOCM_CTS) && (cnow.cts != cprev.cts))) { ret = 0; break; } schedule(); /* see if a signal did it */ if (signal_pending(current)) { ret = -ERESTARTSYS; break; } cprev = cnow; } __set_current_state(TASK_RUNNING); remove_wait_queue(&port->delta_msr_wait, &wait); uart_port_deref(uport); return ret; } /* * Get counter of input serial line interrupts (DCD,RI,DSR,CTS) * Return: write counters to the user passed counter struct * NB: both 1->0 and 0->1 transitions are counted except for * RI where only 0->1 is counted. */ static int uart_get_icount(struct tty_struct *tty, struct serial_icounter_struct *icount) { struct uart_state *state = tty->driver_data; struct uart_icount cnow; struct uart_port *uport; uport = uart_port_ref(state); if (!uport) return -EIO; uart_port_lock_irq(uport); memcpy(&cnow, &uport->icount, sizeof(struct uart_icount)); uart_port_unlock_irq(uport); uart_port_deref(uport); icount->cts = cnow.cts; icount->dsr = cnow.dsr; icount->rng = cnow.rng; icount->dcd = cnow.dcd; icount->rx = cnow.rx; icount->tx = cnow.tx; icount->frame = cnow.frame; icount->overrun = cnow.overrun; icount->parity = cnow.parity; icount->brk = cnow.brk; icount->buf_overrun = cnow.buf_overrun; return 0; } #define SER_RS485_LEGACY_FLAGS (SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND | \ SER_RS485_RTS_AFTER_SEND | SER_RS485_RX_DURING_TX | \ SER_RS485_TERMINATE_BUS) static int uart_check_rs485_flags(struct uart_port *port, struct serial_rs485 *rs485) { u32 flags = rs485->flags; /* Don't return -EINVAL for unsupported legacy flags */ flags &= ~SER_RS485_LEGACY_FLAGS; /* * For any bit outside of the legacy ones that is not supported by * the driver, return -EINVAL. */ if (flags & ~port->rs485_supported.flags) return -EINVAL; /* Asking for address w/o addressing mode? */ if (!(rs485->flags & SER_RS485_ADDRB) && (rs485->flags & (SER_RS485_ADDR_RECV|SER_RS485_ADDR_DEST))) return -EINVAL; /* Address given but not enabled? */ if (!(rs485->flags & SER_RS485_ADDR_RECV) && rs485->addr_recv) return -EINVAL; if (!(rs485->flags & SER_RS485_ADDR_DEST) && rs485->addr_dest) return -EINVAL; return 0; } static void uart_sanitize_serial_rs485_delays(struct uart_port *port, struct serial_rs485 *rs485) { if (!port->rs485_supported.delay_rts_before_send) { if (rs485->delay_rts_before_send) { dev_warn_ratelimited(port->dev, "%s (%d): RTS delay before sending not supported\n", port->name, port->line); } rs485->delay_rts_before_send = 0; } else if (rs485->delay_rts_before_send > RS485_MAX_RTS_DELAY) { rs485->delay_rts_before_send = RS485_MAX_RTS_DELAY; dev_warn_ratelimited(port->dev, "%s (%d): RTS delay before sending clamped to %u ms\n", port->name, port->line, rs485->delay_rts_before_send); } if (!port->rs485_supported.delay_rts_after_send) { if (rs485->delay_rts_after_send) { dev_warn_ratelimited(port->dev, "%s (%d): RTS delay after sending not supported\n", port->name, port->line); } rs485->delay_rts_after_send = 0; } else if (rs485->delay_rts_after_send > RS485_MAX_RTS_DELAY) { rs485->delay_rts_after_send = RS485_MAX_RTS_DELAY; dev_warn_ratelimited(port->dev, "%s (%d): RTS delay after sending clamped to %u ms\n", port->name, port->line, rs485->delay_rts_after_send); } } static void uart_sanitize_serial_rs485(struct uart_port *port, struct serial_rs485 *rs485) { u32 supported_flags = port->rs485_supported.flags; if (!(rs485->flags & SER_RS485_ENABLED)) { memset(rs485, 0, sizeof(*rs485)); return; } /* Clear other RS485 flags but SER_RS485_TERMINATE_BUS and return if enabling RS422 */ if (rs485->flags & SER_RS485_MODE_RS422) { rs485->flags &= (SER_RS485_ENABLED | SER_RS485_MODE_RS422 | SER_RS485_TERMINATE_BUS); return; } rs485->flags &= supported_flags; /* Pick sane settings if the user hasn't */ if (!(rs485->flags & SER_RS485_RTS_ON_SEND) == !(rs485->flags & SER_RS485_RTS_AFTER_SEND)) { if (supported_flags & SER_RS485_RTS_ON_SEND) { rs485->flags |= SER_RS485_RTS_ON_SEND; rs485->flags &= ~SER_RS485_RTS_AFTER_SEND; dev_warn_ratelimited(port->dev, "%s (%d): invalid RTS setting, using RTS_ON_SEND instead\n", port->name, port->line); } else { rs485->flags |= SER_RS485_RTS_AFTER_SEND; rs485->flags &= ~SER_RS485_RTS_ON_SEND; dev_warn_ratelimited(port->dev, "%s (%d): invalid RTS setting, using RTS_AFTER_SEND instead\n", port->name, port->line); } } uart_sanitize_serial_rs485_delays(port, rs485); /* Return clean padding area to userspace */ memset(rs485->padding0, 0, sizeof(rs485->padding0)); memset(rs485->padding1, 0, sizeof(rs485->padding1)); } static void uart_set_rs485_termination(struct uart_port *port, const struct serial_rs485 *rs485) { if (!(rs485->flags & SER_RS485_ENABLED)) return; gpiod_set_value_cansleep(port->rs485_term_gpio, !!(rs485->flags & SER_RS485_TERMINATE_BUS)); } static void uart_set_rs485_rx_during_tx(struct uart_port *port, const struct serial_rs485 *rs485) { if (!(rs485->flags & SER_RS485_ENABLED)) return; gpiod_set_value_cansleep(port->rs485_rx_during_tx_gpio, !!(rs485->flags & SER_RS485_RX_DURING_TX)); } static int uart_rs485_config(struct uart_port *port) { struct serial_rs485 *rs485 = &port->rs485; unsigned long flags; int ret; if (!(rs485->flags & SER_RS485_ENABLED)) return 0; uart_sanitize_serial_rs485(port, rs485); uart_set_rs485_termination(port, rs485); uart_set_rs485_rx_during_tx(port, rs485); uart_port_lock_irqsave(port, &flags); ret = port->rs485_config(port, NULL, rs485); uart_port_unlock_irqrestore(port, flags); if (ret) { memset(rs485, 0, sizeof(*rs485)); /* unset GPIOs */ gpiod_set_value_cansleep(port->rs485_term_gpio, 0); gpiod_set_value_cansleep(port->rs485_rx_during_tx_gpio, 0); } return ret; } static int uart_get_rs485_config(struct uart_port *port, struct serial_rs485 __user *rs485) { unsigned long flags; struct serial_rs485 aux; uart_port_lock_irqsave(port, &flags); aux = port->rs485; uart_port_unlock_irqrestore(port, flags); if (copy_to_user(rs485, &aux, sizeof(aux))) return -EFAULT; return 0; } static int uart_set_rs485_config(struct tty_struct *tty, struct uart_port *port, struct serial_rs485 __user *rs485_user) { struct serial_rs485 rs485; int ret; unsigned long flags; if (!(port->rs485_supported.flags & SER_RS485_ENABLED)) return -ENOTTY; if (copy_from_user(&rs485, rs485_user, sizeof(*rs485_user))) return -EFAULT; ret = uart_check_rs485_flags(port, &rs485); if (ret) return ret; uart_sanitize_serial_rs485(port, &rs485); uart_set_rs485_termination(port, &rs485); uart_set_rs485_rx_during_tx(port, &rs485); uart_port_lock_irqsave(port, &flags); ret = port->rs485_config(port, &tty->termios, &rs485); if (!ret) { port->rs485 = rs485; /* Reset RTS and other mctrl lines when disabling RS485 */ if (!(rs485.flags & SER_RS485_ENABLED)) port->ops->set_mctrl(port, port->mctrl); } uart_port_unlock_irqrestore(port, flags); if (ret) { /* restore old GPIO settings */ gpiod_set_value_cansleep(port->rs485_term_gpio, !!(port->rs485.flags & SER_RS485_TERMINATE_BUS)); gpiod_set_value_cansleep(port->rs485_rx_during_tx_gpio, !!(port->rs485.flags & SER_RS485_RX_DURING_TX)); return ret; } if (copy_to_user(rs485_user, &port->rs485, sizeof(port->rs485))) return -EFAULT; return 0; } static int uart_get_iso7816_config(struct uart_port *port, struct serial_iso7816 __user *iso7816) { unsigned long flags; struct serial_iso7816 aux; if (!port->iso7816_config) return -ENOTTY; uart_port_lock_irqsave(port, &flags); aux = port->iso7816; uart_port_unlock_irqrestore(port, flags); if (copy_to_user(iso7816, &aux, sizeof(aux))) return -EFAULT; return 0; } static int uart_set_iso7816_config(struct uart_port *port, struct serial_iso7816 __user *iso7816_user) { struct serial_iso7816 iso7816; int i, ret; unsigned long flags; if (!port->iso7816_config) return -ENOTTY; if (copy_from_user(&iso7816, iso7816_user, sizeof(*iso7816_user))) return -EFAULT; /* * There are 5 words reserved for future use. Check that userspace * doesn't put stuff in there to prevent breakages in the future. */ for (i = 0; i < ARRAY_SIZE(iso7816.reserved); i++) if (iso7816.reserved[i]) return -EINVAL; uart_port_lock_irqsave(port, &flags); ret = port->iso7816_config(port, &iso7816); uart_port_unlock_irqrestore(port, flags); if (ret) return ret; if (copy_to_user(iso7816_user, &port->iso7816, sizeof(port->iso7816))) return -EFAULT; return 0; } /* * Called via sys_ioctl. We can use spin_lock_irq() here. */ static int uart_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { struct uart_state *state = tty->driver_data; struct tty_port *port = &state->port; struct uart_port *uport; void __user *uarg = (void __user *)arg; int ret = -ENOIOCTLCMD; /* * These ioctls don't rely on the hardware to be present. */ switch (cmd) { case TIOCSERCONFIG: down_write(&tty->termios_rwsem); ret = uart_do_autoconfig(tty, state); up_write(&tty->termios_rwsem); break; } if (ret != -ENOIOCTLCMD) goto out; if (tty_io_error(tty)) { ret = -EIO; goto out; } /* * The following should only be used when hardware is present. */ switch (cmd) { case TIOCMIWAIT: ret = uart_wait_modem_status(state, arg); break; } if (ret != -ENOIOCTLCMD) goto out; /* rs485_config requires more locking than others */ if (cmd == TIOCSRS485) down_write(&tty->termios_rwsem); mutex_lock(&port->mutex); uport = uart_port_check(state); if (!uport || tty_io_error(tty)) { ret = -EIO; goto out_up; } /* * All these rely on hardware being present and need to be * protected against the tty being hung up. */ switch (cmd) { case TIOCSERGETLSR: /* Get line status register */ ret = uart_get_lsr_info(tty, state, uarg); break; case TIOCGRS485: ret = uart_get_rs485_config(uport, uarg); break; case TIOCSRS485: ret = uart_set_rs485_config(tty, uport, uarg); break; case TIOCSISO7816: ret = uart_set_iso7816_config(state->uart_port, uarg); break; case TIOCGISO7816: ret = uart_get_iso7816_config(state->uart_port, uarg); break; default: if (uport->ops->ioctl) ret = uport->ops->ioctl(uport, cmd, arg); break; } out_up: mutex_unlock(&port->mutex); if (cmd == TIOCSRS485) up_write(&tty->termios_rwsem); out: return ret; } static void uart_set_ldisc(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *uport; struct tty_port *port = &state->port; if (!tty_port_initialized(port)) return; mutex_lock(&state->port.mutex); uport = uart_port_check(state); if (uport && uport->ops->set_ldisc) uport->ops->set_ldisc(uport, &tty->termios); mutex_unlock(&state->port.mutex); } static void uart_set_termios(struct tty_struct *tty, const struct ktermios *old_termios) { struct uart_state *state = tty->driver_data; struct uart_port *uport; unsigned int cflag = tty->termios.c_cflag; unsigned int iflag_mask = IGNBRK|BRKINT|IGNPAR|PARMRK|INPCK; bool sw_changed = false; mutex_lock(&state->port.mutex); uport = uart_port_check(state); if (!uport) goto out; /* * Drivers doing software flow control also need to know * about changes to these input settings. */ if (uport->flags & UPF_SOFT_FLOW) { iflag_mask |= IXANY|IXON|IXOFF; sw_changed = tty->termios.c_cc[VSTART] != old_termios->c_cc[VSTART] || tty->termios.c_cc[VSTOP] != old_termios->c_cc[VSTOP]; } /* * These are the bits that are used to setup various * flags in the low level driver. We can ignore the Bfoo * bits in c_cflag; c_[io]speed will always be set * appropriately by set_termios() in tty_ioctl.c */ if ((cflag ^ old_termios->c_cflag) == 0 && tty->termios.c_ospeed == old_termios->c_ospeed && tty->termios.c_ispeed == old_termios->c_ispeed && ((tty->termios.c_iflag ^ old_termios->c_iflag) & iflag_mask) == 0 && !sw_changed) { goto out; } uart_change_line_settings(tty, state, old_termios); /* reload cflag from termios; port driver may have overridden flags */ cflag = tty->termios.c_cflag; /* Handle transition to B0 status */ if (((old_termios->c_cflag & CBAUD) != B0) && ((cflag & CBAUD) == B0)) uart_clear_mctrl(uport, TIOCM_RTS | TIOCM_DTR); /* Handle transition away from B0 status */ else if (((old_termios->c_cflag & CBAUD) == B0) && ((cflag & CBAUD) != B0)) { unsigned int mask = TIOCM_DTR; if (!(cflag & CRTSCTS) || !tty_throttled(tty)) mask |= TIOCM_RTS; uart_set_mctrl(uport, mask); } out: mutex_unlock(&state->port.mutex); } /* * Calls to uart_close() are serialised via the tty_lock in * drivers/tty/tty_io.c:tty_release() * drivers/tty/tty_io.c:do_tty_hangup() */ static void uart_close(struct tty_struct *tty, struct file *filp) { struct uart_state *state = tty->driver_data; if (!state) { struct uart_driver *drv = tty->driver->driver_state; struct tty_port *port; state = drv->state + tty->index; port = &state->port; spin_lock_irq(&port->lock); --port->count; spin_unlock_irq(&port->lock); return; } pr_debug("uart_close(%d) called\n", tty->index); tty_port_close(tty->port, tty, filp); } static void uart_tty_port_shutdown(struct tty_port *port) { struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport = uart_port_check(state); char *buf; /* * At this point, we stop accepting input. To do this, we * disable the receive line status interrupts. */ if (WARN(!uport, "detached port still initialized!\n")) return; uart_port_lock_irq(uport); uport->ops->stop_rx(uport); uart_port_unlock_irq(uport); uart_port_shutdown(port); /* * It's possible for shutdown to be called after suspend if we get * a DCD drop (hangup) at just the right time. Clear suspended bit so * we don't try to resume a port that has been shutdown. */ tty_port_set_suspended(port, false); /* * Free the transmit buffer. */ uart_port_lock_irq(uport); buf = state->xmit.buf; state->xmit.buf = NULL; uart_port_unlock_irq(uport); free_page((unsigned long)buf); uart_change_pm(state, UART_PM_STATE_OFF); } static void uart_wait_until_sent(struct tty_struct *tty, int timeout) { struct uart_state *state = tty->driver_data; struct uart_port *port; unsigned long char_time, expire, fifo_timeout; port = uart_port_ref(state); if (!port) return; if (port->type == PORT_UNKNOWN || port->fifosize == 0) { uart_port_deref(port); return; } /* * Set the check interval to be 1/5 of the estimated time to * send a single character, and make it at least 1. The check * interval should also be less than the timeout. * * Note: we have to use pretty tight timings here to satisfy * the NIST-PCTS. */ char_time = max(nsecs_to_jiffies(port->frame_time / 5), 1UL); if (timeout && timeout < char_time) char_time = timeout; if (!uart_cts_enabled(port)) { /* * If the transmitter hasn't cleared in twice the approximate * amount of time to send the entire FIFO, it probably won't * ever clear. This assumes the UART isn't doing flow * control, which is currently the case. Hence, if it ever * takes longer than FIFO timeout, this is probably due to a * UART bug of some kind. So, we clamp the timeout parameter at * 2 * FIFO timeout. */ fifo_timeout = uart_fifo_timeout(port); if (timeout == 0 || timeout > 2 * fifo_timeout) timeout = 2 * fifo_timeout; } expire = jiffies + timeout; pr_debug("uart_wait_until_sent(%d), jiffies=%lu, expire=%lu...\n", port->line, jiffies, expire); /* * Check whether the transmitter is empty every 'char_time'. * 'timeout' / 'expire' give us the maximum amount of time * we wait. */ while (!port->ops->tx_empty(port)) { msleep_interruptible(jiffies_to_msecs(char_time)); if (signal_pending(current)) break; if (timeout && time_after(jiffies, expire)) break; } uart_port_deref(port); } /* * Calls to uart_hangup() are serialised by the tty_lock in * drivers/tty/tty_io.c:do_tty_hangup() * This runs from a workqueue and can sleep for a _short_ time only. */ static void uart_hangup(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct tty_port *port = &state->port; struct uart_port *uport; unsigned long flags; pr_debug("uart_hangup(%d)\n", tty->index); mutex_lock(&port->mutex); uport = uart_port_check(state); WARN(!uport, "hangup of detached port!\n"); if (tty_port_active(port)) { uart_flush_buffer(tty); uart_shutdown(tty, state); spin_lock_irqsave(&port->lock, flags); port->count = 0; spin_unlock_irqrestore(&port->lock, flags); tty_port_set_active(port, false); tty_port_tty_set(port, NULL); if (uport && !uart_console(uport)) uart_change_pm(state, UART_PM_STATE_OFF); wake_up_interruptible(&port->open_wait); wake_up_interruptible(&port->delta_msr_wait); } mutex_unlock(&port->mutex); } /* uport == NULL if uart_port has already been removed */ static void uart_port_shutdown(struct tty_port *port) { struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport = uart_port_check(state); /* * clear delta_msr_wait queue to avoid mem leaks: we may free * the irq here so the queue might never be woken up. Note * that we won't end up waiting on delta_msr_wait again since * any outstanding file descriptors should be pointing at * hung_up_tty_fops now. */ wake_up_interruptible(&port->delta_msr_wait); if (uport) { /* Free the IRQ and disable the port. */ uport->ops->shutdown(uport); /* Ensure that the IRQ handler isn't running on another CPU. */ synchronize_irq(uport->irq); } } static bool uart_carrier_raised(struct tty_port *port) { struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport; int mctrl; uport = uart_port_ref(state); /* * Should never observe uport == NULL since checks for hangup should * abort the tty_port_block_til_ready() loop before checking for carrier * raised -- but report carrier raised if it does anyway so open will * continue and not sleep */ if (WARN_ON(!uport)) return true; uart_port_lock_irq(uport); uart_enable_ms(uport); mctrl = uport->ops->get_mctrl(uport); uart_port_unlock_irq(uport); uart_port_deref(uport); return mctrl & TIOCM_CAR; } static void uart_dtr_rts(struct tty_port *port, bool active) { struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport; uport = uart_port_ref(state); if (!uport) return; uart_port_dtr_rts(uport, active); uart_port_deref(uport); } static int uart_install(struct tty_driver *driver, struct tty_struct *tty) { struct uart_driver *drv = driver->driver_state; struct uart_state *state = drv->state + tty->index; tty->driver_data = state; return tty_standard_install(driver, tty); } /* * Calls to uart_open are serialised by the tty_lock in * drivers/tty/tty_io.c:tty_open() * Note that if this fails, then uart_close() _will_ be called. * * In time, we want to scrap the "opening nonpresent ports" * behaviour and implement an alternative way for setserial * to set base addresses/ports/types. This will allow us to * get rid of a certain amount of extra tests. */ static int uart_open(struct tty_struct *tty, struct file *filp) { struct uart_state *state = tty->driver_data; int retval; retval = tty_port_open(&state->port, tty, filp); if (retval > 0) retval = 0; return retval; } static int uart_port_activate(struct tty_port *port, struct tty_struct *tty) { struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport; int ret; uport = uart_port_check(state); if (!uport || uport->flags & UPF_DEAD) return -ENXIO; /* * Start up the serial port. */ ret = uart_startup(tty, state, false); if (ret > 0) tty_port_set_active(port, true); return ret; } static const char *uart_type(struct uart_port *port) { const char *str = NULL; if (port->ops->type) str = port->ops->type(port); if (!str) str = "unknown"; return str; } #ifdef CONFIG_PROC_FS static void uart_line_info(struct seq_file *m, struct uart_driver *drv, int i) { struct uart_state *state = drv->state + i; struct tty_port *port = &state->port; enum uart_pm_state pm_state; struct uart_port *uport; char stat_buf[32]; unsigned int status; int mmio; mutex_lock(&port->mutex); uport = uart_port_check(state); if (!uport) goto out; mmio = uport->iotype >= UPIO_MEM; seq_printf(m, "%d: uart:%s %s%08llX irq:%d", uport->line, uart_type(uport), mmio ? "mmio:0x" : "port:", mmio ? (unsigned long long)uport->mapbase : (unsigned long long)uport->iobase, uport->irq); if (uport->type == PORT_UNKNOWN) { seq_putc(m, '\n'); goto out; } if (capable(CAP_SYS_ADMIN)) { pm_state = state->pm_state; if (pm_state != UART_PM_STATE_ON) uart_change_pm(state, UART_PM_STATE_ON); uart_port_lock_irq(uport); status = uport->ops->get_mctrl(uport); uart_port_unlock_irq(uport); if (pm_state != UART_PM_STATE_ON) uart_change_pm(state, pm_state); seq_printf(m, " tx:%d rx:%d", uport->icount.tx, uport->icount.rx); if (uport->icount.frame) seq_printf(m, " fe:%d", uport->icount.frame); if (uport->icount.parity) seq_printf(m, " pe:%d", uport->icount.parity); if (uport->icount.brk) seq_printf(m, " brk:%d", uport->icount.brk); if (uport->icount.overrun) seq_printf(m, " oe:%d", uport->icount.overrun); if (uport->icount.buf_overrun) seq_printf(m, " bo:%d", uport->icount.buf_overrun); #define INFOBIT(bit, str) \ if (uport->mctrl & (bit)) \ strncat(stat_buf, (str), sizeof(stat_buf) - \ strlen(stat_buf) - 2) #define STATBIT(bit, str) \ if (status & (bit)) \ strncat(stat_buf, (str), sizeof(stat_buf) - \ strlen(stat_buf) - 2) stat_buf[0] = '\0'; stat_buf[1] = '\0'; INFOBIT(TIOCM_RTS, "|RTS"); STATBIT(TIOCM_CTS, "|CTS"); INFOBIT(TIOCM_DTR, "|DTR"); STATBIT(TIOCM_DSR, "|DSR"); STATBIT(TIOCM_CAR, "|CD"); STATBIT(TIOCM_RNG, "|RI"); if (stat_buf[0]) stat_buf[0] = ' '; seq_puts(m, stat_buf); } seq_putc(m, '\n'); #undef STATBIT #undef INFOBIT out: mutex_unlock(&port->mutex); } static int uart_proc_show(struct seq_file *m, void *v) { struct tty_driver *ttydrv = m->private; struct uart_driver *drv = ttydrv->driver_state; int i; seq_printf(m, "serinfo:1.0 driver%s%s revision:%s\n", "", "", ""); for (i = 0; i < drv->nr; i++) uart_line_info(m, drv, i); return 0; } #endif static void uart_port_spin_lock_init(struct uart_port *port) { spin_lock_init(&port->lock); lockdep_set_class(&port->lock, &port_lock_key); } #if defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL) /** * uart_console_write - write a console message to a serial port * @port: the port to write the message * @s: array of characters * @count: number of characters in string to write * @putchar: function to write character to port */ void uart_console_write(struct uart_port *port, const char *s, unsigned int count, void (*putchar)(struct uart_port *, unsigned char)) { unsigned int i; for (i = 0; i < count; i++, s++) { if (*s == '\n') putchar(port, '\r'); putchar(port, *s); } } EXPORT_SYMBOL_GPL(uart_console_write); /** * uart_get_console - get uart port for console * @ports: ports to search in * @nr: number of @ports * @co: console to search for * Returns: uart_port for the console @co * * Check whether an invalid uart number has been specified (as @co->index), and * if so, search for the first available port that does have console support. */ struct uart_port * __init uart_get_console(struct uart_port *ports, int nr, struct console *co) { int idx = co->index; if (idx < 0 || idx >= nr || (ports[idx].iobase == 0 && ports[idx].membase == NULL)) for (idx = 0; idx < nr; idx++) if (ports[idx].iobase != 0 || ports[idx].membase != NULL) break; co->index = idx; return ports + idx; } /** * uart_parse_earlycon - Parse earlycon options * @p: ptr to 2nd field (ie., just beyond '<name>,') * @iotype: ptr for decoded iotype (out) * @addr: ptr for decoded mapbase/iobase (out) * @options: ptr for <options> field; %NULL if not present (out) * * Decodes earlycon kernel command line parameters of the form: * * earlycon=<name>,io|mmio|mmio16|mmio32|mmio32be|mmio32native,<addr>,<options> * * console=<name>,io|mmio|mmio16|mmio32|mmio32be|mmio32native,<addr>,<options> * * The optional form: * * earlycon=<name>,0x<addr>,<options> * * console=<name>,0x<addr>,<options> * * is also accepted; the returned @iotype will be %UPIO_MEM. * * Returns: 0 on success or -%EINVAL on failure */ int uart_parse_earlycon(char *p, unsigned char *iotype, resource_size_t *addr, char **options) { if (strncmp(p, "mmio,", 5) == 0) { *iotype = UPIO_MEM; p += 5; } else if (strncmp(p, "mmio16,", 7) == 0) { *iotype = UPIO_MEM16; p += 7; } else if (strncmp(p, "mmio32,", 7) == 0) { *iotype = UPIO_MEM32; p += 7; } else if (strncmp(p, "mmio32be,", 9) == 0) { *iotype = UPIO_MEM32BE; p += 9; } else if (strncmp(p, "mmio32native,", 13) == 0) { *iotype = IS_ENABLED(CONFIG_CPU_BIG_ENDIAN) ? UPIO_MEM32BE : UPIO_MEM32; p += 13; } else if (strncmp(p, "io,", 3) == 0) { *iotype = UPIO_PORT; p += 3; } else if (strncmp(p, "0x", 2) == 0) { *iotype = UPIO_MEM; } else { return -EINVAL; } /* * Before you replace it with kstrtoull(), think about options separator * (',') it will not tolerate */ *addr = simple_strtoull(p, NULL, 0); p = strchr(p, ','); if (p) p++; *options = p; return 0; } EXPORT_SYMBOL_GPL(uart_parse_earlycon); /** * uart_parse_options - Parse serial port baud/parity/bits/flow control. * @options: pointer to option string * @baud: pointer to an 'int' variable for the baud rate. * @parity: pointer to an 'int' variable for the parity. * @bits: pointer to an 'int' variable for the number of data bits. * @flow: pointer to an 'int' variable for the flow control character. * * uart_parse_options() decodes a string containing the serial console * options. The format of the string is <baud><parity><bits><flow>, * eg: 115200n8r */ void uart_parse_options(const char *options, int *baud, int *parity, int *bits, int *flow) { const char *s = options; *baud = simple_strtoul(s, NULL, 10); while (*s >= '0' && *s <= '9') s++; if (*s) *parity = *s++; if (*s) *bits = *s++ - '0'; if (*s) *flow = *s; } EXPORT_SYMBOL_GPL(uart_parse_options); /** * uart_set_options - setup the serial console parameters * @port: pointer to the serial ports uart_port structure * @co: console pointer * @baud: baud rate * @parity: parity character - 'n' (none), 'o' (odd), 'e' (even) * @bits: number of data bits * @flow: flow control character - 'r' (rts) * * Locking: Caller must hold console_list_lock in order to serialize * early initialization of the serial-console lock. */ int uart_set_options(struct uart_port *port, struct console *co, int baud, int parity, int bits, int flow) { struct ktermios termios; static struct ktermios dummy; /* * Ensure that the serial-console lock is initialised early. * * Note that the console-registered check is needed because * kgdboc can call uart_set_options() for an already registered * console via tty_find_polling_driver() and uart_poll_init(). */ if (!uart_console_registered_locked(port) && !port->console_reinit) uart_port_spin_lock_init(port); memset(&termios, 0, sizeof(struct ktermios)); termios.c_cflag |= CREAD | HUPCL | CLOCAL; tty_termios_encode_baud_rate(&termios, baud, baud); if (bits == 7) termios.c_cflag |= CS7; else termios.c_cflag |= CS8; switch (parity) { case 'o': case 'O': termios.c_cflag |= PARODD; fallthrough; case 'e': case 'E': termios.c_cflag |= PARENB; break; } if (flow == 'r') termios.c_cflag |= CRTSCTS; /* * some uarts on other side don't support no flow control. * So we set * DTR in host uart to make them happy */ port->mctrl |= TIOCM_DTR; port->ops->set_termios(port, &termios, &dummy); /* * Allow the setting of the UART parameters with a NULL console * too: */ if (co) { co->cflag = termios.c_cflag; co->ispeed = termios.c_ispeed; co->ospeed = termios.c_ospeed; } return 0; } EXPORT_SYMBOL_GPL(uart_set_options); #endif /* CONFIG_SERIAL_CORE_CONSOLE */ /** * uart_change_pm - set power state of the port * * @state: port descriptor * @pm_state: new state * * Locking: port->mutex has to be held */ static void uart_change_pm(struct uart_state *state, enum uart_pm_state pm_state) { struct uart_port *port = uart_port_check(state); if (state->pm_state != pm_state) { if (port && port->ops->pm) port->ops->pm(port, pm_state, state->pm_state); state->pm_state = pm_state; } } struct uart_match { struct uart_port *port; struct uart_driver *driver; }; static int serial_match_port(struct device *dev, void *data) { struct uart_match *match = data; struct tty_driver *tty_drv = match->driver->tty_driver; dev_t devt = MKDEV(tty_drv->major, tty_drv->minor_start) + match->port->line; return dev->devt == devt; /* Actually, only one tty per port */ } int uart_suspend_port(struct uart_driver *drv, struct uart_port *uport) { struct uart_state *state = drv->state + uport->line; struct tty_port *port = &state->port; struct device *tty_dev; struct uart_match match = {uport, drv}; mutex_lock(&port->mutex); tty_dev = device_find_child(&uport->port_dev->dev, &match, serial_match_port); if (tty_dev && device_may_wakeup(tty_dev)) { enable_irq_wake(uport->irq); put_device(tty_dev); mutex_unlock(&port->mutex); return 0; } put_device(tty_dev); /* * Nothing to do if the console is not suspending * except stop_rx to prevent any asynchronous data * over RX line. However ensure that we will be * able to Re-start_rx later. */ if (!console_suspend_enabled && uart_console(uport)) { if (uport->ops->start_rx) { uart_port_lock_irq(uport); uport->ops->stop_rx(uport); uart_port_unlock_irq(uport); } goto unlock; } uport->suspended = 1; if (tty_port_initialized(port)) { const struct uart_ops *ops = uport->ops; int tries; unsigned int mctrl; tty_port_set_suspended(port, true); tty_port_set_initialized(port, false); uart_port_lock_irq(uport); ops->stop_tx(uport); if (!(uport->rs485.flags & SER_RS485_ENABLED)) ops->set_mctrl(uport, 0); /* save mctrl so it can be restored on resume */ mctrl = uport->mctrl; uport->mctrl = 0; ops->stop_rx(uport); uart_port_unlock_irq(uport); /* * Wait for the transmitter to empty. */ for (tries = 3; !ops->tx_empty(uport) && tries; tries--) msleep(10); if (!tries) dev_err(uport->dev, "%s: Unable to drain transmitter\n", uport->name); ops->shutdown(uport); uport->mctrl = mctrl; } /* * Disable the console device before suspending. */ if (uart_console(uport)) console_stop(uport->cons); uart_change_pm(state, UART_PM_STATE_OFF); unlock: mutex_unlock(&port->mutex); return 0; } EXPORT_SYMBOL(uart_suspend_port); int uart_resume_port(struct uart_driver *drv, struct uart_port *uport) { struct uart_state *state = drv->state + uport->line; struct tty_port *port = &state->port; struct device *tty_dev; struct uart_match match = {uport, drv}; struct ktermios termios; mutex_lock(&port->mutex); tty_dev = device_find_child(&uport->port_dev->dev, &match, serial_match_port); if (!uport->suspended && device_may_wakeup(tty_dev)) { if (irqd_is_wakeup_set(irq_get_irq_data((uport->irq)))) disable_irq_wake(uport->irq); put_device(tty_dev); mutex_unlock(&port->mutex); return 0; } put_device(tty_dev); uport->suspended = 0; /* * Re-enable the console device after suspending. */ if (uart_console(uport)) { /* * First try to use the console cflag setting. */ memset(&termios, 0, sizeof(struct ktermios)); termios.c_cflag = uport->cons->cflag; termios.c_ispeed = uport->cons->ispeed; termios.c_ospeed = uport->cons->ospeed; /* * If that's unset, use the tty termios setting. */ if (port->tty && termios.c_cflag == 0) termios = port->tty->termios; if (console_suspend_enabled) uart_change_pm(state, UART_PM_STATE_ON); uport->ops->set_termios(uport, &termios, NULL); if (!console_suspend_enabled && uport->ops->start_rx) { uart_port_lock_irq(uport); uport->ops->start_rx(uport); uart_port_unlock_irq(uport); } if (console_suspend_enabled) console_start(uport->cons); } if (tty_port_suspended(port)) { const struct uart_ops *ops = uport->ops; int ret; uart_change_pm(state, UART_PM_STATE_ON); uart_port_lock_irq(uport); if (!(uport->rs485.flags & SER_RS485_ENABLED)) ops->set_mctrl(uport, 0); uart_port_unlock_irq(uport); if (console_suspend_enabled || !uart_console(uport)) { /* Protected by port mutex for now */ struct tty_struct *tty = port->tty; ret = ops->startup(uport); if (ret == 0) { if (tty) uart_change_line_settings(tty, state, NULL); uart_rs485_config(uport); uart_port_lock_irq(uport); if (!(uport->rs485.flags & SER_RS485_ENABLED)) ops->set_mctrl(uport, uport->mctrl); ops->start_tx(uport); uart_port_unlock_irq(uport); tty_port_set_initialized(port, true); } else { /* * Failed to resume - maybe hardware went away? * Clear the "initialized" flag so we won't try * to call the low level drivers shutdown method. */ uart_shutdown(tty, state); } } tty_port_set_suspended(port, false); } mutex_unlock(&port->mutex); return 0; } EXPORT_SYMBOL(uart_resume_port); static inline void uart_report_port(struct uart_driver *drv, struct uart_port *port) { char address[64]; switch (port->iotype) { case UPIO_PORT: snprintf(address, sizeof(address), "I/O 0x%lx", port->iobase); break; case UPIO_HUB6: snprintf(address, sizeof(address), "I/O 0x%lx offset 0x%x", port->iobase, port->hub6); break; case UPIO_MEM: case UPIO_MEM16: case UPIO_MEM32: case UPIO_MEM32BE: case UPIO_AU: case UPIO_TSI: snprintf(address, sizeof(address), "MMIO 0x%llx", (unsigned long long)port->mapbase); break; default: strscpy(address, "*unknown*", sizeof(address)); break; } pr_info("%s%s%s at %s (irq = %d, base_baud = %d) is a %s\n", port->dev ? dev_name(port->dev) : "", port->dev ? ": " : "", port->name, address, port->irq, port->uartclk / 16, uart_type(port)); /* The magic multiplier feature is a bit obscure, so report it too. */ if (port->flags & UPF_MAGIC_MULTIPLIER) pr_info("%s%s%s extra baud rates supported: %d, %d", port->dev ? dev_name(port->dev) : "", port->dev ? ": " : "", port->name, port->uartclk / 8, port->uartclk / 4); } static void uart_configure_port(struct uart_driver *drv, struct uart_state *state, struct uart_port *port) { unsigned int flags; /* * If there isn't a port here, don't do anything further. */ if (!port->iobase && !port->mapbase && !port->membase) return; /* * Now do the auto configuration stuff. Note that config_port * is expected to claim the resources and map the port for us. */ flags = 0; if (port->flags & UPF_AUTO_IRQ) flags |= UART_CONFIG_IRQ; if (port->flags & UPF_BOOT_AUTOCONF) { if (!(port->flags & UPF_FIXED_TYPE)) { port->type = PORT_UNKNOWN; flags |= UART_CONFIG_TYPE; } port->ops->config_port(port, flags); } if (port->type != PORT_UNKNOWN) { unsigned long flags; uart_report_port(drv, port); /* Power up port for set_mctrl() */ uart_change_pm(state, UART_PM_STATE_ON); /* * Ensure that the modem control lines are de-activated. * keep the DTR setting that is set in uart_set_options() * We probably don't need a spinlock around this, but */ uart_port_lock_irqsave(port, &flags); port->mctrl &= TIOCM_DTR; if (!(port->rs485.flags & SER_RS485_ENABLED)) port->ops->set_mctrl(port, port->mctrl); uart_port_unlock_irqrestore(port, flags); uart_rs485_config(port); /* * If this driver supports console, and it hasn't been * successfully registered yet, try to re-register it. * It may be that the port was not available. */ if (port->cons && !console_is_registered(port->cons)) register_console(port->cons); /* * Power down all ports by default, except the * console if we have one. */ if (!uart_console(port)) uart_change_pm(state, UART_PM_STATE_OFF); } } #ifdef CONFIG_CONSOLE_POLL static int uart_poll_init(struct tty_driver *driver, int line, char *options) { struct uart_driver *drv = driver->driver_state; struct uart_state *state = drv->state + line; enum uart_pm_state pm_state; struct tty_port *tport; struct uart_port *port; int baud = 9600; int bits = 8; int parity = 'n'; int flow = 'n'; int ret = 0; tport = &state->port; mutex_lock(&tport->mutex); port = uart_port_check(state); if (!port || port->type == PORT_UNKNOWN || !(port->ops->poll_get_char && port->ops->poll_put_char)) { ret = -1; goto out; } pm_state = state->pm_state; uart_change_pm(state, UART_PM_STATE_ON); if (port->ops->poll_init) { /* * We don't set initialized as we only initialized the hw, * e.g. state->xmit is still uninitialized. */ if (!tty_port_initialized(tport)) ret = port->ops->poll_init(port); } if (!ret && options) { uart_parse_options(options, &baud, &parity, &bits, &flow); console_list_lock(); ret = uart_set_options(port, NULL, baud, parity, bits, flow); console_list_unlock(); } out: if (ret) uart_change_pm(state, pm_state); mutex_unlock(&tport->mutex); return ret; } static int uart_poll_get_char(struct tty_driver *driver, int line) { struct uart_driver *drv = driver->driver_state; struct uart_state *state = drv->state + line; struct uart_port *port; int ret = -1; port = uart_port_ref(state); if (port) { ret = port->ops->poll_get_char(port); uart_port_deref(port); } return ret; } static void uart_poll_put_char(struct tty_driver *driver, int line, char ch) { struct uart_driver *drv = driver->driver_state; struct uart_state *state = drv->state + line; struct uart_port *port; port = uart_port_ref(state); if (!port) return; if (ch == '\n') port->ops->poll_put_char(port, '\r'); port->ops->poll_put_char(port, ch); uart_port_deref(port); } #endif static const struct tty_operations uart_ops = { .install = uart_install, .open = uart_open, .close = uart_close, .write = uart_write, .put_char = uart_put_char, .flush_chars = uart_flush_chars, .write_room = uart_write_room, .chars_in_buffer= uart_chars_in_buffer, .flush_buffer = uart_flush_buffer, .ioctl = uart_ioctl, .throttle = uart_throttle, .unthrottle = uart_unthrottle, .send_xchar = uart_send_xchar, .set_termios = uart_set_termios, .set_ldisc = uart_set_ldisc, .stop = uart_stop, .start = uart_start, .hangup = uart_hangup, .break_ctl = uart_break_ctl, .wait_until_sent= uart_wait_until_sent, #ifdef CONFIG_PROC_FS .proc_show = uart_proc_show, #endif .tiocmget = uart_tiocmget, .tiocmset = uart_tiocmset, .set_serial = uart_set_info_user, .get_serial = uart_get_info_user, .get_icount = uart_get_icount, #ifdef CONFIG_CONSOLE_POLL .poll_init = uart_poll_init, .poll_get_char = uart_poll_get_char, .poll_put_char = uart_poll_put_char, #endif }; static const struct tty_port_operations uart_port_ops = { .carrier_raised = uart_carrier_raised, .dtr_rts = uart_dtr_rts, .activate = uart_port_activate, .shutdown = uart_tty_port_shutdown, }; /** * uart_register_driver - register a driver with the uart core layer * @drv: low level driver structure * * Register a uart driver with the core driver. We in turn register with the * tty layer, and initialise the core driver per-port state. * * We have a proc file in /proc/tty/driver which is named after the normal * driver. * * @drv->port should be %NULL, and the per-port structures should be registered * using uart_add_one_port() after this call has succeeded. * * Locking: none, Interrupts: enabled */ int uart_register_driver(struct uart_driver *drv) { struct tty_driver *normal; int i, retval = -ENOMEM; BUG_ON(drv->state); /* * Maybe we should be using a slab cache for this, especially if * we have a large number of ports to handle. */ drv->state = kcalloc(drv->nr, sizeof(struct uart_state), GFP_KERNEL); if (!drv->state) goto out; normal = tty_alloc_driver(drv->nr, TTY_DRIVER_REAL_RAW | TTY_DRIVER_DYNAMIC_DEV); if (IS_ERR(normal)) { retval = PTR_ERR(normal); goto out_kfree; } drv->tty_driver = normal; normal->driver_name = drv->driver_name; normal->name = drv->dev_name; normal->major = drv->major; normal->minor_start = drv->minor; normal->type = TTY_DRIVER_TYPE_SERIAL; normal->subtype = SERIAL_TYPE_NORMAL; normal->init_termios = tty_std_termios; normal->init_termios.c_cflag = B9600 | CS8 | CREAD | HUPCL | CLOCAL; normal->init_termios.c_ispeed = normal->init_termios.c_ospeed = 9600; normal->driver_state = drv; tty_set_operations(normal, &uart_ops); /* * Initialise the UART state(s). */ for (i = 0; i < drv->nr; i++) { struct uart_state *state = drv->state + i; struct tty_port *port = &state->port; tty_port_init(port); port->ops = &uart_port_ops; } retval = tty_register_driver(normal); if (retval >= 0) return retval; for (i = 0; i < drv->nr; i++) tty_port_destroy(&drv->state[i].port); tty_driver_kref_put(normal); out_kfree: kfree(drv->state); out: return retval; } EXPORT_SYMBOL(uart_register_driver); /** * uart_unregister_driver - remove a driver from the uart core layer * @drv: low level driver structure * * Remove all references to a driver from the core driver. The low level * driver must have removed all its ports via the uart_remove_one_port() if it * registered them with uart_add_one_port(). (I.e. @drv->port is %NULL.) * * Locking: none, Interrupts: enabled */ void uart_unregister_driver(struct uart_driver *drv) { struct tty_driver *p = drv->tty_driver; unsigned int i; tty_unregister_driver(p); tty_driver_kref_put(p); for (i = 0; i < drv->nr; i++) tty_port_destroy(&drv->state[i].port); kfree(drv->state); drv->state = NULL; drv->tty_driver = NULL; } EXPORT_SYMBOL(uart_unregister_driver); struct tty_driver *uart_console_device(struct console *co, int *index) { struct uart_driver *p = co->data; *index = co->index; return p->tty_driver; } EXPORT_SYMBOL_GPL(uart_console_device); static ssize_t uartclk_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.baud_base * 16); } static ssize_t type_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.type); } static ssize_t line_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.line); } static ssize_t port_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); unsigned long ioaddr; uart_get_info(port, &tmp); ioaddr = tmp.port; if (HIGH_BITS_OFFSET) ioaddr |= (unsigned long)tmp.port_high << HIGH_BITS_OFFSET; return sprintf(buf, "0x%lX\n", ioaddr); } static ssize_t irq_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.irq); } static ssize_t flags_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "0x%X\n", tmp.flags); } static ssize_t xmit_fifo_size_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.xmit_fifo_size); } static ssize_t close_delay_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.close_delay); } static ssize_t closing_wait_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.closing_wait); } static ssize_t custom_divisor_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.custom_divisor); } static ssize_t io_type_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.io_type); } static ssize_t iomem_base_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "0x%lX\n", (unsigned long)tmp.iomem_base); } static ssize_t iomem_reg_shift_show(struct device *dev, struct device_attribute *attr, char *buf) { struct serial_struct tmp; struct tty_port *port = dev_get_drvdata(dev); uart_get_info(port, &tmp); return sprintf(buf, "%d\n", tmp.iomem_reg_shift); } static ssize_t console_show(struct device *dev, struct device_attribute *attr, char *buf) { struct tty_port *port = dev_get_drvdata(dev); struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport; bool console = false; mutex_lock(&port->mutex); uport = uart_port_check(state); if (uport) console = uart_console_registered(uport); mutex_unlock(&port->mutex); return sprintf(buf, "%c\n", console ? 'Y' : 'N'); } static ssize_t console_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct tty_port *port = dev_get_drvdata(dev); struct uart_state *state = container_of(port, struct uart_state, port); struct uart_port *uport; bool oldconsole, newconsole; int ret; ret = kstrtobool(buf, &newconsole); if (ret) return ret; mutex_lock(&port->mutex); uport = uart_port_check(state); if (uport) { oldconsole = uart_console_registered(uport); if (oldconsole && !newconsole) { ret = unregister_console(uport->cons); } else if (!oldconsole && newconsole) { if (uart_console(uport)) { uport->console_reinit = 1; register_console(uport->cons); } else { ret = -ENOENT; } } } else { ret = -ENXIO; } mutex_unlock(&port->mutex); return ret < 0 ? ret : count; } static DEVICE_ATTR_RO(uartclk); static DEVICE_ATTR_RO(type); static DEVICE_ATTR_RO(line); static DEVICE_ATTR_RO(port); static DEVICE_ATTR_RO(irq); static DEVICE_ATTR_RO(flags); static DEVICE_ATTR_RO(xmit_fifo_size); static DEVICE_ATTR_RO(close_delay); static DEVICE_ATTR_RO(closing_wait); static DEVICE_ATTR_RO(custom_divisor); static DEVICE_ATTR_RO(io_type); static DEVICE_ATTR_RO(iomem_base); static DEVICE_ATTR_RO(iomem_reg_shift); static DEVICE_ATTR_RW(console); static struct attribute *tty_dev_attrs[] = { &dev_attr_uartclk.attr, &dev_attr_type.attr, &dev_attr_line.attr, &dev_attr_port.attr, &dev_attr_irq.attr, &dev_attr_flags.attr, &dev_attr_xmit_fifo_size.attr, &dev_attr_close_delay.attr, &dev_attr_closing_wait.attr, &dev_attr_custom_divisor.attr, &dev_attr_io_type.attr, &dev_attr_iomem_base.attr, &dev_attr_iomem_reg_shift.attr, &dev_attr_console.attr, NULL }; static const struct attribute_group tty_dev_attr_group = { .attrs = tty_dev_attrs, }; /** * serial_core_add_one_port - attach a driver-defined port structure * @drv: pointer to the uart low level driver structure for this port * @uport: uart port structure to use for this port. * * Context: task context, might sleep * * This allows the driver @drv to register its own uart_port structure with the * core driver. The main purpose is to allow the low level uart drivers to * expand uart_port, rather than having yet more levels of structures. * Caller must hold port_mutex. */ static int serial_core_add_one_port(struct uart_driver *drv, struct uart_port *uport) { struct uart_state *state; struct tty_port *port; int ret = 0; struct device *tty_dev; int num_groups; if (uport->line >= drv->nr) return -EINVAL; state = drv->state + uport->line; port = &state->port; mutex_lock(&port->mutex); if (state->uart_port) { ret = -EINVAL; goto out; } /* Link the port to the driver state table and vice versa */ atomic_set(&state->refcount, 1); init_waitqueue_head(&state->remove_wait); state->uart_port = uport; uport->state = state; state->pm_state = UART_PM_STATE_UNDEFINED; uport->cons = drv->cons; uport->minor = drv->tty_driver->minor_start + uport->line; uport->name = kasprintf(GFP_KERNEL, "%s%d", drv->dev_name, drv->tty_driver->name_base + uport->line); if (!uport->name) { ret = -ENOMEM; goto out; } /* * If this port is in use as a console then the spinlock is already * initialised. */ if (!uart_console_registered(uport)) uart_port_spin_lock_init(uport); if (uport->cons && uport->dev) of_console_check(uport->dev->of_node, uport->cons->name, uport->line); tty_port_link_device(port, drv->tty_driver, uport->line); uart_configure_port(drv, state, uport); port->console = uart_console(uport); num_groups = 2; if (uport->attr_group) num_groups++; uport->tty_groups = kcalloc(num_groups, sizeof(*uport->tty_groups), GFP_KERNEL); if (!uport->tty_groups) { ret = -ENOMEM; goto out; } uport->tty_groups[0] = &tty_dev_attr_group; if (uport->attr_group) uport->tty_groups[1] = uport->attr_group; /* * Register the port whether it's detected or not. This allows * setserial to be used to alter this port's parameters. */ tty_dev = tty_port_register_device_attr_serdev(port, drv->tty_driver, uport->line, uport->dev, &uport->port_dev->dev, port, uport->tty_groups); if (!IS_ERR(tty_dev)) { device_set_wakeup_capable(tty_dev, 1); } else { dev_err(uport->dev, "Cannot register tty device on line %d\n", uport->line); } out: mutex_unlock(&port->mutex); return ret; } /** * serial_core_remove_one_port - detach a driver defined port structure * @drv: pointer to the uart low level driver structure for this port * @uport: uart port structure for this port * * Context: task context, might sleep * * This unhooks (and hangs up) the specified port structure from the core * driver. No further calls will be made to the low-level code for this port. * Caller must hold port_mutex. */ static void serial_core_remove_one_port(struct uart_driver *drv, struct uart_port *uport) { struct uart_state *state = drv->state + uport->line; struct tty_port *port = &state->port; struct uart_port *uart_port; struct tty_struct *tty; mutex_lock(&port->mutex); uart_port = uart_port_check(state); if (uart_port != uport) dev_alert(uport->dev, "Removing wrong port: %p != %p\n", uart_port, uport); if (!uart_port) { mutex_unlock(&port->mutex); return; } mutex_unlock(&port->mutex); /* * Remove the devices from the tty layer */ tty_port_unregister_device(port, drv->tty_driver, uport->line); tty = tty_port_tty_get(port); if (tty) { tty_vhangup(port->tty); tty_kref_put(tty); } /* * If the port is used as a console, unregister it */ if (uart_console(uport)) unregister_console(uport->cons); /* * Free the port IO and memory resources, if any. */ if (uport->type != PORT_UNKNOWN && uport->ops->release_port) uport->ops->release_port(uport); kfree(uport->tty_groups); kfree(uport->name); /* * Indicate that there isn't a port here anymore. */ uport->type = PORT_UNKNOWN; uport->port_dev = NULL; mutex_lock(&port->mutex); WARN_ON(atomic_dec_return(&state->refcount) < 0); wait_event(state->remove_wait, !atomic_read(&state->refcount)); state->uart_port = NULL; mutex_unlock(&port->mutex); } /** * uart_match_port - are the two ports equivalent? * @port1: first port * @port2: second port * * This utility function can be used to determine whether two uart_port * structures describe the same port. */ bool uart_match_port(const struct uart_port *port1, const struct uart_port *port2) { if (port1->iotype != port2->iotype) return false; switch (port1->iotype) { case UPIO_PORT: return port1->iobase == port2->iobase; case UPIO_HUB6: return port1->iobase == port2->iobase && port1->hub6 == port2->hub6; case UPIO_MEM: case UPIO_MEM16: case UPIO_MEM32: case UPIO_MEM32BE: case UPIO_AU: case UPIO_TSI: return port1->mapbase == port2->mapbase; } return false; } EXPORT_SYMBOL(uart_match_port); static struct serial_ctrl_device * serial_core_get_ctrl_dev(struct serial_port_device *port_dev) { struct device *dev = &port_dev->dev; return to_serial_base_ctrl_device(dev->parent); } /* * Find a registered serial core controller device if one exists. Returns * the first device matching the ctrl_id. Caller must hold port_mutex. */ static struct serial_ctrl_device *serial_core_ctrl_find(struct uart_driver *drv, struct device *phys_dev, int ctrl_id) { struct uart_state *state; int i; lockdep_assert_held(&port_mutex); for (i = 0; i < drv->nr; i++) { state = drv->state + i; if (!state->uart_port || !state->uart_port->port_dev) continue; if (state->uart_port->dev == phys_dev && state->uart_port->ctrl_id == ctrl_id) return serial_core_get_ctrl_dev(state->uart_port->port_dev); } return NULL; } static struct serial_ctrl_device *serial_core_ctrl_device_add(struct uart_port *port) { return serial_base_ctrl_add(port, port->dev); } static int serial_core_port_device_add(struct serial_ctrl_device *ctrl_dev, struct uart_port *port) { struct serial_port_device *port_dev; port_dev = serial_base_port_add(port, ctrl_dev); if (IS_ERR(port_dev)) return PTR_ERR(port_dev); port->port_dev = port_dev; return 0; } /* * Initialize a serial core port device, and a controller device if needed. */ int serial_core_register_port(struct uart_driver *drv, struct uart_port *port) { struct serial_ctrl_device *ctrl_dev, *new_ctrl_dev = NULL; int ret; mutex_lock(&port_mutex); /* * Prevent serial_port_runtime_resume() from trying to use the port * until serial_core_add_one_port() has completed */ port->flags |= UPF_DEAD; /* Inititalize a serial core controller device if needed */ ctrl_dev = serial_core_ctrl_find(drv, port->dev, port->ctrl_id); if (!ctrl_dev) { new_ctrl_dev = serial_core_ctrl_device_add(port); if (IS_ERR(new_ctrl_dev)) { ret = PTR_ERR(new_ctrl_dev); goto err_unlock; } ctrl_dev = new_ctrl_dev; } /* * Initialize a serial core port device. Tag the port dead to prevent * serial_port_runtime_resume() trying to do anything until port has * been registered. It gets cleared by serial_core_add_one_port(). */ ret = serial_core_port_device_add(ctrl_dev, port); if (ret) goto err_unregister_ctrl_dev; ret = serial_core_add_one_port(drv, port); if (ret) goto err_unregister_port_dev; port->flags &= ~UPF_DEAD; mutex_unlock(&port_mutex); return 0; err_unregister_port_dev: serial_base_port_device_remove(port->port_dev); err_unregister_ctrl_dev: serial_base_ctrl_device_remove(new_ctrl_dev); err_unlock: mutex_unlock(&port_mutex); return ret; } /* * Removes a serial core port device, and the related serial core controller * device if the last instance. */ void serial_core_unregister_port(struct uart_driver *drv, struct uart_port *port) { struct device *phys_dev = port->dev; struct serial_port_device *port_dev = port->port_dev; struct serial_ctrl_device *ctrl_dev = serial_core_get_ctrl_dev(port_dev); int ctrl_id = port->ctrl_id; mutex_lock(&port_mutex); port->flags |= UPF_DEAD; serial_core_remove_one_port(drv, port); /* Note that struct uart_port *port is no longer valid at this point */ serial_base_port_device_remove(port_dev); /* Drop the serial core controller device if no ports are using it */ if (!serial_core_ctrl_find(drv, phys_dev, ctrl_id)) serial_base_ctrl_device_remove(ctrl_dev); mutex_unlock(&port_mutex); } /** * uart_handle_dcd_change - handle a change of carrier detect state * @uport: uart_port structure for the open port * @active: new carrier detect status * * Caller must hold uport->lock. */ void uart_handle_dcd_change(struct uart_port *uport, bool active) { struct tty_port *port = &uport->state->port; struct tty_struct *tty = port->tty; struct tty_ldisc *ld; lockdep_assert_held_once(&uport->lock); if (tty) { ld = tty_ldisc_ref(tty); if (ld) { if (ld->ops->dcd_change) ld->ops->dcd_change(tty, active); tty_ldisc_deref(ld); } } uport->icount.dcd++; if (uart_dcd_enabled(uport)) { if (active) wake_up_interruptible(&port->open_wait); else if (tty) tty_hangup(tty); } } EXPORT_SYMBOL_GPL(uart_handle_dcd_change); /** * uart_handle_cts_change - handle a change of clear-to-send state * @uport: uart_port structure for the open port * @active: new clear-to-send status * * Caller must hold uport->lock. */ void uart_handle_cts_change(struct uart_port *uport, bool active) { lockdep_assert_held_once(&uport->lock); uport->icount.cts++; if (uart_softcts_mode(uport)) { if (uport->hw_stopped) { if (active) { uport->hw_stopped = false; uport->ops->start_tx(uport); uart_write_wakeup(uport); } } else { if (!active) { uport->hw_stopped = true; uport->ops->stop_tx(uport); } } } } EXPORT_SYMBOL_GPL(uart_handle_cts_change); /** * uart_insert_char - push a char to the uart layer * * User is responsible to call tty_flip_buffer_push when they are done with * insertion. * * @port: corresponding port * @status: state of the serial port RX buffer (LSR for 8250) * @overrun: mask of overrun bits in @status * @ch: character to push * @flag: flag for the character (see TTY_NORMAL and friends) */ void uart_insert_char(struct uart_port *port, unsigned int status, unsigned int overrun, u8 ch, u8 flag) { struct tty_port *tport = &port->state->port; if ((status & port->ignore_status_mask & ~overrun) == 0) if (tty_insert_flip_char(tport, ch, flag) == 0) ++port->icount.buf_overrun; /* * Overrun is special. Since it's reported immediately, * it doesn't affect the current character. */ if (status & ~port->ignore_status_mask & overrun) if (tty_insert_flip_char(tport, 0, TTY_OVERRUN) == 0) ++port->icount.buf_overrun; } EXPORT_SYMBOL_GPL(uart_insert_char); #ifdef CONFIG_MAGIC_SYSRQ_SERIAL static const u8 sysrq_toggle_seq[] = CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE; static void uart_sysrq_on(struct work_struct *w) { int sysrq_toggle_seq_len = strlen(sysrq_toggle_seq); sysrq_toggle_support(1); pr_info("SysRq is enabled by magic sequence '%*pE' on serial\n", sysrq_toggle_seq_len, sysrq_toggle_seq); } static DECLARE_WORK(sysrq_enable_work, uart_sysrq_on); /** * uart_try_toggle_sysrq - Enables SysRq from serial line * @port: uart_port structure where char(s) after BREAK met * @ch: new character in the sequence after received BREAK * * Enables magic SysRq when the required sequence is met on port * (see CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE). * * Returns: %false if @ch is out of enabling sequence and should be * handled some other way, %true if @ch was consumed. */ bool uart_try_toggle_sysrq(struct uart_port *port, u8 ch) { int sysrq_toggle_seq_len = strlen(sysrq_toggle_seq); if (!sysrq_toggle_seq_len) return false; BUILD_BUG_ON(ARRAY_SIZE(sysrq_toggle_seq) >= U8_MAX); if (sysrq_toggle_seq[port->sysrq_seq] != ch) { port->sysrq_seq = 0; return false; } if (++port->sysrq_seq < sysrq_toggle_seq_len) { port->sysrq = jiffies + SYSRQ_TIMEOUT; return true; } schedule_work(&sysrq_enable_work); port->sysrq = 0; return true; } EXPORT_SYMBOL_GPL(uart_try_toggle_sysrq); #endif /** * uart_get_rs485_mode() - retrieve rs485 properties for given uart * @port: uart device's target port * * This function implements the device tree binding described in * Documentation/devicetree/bindings/serial/rs485.txt. */ int uart_get_rs485_mode(struct uart_port *port) { struct serial_rs485 *rs485conf = &port->rs485; struct device *dev = port->dev; enum gpiod_flags dflags; struct gpio_desc *desc; u32 rs485_delay[2]; int ret; if (!(port->rs485_supported.flags & SER_RS485_ENABLED)) return 0; ret = device_property_read_u32_array(dev, "rs485-rts-delay", rs485_delay, 2); if (!ret) { rs485conf->delay_rts_before_send = rs485_delay[0]; rs485conf->delay_rts_after_send = rs485_delay[1]; } else { rs485conf->delay_rts_before_send = 0; rs485conf->delay_rts_after_send = 0; } uart_sanitize_serial_rs485_delays(port, rs485conf); /* * Clear full-duplex and enabled flags, set RTS polarity to active high * to get to a defined state with the following properties: */ rs485conf->flags &= ~(SER_RS485_RX_DURING_TX | SER_RS485_ENABLED | SER_RS485_TERMINATE_BUS | SER_RS485_RTS_AFTER_SEND); rs485conf->flags |= SER_RS485_RTS_ON_SEND; if (device_property_read_bool(dev, "rs485-rx-during-tx")) rs485conf->flags |= SER_RS485_RX_DURING_TX; if (device_property_read_bool(dev, "linux,rs485-enabled-at-boot-time")) rs485conf->flags |= SER_RS485_ENABLED; if (device_property_read_bool(dev, "rs485-rts-active-low")) { rs485conf->flags &= ~SER_RS485_RTS_ON_SEND; rs485conf->flags |= SER_RS485_RTS_AFTER_SEND; } /* * Disabling termination by default is the safe choice: Else if many * bus participants enable it, no communication is possible at all. * Works fine for short cables and users may enable for longer cables. */ desc = devm_gpiod_get_optional(dev, "rs485-term", GPIOD_OUT_LOW); if (IS_ERR(desc)) return dev_err_probe(dev, PTR_ERR(desc), "Cannot get rs485-term-gpios\n"); port->rs485_term_gpio = desc; if (port->rs485_term_gpio) port->rs485_supported.flags |= SER_RS485_TERMINATE_BUS; dflags = (rs485conf->flags & SER_RS485_RX_DURING_TX) ? GPIOD_OUT_HIGH : GPIOD_OUT_LOW; desc = devm_gpiod_get_optional(dev, "rs485-rx-during-tx", dflags); if (IS_ERR(desc)) return dev_err_probe(dev, PTR_ERR(desc), "Cannot get rs485-rx-during-tx-gpios\n"); port->rs485_rx_during_tx_gpio = desc; if (port->rs485_rx_during_tx_gpio) port->rs485_supported.flags |= SER_RS485_RX_DURING_TX; return 0; } EXPORT_SYMBOL_GPL(uart_get_rs485_mode); /* Compile-time assertions for serial_rs485 layout */ static_assert(offsetof(struct serial_rs485, padding) == (offsetof(struct serial_rs485, delay_rts_after_send) + sizeof(__u32))); static_assert(offsetof(struct serial_rs485, padding1) == offsetof(struct serial_rs485, padding[1])); static_assert((offsetof(struct serial_rs485, padding[4]) + sizeof(__u32)) == sizeof(struct serial_rs485)); MODULE_DESCRIPTION("Serial driver core"); MODULE_LICENSE("GPL"); |
6 1 1 4 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 | // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (c) 2006 Patrick McHardy <kaber@trash.net> */ #include <linux/module.h> #include <linux/init.h> #include <linux/skbuff.h> #include <linux/netfilter/x_tables.h> #include <linux/netfilter/xt_NFLOG.h> #include <net/netfilter/nf_log.h> MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>"); MODULE_DESCRIPTION("Xtables: packet logging to netlink using NFLOG"); MODULE_LICENSE("GPL"); MODULE_ALIAS("ipt_NFLOG"); MODULE_ALIAS("ip6t_NFLOG"); static unsigned int nflog_tg(struct sk_buff *skb, const struct xt_action_param *par) { const struct xt_nflog_info *info = par->targinfo; struct net *net = xt_net(par); struct nf_loginfo li; li.type = NF_LOG_TYPE_ULOG; li.u.ulog.copy_len = info->len; li.u.ulog.group = info->group; li.u.ulog.qthreshold = info->threshold; li.u.ulog.flags = 0; if (info->flags & XT_NFLOG_F_COPY_LEN) li.u.ulog.flags |= NF_LOG_F_COPY_LEN; nf_log_packet(net, xt_family(par), xt_hooknum(par), skb, xt_in(par), xt_out(par), &li, "%s", info->prefix); return XT_CONTINUE; } static int nflog_tg_check(const struct xt_tgchk_param *par) { const struct xt_nflog_info *info = par->targinfo; int ret; if (info->flags & ~XT_NFLOG_MASK) return -EINVAL; if (info->prefix[sizeof(info->prefix) - 1] != '\0') return -EINVAL; ret = nf_logger_find_get(par->family, NF_LOG_TYPE_ULOG); if (ret != 0 && !par->nft_compat) { request_module("%s", "nfnetlink_log"); ret = nf_logger_find_get(par->family, NF_LOG_TYPE_ULOG); } return ret; } static void nflog_tg_destroy(const struct xt_tgdtor_param *par) { nf_logger_put(par->family, NF_LOG_TYPE_ULOG); } static struct xt_target nflog_tg_reg __read_mostly = { .name = "NFLOG", .revision = 0, .family = NFPROTO_UNSPEC, .checkentry = nflog_tg_check, .destroy = nflog_tg_destroy, .target = nflog_tg, .targetsize = sizeof(struct xt_nflog_info), .me = THIS_MODULE, }; static int __init nflog_tg_init(void) { return xt_register_target(&nflog_tg_reg); } static void __exit nflog_tg_exit(void) { xt_unregister_target(&nflog_tg_reg); } module_init(nflog_tg_init); module_exit(nflog_tg_exit); MODULE_SOFTDEP("pre: nfnetlink_log"); |
1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 5106 5107 5108 5109 5110 5111 5112 5113 5114 5115 5116 5117 5118 5119 5120 5121 5122 5123 5124 5125 5126 5127 5128 5129 5130 5131 5132 5133 5134 5135 5136 5137 5138 5139 5140 5141 5142 5143 5144 5145 5146 5147 5148 5149 5150 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5250 5251 5252 5253 5254 5255 5256 5257 5258 5259 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5303 5304 5305 5306 5307 5308 5309 5310 5311 5312 5313 5314 5315 5316 5317 5318 5319 5320 5321 5322 5323 5324 5325 5326 5327 5328 5329 5330 5331 5332 5333 5334 5335 5336 5337 5338 5339 5340 5341 5342 5343 5344 5345 5346 5347 5348 5349 5350 5351 5352 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378 5379 5380 5381 5382 5383 5384 5385 5386 5387 5388 5389 5390 5391 5392 5393 5394 5395 5396 5397 5398 5399 5400 5401 5402 5403 5404 5405 5406 5407 5408 5409 5410 5411 5412 5413 5414 5415 5416 5417 5418 5419 5420 5421 5422 5423 5424 5425 5426 5427 5428 5429 5430 5431 5432 5433 5434 5435 5436 5437 5438 5439 5440 5441 5442 5443 5444 5445 5446 5447 5448 5449 5450 5451 5452 5453 5454 5455 5456 5457 5458 5459 5460 5461 5462 5463 5464 5465 5466 5467 5468 5469 5470 5471 5472 5473 5474 5475 5476 5477 5478 5479 5480 5481 5482 5483 5484 5485 5486 5487 5488 5489 5490 5491 5492 5493 5494 5495 5496 5497 5498 5499 5500 5501 5502 5503 5504 5505 5506 5507 5508 5509 5510 5511 5512 5513 5514 5515 5516 5517 5518 5519 5520 5521 5522 5523 5524 5525 5526 5527 5528 5529 5530 5531 5532 5533 5534 5535 5536 5537 5538 5539 5540 5541 5542 5543 5544 5545 5546 5547 5548 5549 5550 5551 5552 5553 5554 5555 5556 5557 5558 5559 5560 5561 5562 5563 5564 5565 5566 5567 5568 5569 5570 5571 5572 5573 5574 5575 5576 5577 5578 5579 5580 5581 5582 5583 5584 5585 5586 5587 5588 5589 5590 5591 5592 5593 5594 5595 5596 5597 5598 5599 5600 5601 5602 5603 5604 5605 5606 5607 5608 5609 5610 5611 5612 5613 5614 5615 5616 5617 5618 5619 5620 5621 5622 5623 5624 5625 5626 5627 5628 5629 5630 5631 5632 5633 5634 5635 5636 5637 5638 5639 5640 5641 5642 5643 5644 5645 5646 5647 5648 5649 5650 5651 5652 5653 5654 5655 5656 5657 5658 5659 5660 5661 5662 5663 5664 5665 5666 5667 5668 5669 5670 5671 5672 5673 5674 5675 5676 5677 5678 5679 5680 5681 5682 5683 5684 5685 5686 5687 5688 5689 5690 5691 5692 5693 5694 5695 5696 5697 5698 5699 5700 5701 5702 5703 5704 5705 5706 5707 5708 5709 5710 5711 5712 5713 5714 5715 5716 5717 5718 5719 5720 5721 5722 5723 5724 5725 5726 5727 5728 5729 5730 5731 5732 5733 5734 5735 5736 5737 5738 5739 5740 5741 5742 5743 5744 5745 5746 5747 5748 5749 5750 5751 5752 5753 5754 5755 5756 5757 5758 5759 5760 5761 5762 5763 5764 5765 5766 5767 5768 5769 5770 5771 5772 5773 5774 5775 5776 5777 5778 5779 5780 5781 5782 5783 5784 5785 5786 5787 5788 5789 5790 5791 5792 5793 5794 5795 5796 5797 5798 5799 5800 5801 5802 5803 5804 5805 5806 5807 5808 5809 5810 5811 5812 5813 5814 5815 5816 5817 5818 5819 5820 5821 5822 5823 5824 5825 5826 5827 5828 5829 5830 5831 5832 5833 5834 5835 5836 5837 5838 5839 5840 5841 5842 5843 5844 5845 5846 5847 5848 5849 5850 5851 5852 5853 5854 5855 5856 5857 5858 5859 5860 5861 5862 5863 5864 5865 5866 5867 5868 5869 5870 5871 5872 5873 5874 5875 5876 5877 5878 5879 5880 5881 5882 5883 5884 5885 5886 5887 5888 5889 5890 5891 5892 5893 5894 5895 5896 5897 5898 5899 5900 5901 5902 5903 5904 5905 5906 5907 5908 5909 5910 5911 5912 5913 5914 5915 5916 5917 5918 5919 5920 5921 5922 5923 5924 5925 5926 5927 5928 5929 5930 5931 5932 5933 5934 5935 5936 5937 5938 5939 5940 5941 5942 5943 5944 5945 5946 5947 5948 5949 5950 5951 5952 5953 5954 5955 5956 5957 5958 5959 5960 5961 5962 5963 5964 5965 5966 5967 5968 5969 5970 5971 5972 5973 5974 5975 5976 5977 5978 5979 5980 5981 5982 5983 5984 5985 5986 5987 5988 5989 5990 5991 5992 5993 5994 5995 5996 5997 5998 5999 6000 6001 6002 6003 6004 6005 6006 6007 6008 6009 6010 6011 6012 6013 6014 6015 6016 6017 6018 6019 6020 6021 6022 6023 6024 6025 6026 6027 6028 6029 6030 6031 6032 6033 6034 6035 6036 6037 6038 6039 6040 6041 6042 6043 6044 6045 6046 6047 6048 6049 6050 6051 6052 6053 6054 6055 6056 6057 6058 6059 6060 6061 6062 6063 6064 6065 6066 6067 6068 6069 6070 6071 6072 6073 6074 6075 6076 6077 6078 6079 6080 6081 6082 6083 6084 6085 6086 6087 6088 6089 6090 6091 6092 6093 6094 6095 6096 6097 6098 6099 6100 6101 6102 6103 6104 6105 6106 6107 6108 6109 6110 6111 6112 6113 6114 6115 6116 6117 6118 6119 6120 6121 6122 6123 6124 6125 6126 6127 6128 6129 6130 6131 6132 6133 6134 6135 6136 6137 6138 6139 6140 6141 6142 6143 6144 6145 6146 6147 6148 6149 6150 6151 6152 6153 6154 6155 6156 6157 6158 6159 6160 6161 6162 6163 6164 6165 6166 6167 6168 6169 6170 6171 6172 6173 6174 6175 6176 6177 6178 6179 6180 6181 6182 6183 6184 6185 6186 6187 6188 6189 6190 6191 6192 6193 6194 6195 6196 6197 6198 6199 6200 6201 6202 6203 6204 6205 6206 6207 6208 6209 6210 6211 6212 6213 6214 6215 6216 6217 6218 6219 6220 6221 6222 6223 6224 6225 6226 6227 6228 6229 6230 6231 6232 6233 6234 6235 6236 6237 6238 6239 6240 6241 6242 6243 6244 6245 6246 6247 6248 6249 6250 6251 6252 6253 6254 6255 6256 6257 6258 6259 6260 6261 6262 6263 6264 6265 6266 6267 6268 6269 6270 6271 6272 6273 6274 6275 6276 6277 6278 6279 6280 6281 6282 6283 6284 6285 6286 6287 6288 6289 6290 6291 6292 6293 6294 6295 6296 6297 6298 6299 6300 6301 6302 6303 6304 6305 6306 6307 6308 6309 6310 6311 6312 6313 6314 6315 6316 6317 6318 6319 6320 6321 6322 6323 6324 6325 6326 6327 6328 6329 6330 6331 6332 6333 6334 6335 6336 6337 6338 6339 6340 6341 6342 6343 6344 6345 6346 6347 6348 6349 6350 6351 6352 6353 6354 6355 6356 6357 6358 6359 6360 6361 6362 6363 | // SPDX-License-Identifier: GPL-2.0-or-later // // core.c -- Voltage/Current Regulator framework. // // Copyright 2007, 2008 Wolfson Microelectronics PLC. // Copyright 2008 SlimLogic Ltd. // // Author: Liam Girdwood <lrg@slimlogic.co.uk> #include <linux/kernel.h> #include <linux/init.h> #include <linux/debugfs.h> #include <linux/device.h> #include <linux/slab.h> #include <linux/async.h> #include <linux/err.h> #include <linux/mutex.h> #include <linux/suspend.h> #include <linux/delay.h> #include <linux/gpio/consumer.h> #include <linux/of.h> #include <linux/reboot.h> #include <linux/regmap.h> #include <linux/regulator/of_regulator.h> #include <linux/regulator/consumer.h> #include <linux/regulator/coupler.h> #include <linux/regulator/driver.h> #include <linux/regulator/machine.h> #include <linux/module.h> #define CREATE_TRACE_POINTS #include <trace/events/regulator.h> #include "dummy.h" #include "internal.h" #include "regnl.h" static DEFINE_WW_CLASS(regulator_ww_class); static DEFINE_MUTEX(regulator_nesting_mutex); static DEFINE_MUTEX(regulator_list_mutex); static LIST_HEAD(regulator_map_list); static LIST_HEAD(regulator_ena_gpio_list); static LIST_HEAD(regulator_supply_alias_list); static LIST_HEAD(regulator_coupler_list); static bool has_full_constraints; static struct dentry *debugfs_root; /* * struct regulator_map * * Used to provide symbolic supply names to devices. */ struct regulator_map { struct list_head list; const char *dev_name; /* The dev_name() for the consumer */ const char *supply; struct regulator_dev *regulator; }; /* * struct regulator_enable_gpio * * Management for shared enable GPIO pin */ struct regulator_enable_gpio { struct list_head list; struct gpio_desc *gpiod; u32 enable_count; /* a number of enabled shared GPIO */ u32 request_count; /* a number of requested shared GPIO */ }; /* * struct regulator_supply_alias * * Used to map lookups for a supply onto an alternative device. */ struct regulator_supply_alias { struct list_head list; struct device *src_dev; const char *src_supply; struct device *alias_dev; const char *alias_supply; }; static int _regulator_is_enabled(struct regulator_dev *rdev); static int _regulator_disable(struct regulator *regulator); static int _regulator_get_error_flags(struct regulator_dev *rdev, unsigned int *flags); static int _regulator_get_current_limit(struct regulator_dev *rdev); static unsigned int _regulator_get_mode(struct regulator_dev *rdev); static int _notifier_call_chain(struct regulator_dev *rdev, unsigned long event, void *data); static int _regulator_do_set_voltage(struct regulator_dev *rdev, int min_uV, int max_uV); static int regulator_balance_voltage(struct regulator_dev *rdev, suspend_state_t state); static struct regulator *create_regulator(struct regulator_dev *rdev, struct device *dev, const char *supply_name); static void destroy_regulator(struct regulator *regulator); static void _regulator_put(struct regulator *regulator); const char *rdev_get_name(struct regulator_dev *rdev) { if (rdev->constraints && rdev->constraints->name) return rdev->constraints->name; else if (rdev->desc->name) return rdev->desc->name; else return ""; } EXPORT_SYMBOL_GPL(rdev_get_name); static bool have_full_constraints(void) { return has_full_constraints || of_have_populated_dt(); } static bool regulator_ops_is_valid(struct regulator_dev *rdev, int ops) { if (!rdev->constraints) { rdev_err(rdev, "no constraints\n"); return false; } if (rdev->constraints->valid_ops_mask & ops) return true; return false; } /** * regulator_lock_nested - lock a single regulator * @rdev: regulator source * @ww_ctx: w/w mutex acquire context * * This function can be called many times by one task on * a single regulator and its mutex will be locked only * once. If a task, which is calling this function is other * than the one, which initially locked the mutex, it will * wait on mutex. */ static inline int regulator_lock_nested(struct regulator_dev *rdev, struct ww_acquire_ctx *ww_ctx) { bool lock = false; int ret = 0; mutex_lock(®ulator_nesting_mutex); if (!ww_mutex_trylock(&rdev->mutex, ww_ctx)) { if (rdev->mutex_owner == current) rdev->ref_cnt++; else lock = true; if (lock) { mutex_unlock(®ulator_nesting_mutex); ret = ww_mutex_lock(&rdev->mutex, ww_ctx); mutex_lock(®ulator_nesting_mutex); } } else { lock = true; } if (lock && ret != -EDEADLK) { rdev->ref_cnt++; rdev->mutex_owner = current; } mutex_unlock(®ulator_nesting_mutex); return ret; } /** * regulator_lock - lock a single regulator * @rdev: regulator source * * This function can be called many times by one task on * a single regulator and its mutex will be locked only * once. If a task, which is calling this function is other * than the one, which initially locked the mutex, it will * wait on mutex. */ static void regulator_lock(struct regulator_dev *rdev) { regulator_lock_nested(rdev, NULL); } /** * regulator_unlock - unlock a single regulator * @rdev: regulator_source * * This function unlocks the mutex when the * reference counter reaches 0. */ static void regulator_unlock(struct regulator_dev *rdev) { mutex_lock(®ulator_nesting_mutex); if (--rdev->ref_cnt == 0) { rdev->mutex_owner = NULL; ww_mutex_unlock(&rdev->mutex); } WARN_ON_ONCE(rdev->ref_cnt < 0); mutex_unlock(®ulator_nesting_mutex); } /** * regulator_lock_two - lock two regulators * @rdev1: first regulator * @rdev2: second regulator * @ww_ctx: w/w mutex acquire context * * Locks both rdevs using the regulator_ww_class. */ static void regulator_lock_two(struct regulator_dev *rdev1, struct regulator_dev *rdev2, struct ww_acquire_ctx *ww_ctx) { struct regulator_dev *held, *contended; int ret; ww_acquire_init(ww_ctx, ®ulator_ww_class); /* Try to just grab both of them */ ret = regulator_lock_nested(rdev1, ww_ctx); WARN_ON(ret); ret = regulator_lock_nested(rdev2, ww_ctx); if (ret != -EDEADLOCK) { WARN_ON(ret); goto exit; } held = rdev1; contended = rdev2; while (true) { regulator_unlock(held); ww_mutex_lock_slow(&contended->mutex, ww_ctx); contended->ref_cnt++; contended->mutex_owner = current; swap(held, contended); ret = regulator_lock_nested(contended, ww_ctx); if (ret != -EDEADLOCK) { WARN_ON(ret); break; } } exit: ww_acquire_done(ww_ctx); } /** * regulator_unlock_two - unlock two regulators * @rdev1: first regulator * @rdev2: second regulator * @ww_ctx: w/w mutex acquire context * * The inverse of regulator_lock_two(). */ static void regulator_unlock_two(struct regulator_dev *rdev1, struct regulator_dev *rdev2, struct ww_acquire_ctx *ww_ctx) { regulator_unlock(rdev2); regulator_unlock(rdev1); ww_acquire_fini(ww_ctx); } static bool regulator_supply_is_couple(struct regulator_dev *rdev) { struct regulator_dev *c_rdev; int i; for (i = 1; i < rdev->coupling_desc.n_coupled; i++) { c_rdev = rdev->coupling_desc.coupled_rdevs[i]; if (rdev->supply->rdev == c_rdev) return true; } return false; } static void regulator_unlock_recursive(struct regulator_dev *rdev, unsigned int n_coupled) { struct regulator_dev *c_rdev, *supply_rdev; int i, supply_n_coupled; for (i = n_coupled; i > 0; i--) { c_rdev = rdev->coupling_desc.coupled_rdevs[i - 1]; if (!c_rdev) continue; if (c_rdev->supply && !regulator_supply_is_couple(c_rdev)) { supply_rdev = c_rdev->supply->rdev; supply_n_coupled = supply_rdev->coupling_desc.n_coupled; regulator_unlock_recursive(supply_rdev, supply_n_coupled); } regulator_unlock(c_rdev); } } static int regulator_lock_recursive(struct regulator_dev *rdev, struct regulator_dev **new_contended_rdev, struct regulator_dev **old_contended_rdev, struct ww_acquire_ctx *ww_ctx) { struct regulator_dev *c_rdev; int i, err; for (i = 0; i < rdev->coupling_desc.n_coupled; i++) { c_rdev = rdev->coupling_desc.coupled_rdevs[i]; if (!c_rdev) continue; if (c_rdev != *old_contended_rdev) { err = regulator_lock_nested(c_rdev, ww_ctx); if (err) { if (err == -EDEADLK) { *new_contended_rdev = c_rdev; goto err_unlock; } /* shouldn't happen */ WARN_ON_ONCE(err != -EALREADY); } } else { *old_contended_rdev = NULL; } if (c_rdev->supply && !regulator_supply_is_couple(c_rdev)) { err = regulator_lock_recursive(c_rdev->supply->rdev, new_contended_rdev, old_contended_rdev, ww_ctx); if (err) { regulator_unlock(c_rdev); goto err_unlock; } } } return 0; err_unlock: regulator_unlock_recursive(rdev, i); return err; } /** * regulator_unlock_dependent - unlock regulator's suppliers and coupled * regulators * @rdev: regulator source * @ww_ctx: w/w mutex acquire context * * Unlock all regulators related with rdev by coupling or supplying. */ static void regulator_unlock_dependent(struct regulator_dev *rdev, struct ww_acquire_ctx *ww_ctx) { regulator_unlock_recursive(rdev, rdev->coupling_desc.n_coupled); ww_acquire_fini(ww_ctx); } /** * regulator_lock_dependent - lock regulator's suppliers and coupled regulators * @rdev: regulator source * @ww_ctx: w/w mutex acquire context * * This function as a wrapper on regulator_lock_recursive(), which locks * all regulators related with rdev by coupling or supplying. */ static void regulator_lock_dependent(struct regulator_dev *rdev, struct ww_acquire_ctx *ww_ctx) { struct regulator_dev *new_contended_rdev = NULL; struct regulator_dev *old_contended_rdev = NULL; int err; mutex_lock(®ulator_list_mutex); ww_acquire_init(ww_ctx, ®ulator_ww_class); do { if (new_contended_rdev) { ww_mutex_lock_slow(&new_contended_rdev->mutex, ww_ctx); old_contended_rdev = new_contended_rdev; old_contended_rdev->ref_cnt++; old_contended_rdev->mutex_owner = current; } err = regulator_lock_recursive(rdev, &new_contended_rdev, &old_contended_rdev, ww_ctx); if (old_contended_rdev) regulator_unlock(old_contended_rdev); } while (err == -EDEADLK); ww_acquire_done(ww_ctx); mutex_unlock(®ulator_list_mutex); } /** * of_get_child_regulator - get a child regulator device node * based on supply name * @parent: Parent device node * @prop_name: Combination regulator supply name and "-supply" * * Traverse all child nodes. * Extract the child regulator device node corresponding to the supply name. * returns the device node corresponding to the regulator if found, else * returns NULL. */ static struct device_node *of_get_child_regulator(struct device_node *parent, const char *prop_name) { struct device_node *regnode = NULL; struct device_node *child = NULL; for_each_child_of_node(parent, child) { regnode = of_parse_phandle(child, prop_name, 0); if (!regnode) { regnode = of_get_child_regulator(child, prop_name); if (regnode) goto err_node_put; } else { goto err_node_put; } } return NULL; err_node_put: of_node_put(child); return regnode; } /** * of_get_regulator - get a regulator device node based on supply name * @dev: Device pointer for the consumer (of regulator) device * @supply: regulator supply name * * Extract the regulator device node corresponding to the supply name. * returns the device node corresponding to the regulator if found, else * returns NULL. */ static struct device_node *of_get_regulator(struct device *dev, const char *supply) { struct device_node *regnode = NULL; char prop_name[64]; /* 64 is max size of property name */ dev_dbg(dev, "Looking up %s-supply from device tree\n", supply); snprintf(prop_name, 64, "%s-supply", supply); regnode = of_parse_phandle(dev->of_node, prop_name, 0); if (!regnode) { regnode = of_get_child_regulator(dev->of_node, prop_name); if (regnode) return regnode; dev_dbg(dev, "Looking up %s property in node %pOF failed\n", prop_name, dev->of_node); return NULL; } return regnode; } /* Platform voltage constraint check */ int regulator_check_voltage(struct regulator_dev *rdev, int *min_uV, int *max_uV) { BUG_ON(*min_uV > *max_uV); if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_VOLTAGE)) { rdev_err(rdev, "voltage operation not allowed\n"); return -EPERM; } if (*max_uV > rdev->constraints->max_uV) *max_uV = rdev->constraints->max_uV; if (*min_uV < rdev->constraints->min_uV) *min_uV = rdev->constraints->min_uV; if (*min_uV > *max_uV) { rdev_err(rdev, "unsupportable voltage range: %d-%duV\n", *min_uV, *max_uV); return -EINVAL; } return 0; } /* return 0 if the state is valid */ static int regulator_check_states(suspend_state_t state) { return (state > PM_SUSPEND_MAX || state == PM_SUSPEND_TO_IDLE); } /* Make sure we select a voltage that suits the needs of all * regulator consumers */ int regulator_check_consumers(struct regulator_dev *rdev, int *min_uV, int *max_uV, suspend_state_t state) { struct regulator *regulator; struct regulator_voltage *voltage; list_for_each_entry(regulator, &rdev->consumer_list, list) { voltage = ®ulator->voltage[state]; /* * Assume consumers that didn't say anything are OK * with anything in the constraint range. */ if (!voltage->min_uV && !voltage->max_uV) continue; if (*max_uV > voltage->max_uV) *max_uV = voltage->max_uV; if (*min_uV < voltage->min_uV) *min_uV = voltage->min_uV; } if (*min_uV > *max_uV) { rdev_err(rdev, "Restricting voltage, %u-%uuV\n", *min_uV, *max_uV); return -EINVAL; } return 0; } /* current constraint check */ static int regulator_check_current_limit(struct regulator_dev *rdev, int *min_uA, int *max_uA) { BUG_ON(*min_uA > *max_uA); if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_CURRENT)) { rdev_err(rdev, "current operation not allowed\n"); return -EPERM; } if (*max_uA > rdev->constraints->max_uA) *max_uA = rdev->constraints->max_uA; if (*min_uA < rdev->constraints->min_uA) *min_uA = rdev->constraints->min_uA; if (*min_uA > *max_uA) { rdev_err(rdev, "unsupportable current range: %d-%duA\n", *min_uA, *max_uA); return -EINVAL; } return 0; } /* operating mode constraint check */ static int regulator_mode_constrain(struct regulator_dev *rdev, unsigned int *mode) { switch (*mode) { case REGULATOR_MODE_FAST: case REGULATOR_MODE_NORMAL: case REGULATOR_MODE_IDLE: case REGULATOR_MODE_STANDBY: break; default: rdev_err(rdev, "invalid mode %x specified\n", *mode); return -EINVAL; } if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_MODE)) { rdev_err(rdev, "mode operation not allowed\n"); return -EPERM; } /* The modes are bitmasks, the most power hungry modes having * the lowest values. If the requested mode isn't supported * try higher modes. */ while (*mode) { if (rdev->constraints->valid_modes_mask & *mode) return 0; *mode /= 2; } return -EINVAL; } static inline struct regulator_state * regulator_get_suspend_state(struct regulator_dev *rdev, suspend_state_t state) { if (rdev->constraints == NULL) return NULL; switch (state) { case PM_SUSPEND_STANDBY: return &rdev->constraints->state_standby; case PM_SUSPEND_MEM: return &rdev->constraints->state_mem; case PM_SUSPEND_MAX: return &rdev->constraints->state_disk; default: return NULL; } } static const struct regulator_state * regulator_get_suspend_state_check(struct regulator_dev *rdev, suspend_state_t state) { const struct regulator_state *rstate; rstate = regulator_get_suspend_state(rdev, state); if (rstate == NULL) return NULL; /* If we have no suspend mode configuration don't set anything; * only warn if the driver implements set_suspend_voltage or * set_suspend_mode callback. */ if (rstate->enabled != ENABLE_IN_SUSPEND && rstate->enabled != DISABLE_IN_SUSPEND) { if (rdev->desc->ops->set_suspend_voltage || rdev->desc->ops->set_suspend_mode) rdev_warn(rdev, "No configuration\n"); return NULL; } return rstate; } static ssize_t microvolts_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); int uV; regulator_lock(rdev); uV = regulator_get_voltage_rdev(rdev); regulator_unlock(rdev); if (uV < 0) return uV; return sprintf(buf, "%d\n", uV); } static DEVICE_ATTR_RO(microvolts); static ssize_t microamps_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return sprintf(buf, "%d\n", _regulator_get_current_limit(rdev)); } static DEVICE_ATTR_RO(microamps); static ssize_t name_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return sprintf(buf, "%s\n", rdev_get_name(rdev)); } static DEVICE_ATTR_RO(name); static const char *regulator_opmode_to_str(int mode) { switch (mode) { case REGULATOR_MODE_FAST: return "fast"; case REGULATOR_MODE_NORMAL: return "normal"; case REGULATOR_MODE_IDLE: return "idle"; case REGULATOR_MODE_STANDBY: return "standby"; } return "unknown"; } static ssize_t regulator_print_opmode(char *buf, int mode) { return sprintf(buf, "%s\n", regulator_opmode_to_str(mode)); } static ssize_t opmode_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return regulator_print_opmode(buf, _regulator_get_mode(rdev)); } static DEVICE_ATTR_RO(opmode); static ssize_t regulator_print_state(char *buf, int state) { if (state > 0) return sprintf(buf, "enabled\n"); else if (state == 0) return sprintf(buf, "disabled\n"); else return sprintf(buf, "unknown\n"); } static ssize_t state_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); ssize_t ret; regulator_lock(rdev); ret = regulator_print_state(buf, _regulator_is_enabled(rdev)); regulator_unlock(rdev); return ret; } static DEVICE_ATTR_RO(state); static ssize_t status_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); int status; char *label; status = rdev->desc->ops->get_status(rdev); if (status < 0) return status; switch (status) { case REGULATOR_STATUS_OFF: label = "off"; break; case REGULATOR_STATUS_ON: label = "on"; break; case REGULATOR_STATUS_ERROR: label = "error"; break; case REGULATOR_STATUS_FAST: label = "fast"; break; case REGULATOR_STATUS_NORMAL: label = "normal"; break; case REGULATOR_STATUS_IDLE: label = "idle"; break; case REGULATOR_STATUS_STANDBY: label = "standby"; break; case REGULATOR_STATUS_BYPASS: label = "bypass"; break; case REGULATOR_STATUS_UNDEFINED: label = "undefined"; break; default: return -ERANGE; } return sprintf(buf, "%s\n", label); } static DEVICE_ATTR_RO(status); static ssize_t min_microamps_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); if (!rdev->constraints) return sprintf(buf, "constraint not defined\n"); return sprintf(buf, "%d\n", rdev->constraints->min_uA); } static DEVICE_ATTR_RO(min_microamps); static ssize_t max_microamps_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); if (!rdev->constraints) return sprintf(buf, "constraint not defined\n"); return sprintf(buf, "%d\n", rdev->constraints->max_uA); } static DEVICE_ATTR_RO(max_microamps); static ssize_t min_microvolts_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); if (!rdev->constraints) return sprintf(buf, "constraint not defined\n"); return sprintf(buf, "%d\n", rdev->constraints->min_uV); } static DEVICE_ATTR_RO(min_microvolts); static ssize_t max_microvolts_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); if (!rdev->constraints) return sprintf(buf, "constraint not defined\n"); return sprintf(buf, "%d\n", rdev->constraints->max_uV); } static DEVICE_ATTR_RO(max_microvolts); static ssize_t requested_microamps_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); struct regulator *regulator; int uA = 0; regulator_lock(rdev); list_for_each_entry(regulator, &rdev->consumer_list, list) { if (regulator->enable_count) uA += regulator->uA_load; } regulator_unlock(rdev); return sprintf(buf, "%d\n", uA); } static DEVICE_ATTR_RO(requested_microamps); static ssize_t num_users_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return sprintf(buf, "%d\n", rdev->use_count); } static DEVICE_ATTR_RO(num_users); static ssize_t type_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); switch (rdev->desc->type) { case REGULATOR_VOLTAGE: return sprintf(buf, "voltage\n"); case REGULATOR_CURRENT: return sprintf(buf, "current\n"); } return sprintf(buf, "unknown\n"); } static DEVICE_ATTR_RO(type); static ssize_t suspend_mem_microvolts_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return sprintf(buf, "%d\n", rdev->constraints->state_mem.uV); } static DEVICE_ATTR_RO(suspend_mem_microvolts); static ssize_t suspend_disk_microvolts_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return sprintf(buf, "%d\n", rdev->constraints->state_disk.uV); } static DEVICE_ATTR_RO(suspend_disk_microvolts); static ssize_t suspend_standby_microvolts_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return sprintf(buf, "%d\n", rdev->constraints->state_standby.uV); } static DEVICE_ATTR_RO(suspend_standby_microvolts); static ssize_t suspend_mem_mode_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return regulator_print_opmode(buf, rdev->constraints->state_mem.mode); } static DEVICE_ATTR_RO(suspend_mem_mode); static ssize_t suspend_disk_mode_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return regulator_print_opmode(buf, rdev->constraints->state_disk.mode); } static DEVICE_ATTR_RO(suspend_disk_mode); static ssize_t suspend_standby_mode_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return regulator_print_opmode(buf, rdev->constraints->state_standby.mode); } static DEVICE_ATTR_RO(suspend_standby_mode); static ssize_t suspend_mem_state_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return regulator_print_state(buf, rdev->constraints->state_mem.enabled); } static DEVICE_ATTR_RO(suspend_mem_state); static ssize_t suspend_disk_state_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return regulator_print_state(buf, rdev->constraints->state_disk.enabled); } static DEVICE_ATTR_RO(suspend_disk_state); static ssize_t suspend_standby_state_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); return regulator_print_state(buf, rdev->constraints->state_standby.enabled); } static DEVICE_ATTR_RO(suspend_standby_state); static ssize_t bypass_show(struct device *dev, struct device_attribute *attr, char *buf) { struct regulator_dev *rdev = dev_get_drvdata(dev); const char *report; bool bypass; int ret; ret = rdev->desc->ops->get_bypass(rdev, &bypass); if (ret != 0) report = "unknown"; else if (bypass) report = "enabled"; else report = "disabled"; return sprintf(buf, "%s\n", report); } static DEVICE_ATTR_RO(bypass); #define REGULATOR_ERROR_ATTR(name, bit) \ static ssize_t name##_show(struct device *dev, struct device_attribute *attr, \ char *buf) \ { \ int ret; \ unsigned int flags; \ struct regulator_dev *rdev = dev_get_drvdata(dev); \ ret = _regulator_get_error_flags(rdev, &flags); \ if (ret) \ return ret; \ return sysfs_emit(buf, "%d\n", !!(flags & (bit))); \ } \ static DEVICE_ATTR_RO(name) REGULATOR_ERROR_ATTR(under_voltage, REGULATOR_ERROR_UNDER_VOLTAGE); REGULATOR_ERROR_ATTR(over_current, REGULATOR_ERROR_OVER_CURRENT); REGULATOR_ERROR_ATTR(regulation_out, REGULATOR_ERROR_REGULATION_OUT); REGULATOR_ERROR_ATTR(fail, REGULATOR_ERROR_FAIL); REGULATOR_ERROR_ATTR(over_temp, REGULATOR_ERROR_OVER_TEMP); REGULATOR_ERROR_ATTR(under_voltage_warn, REGULATOR_ERROR_UNDER_VOLTAGE_WARN); REGULATOR_ERROR_ATTR(over_current_warn, REGULATOR_ERROR_OVER_CURRENT_WARN); REGULATOR_ERROR_ATTR(over_voltage_warn, REGULATOR_ERROR_OVER_VOLTAGE_WARN); REGULATOR_ERROR_ATTR(over_temp_warn, REGULATOR_ERROR_OVER_TEMP_WARN); /* Calculate the new optimum regulator operating mode based on the new total * consumer load. All locks held by caller */ static int drms_uA_update(struct regulator_dev *rdev) { struct regulator *sibling; int current_uA = 0, output_uV, input_uV, err; unsigned int mode; /* * first check to see if we can set modes at all, otherwise just * tell the consumer everything is OK. */ if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_DRMS)) { rdev_dbg(rdev, "DRMS operation not allowed\n"); return 0; } if (!rdev->desc->ops->get_optimum_mode && !rdev->desc->ops->set_load) return 0; if (!rdev->desc->ops->set_mode && !rdev->desc->ops->set_load) return -EINVAL; /* calc total requested load */ list_for_each_entry(sibling, &rdev->consumer_list, list) { if (sibling->enable_count) current_uA += sibling->uA_load; } current_uA += rdev->constraints->system_load; if (rdev->desc->ops->set_load) { /* set the optimum mode for our new total regulator load */ err = rdev->desc->ops->set_load(rdev, current_uA); if (err < 0) rdev_err(rdev, "failed to set load %d: %pe\n", current_uA, ERR_PTR(err)); } else { /* * Unfortunately in some cases the constraints->valid_ops has * REGULATOR_CHANGE_DRMS but there are no valid modes listed. * That's not really legit but we won't consider it a fatal * error here. We'll treat it as if REGULATOR_CHANGE_DRMS * wasn't set. */ if (!rdev->constraints->valid_modes_mask) { rdev_dbg(rdev, "Can change modes; but no valid mode\n"); return 0; } /* get output voltage */ output_uV = regulator_get_voltage_rdev(rdev); /* * Don't return an error; if regulator driver cares about * output_uV then it's up to the driver to validate. */ if (output_uV <= 0) rdev_dbg(rdev, "invalid output voltage found\n"); /* get input voltage */ input_uV = 0; if (rdev->supply) input_uV = regulator_get_voltage_rdev(rdev->supply->rdev); if (input_uV <= 0) input_uV = rdev->constraints->input_uV; /* * Don't return an error; if regulator driver cares about * input_uV then it's up to the driver to validate. */ if (input_uV <= 0) rdev_dbg(rdev, "invalid input voltage found\n"); /* now get the optimum mode for our new total regulator load */ mode = rdev->desc->ops->get_optimum_mode(rdev, input_uV, output_uV, current_uA); /* check the new mode is allowed */ err = regulator_mode_constrain(rdev, &mode); if (err < 0) { rdev_err(rdev, "failed to get optimum mode @ %d uA %d -> %d uV: %pe\n", current_uA, input_uV, output_uV, ERR_PTR(err)); return err; } err = rdev->desc->ops->set_mode(rdev, mode); if (err < 0) rdev_err(rdev, "failed to set optimum mode %x: %pe\n", mode, ERR_PTR(err)); } return err; } static int __suspend_set_state(struct regulator_dev *rdev, const struct regulator_state *rstate) { int ret = 0; if (rstate->enabled == ENABLE_IN_SUSPEND && rdev->desc->ops->set_suspend_enable) ret = rdev->desc->ops->set_suspend_enable(rdev); else if (rstate->enabled == DISABLE_IN_SUSPEND && rdev->desc->ops->set_suspend_disable) ret = rdev->desc->ops->set_suspend_disable(rdev); else /* OK if set_suspend_enable or set_suspend_disable is NULL */ ret = 0; if (ret < 0) { rdev_err(rdev, "failed to enabled/disable: %pe\n", ERR_PTR(ret)); return ret; } if (rdev->desc->ops->set_suspend_voltage && rstate->uV > 0) { ret = rdev->desc->ops->set_suspend_voltage(rdev, rstate->uV); if (ret < 0) { rdev_err(rdev, "failed to set voltage: %pe\n", ERR_PTR(ret)); return ret; } } if (rdev->desc->ops->set_suspend_mode && rstate->mode > 0) { ret = rdev->desc->ops->set_suspend_mode(rdev, rstate->mode); if (ret < 0) { rdev_err(rdev, "failed to set mode: %pe\n", ERR_PTR(ret)); return ret; } } return ret; } static int suspend_set_initial_state(struct regulator_dev *rdev) { const struct regulator_state *rstate; rstate = regulator_get_suspend_state_check(rdev, rdev->constraints->initial_state); if (!rstate) return 0; return __suspend_set_state(rdev, rstate); } #if defined(DEBUG) || defined(CONFIG_DYNAMIC_DEBUG) static void print_constraints_debug(struct regulator_dev *rdev) { struct regulation_constraints *constraints = rdev->constraints; char buf[160] = ""; size_t len = sizeof(buf) - 1; int count = 0; int ret; if (constraints->min_uV && constraints->max_uV) { if (constraints->min_uV == constraints->max_uV) count += scnprintf(buf + count, len - count, "%d mV ", constraints->min_uV / 1000); else count += scnprintf(buf + count, len - count, "%d <--> %d mV ", constraints->min_uV / 1000, constraints->max_uV / 1000); } if (!constraints->min_uV || constraints->min_uV != constraints->max_uV) { ret = regulator_get_voltage_rdev(rdev); if (ret > 0) count += scnprintf(buf + count, len - count, "at %d mV ", ret / 1000); } if (constraints->uV_offset) count += scnprintf(buf + count, len - count, "%dmV offset ", constraints->uV_offset / 1000); if (constraints->min_uA && constraints->max_uA) { if (constraints->min_uA == constraints->max_uA) count += scnprintf(buf + count, len - count, "%d mA ", constraints->min_uA / 1000); else count += scnprintf(buf + count, len - count, "%d <--> %d mA ", constraints->min_uA / 1000, constraints->max_uA / 1000); } if (!constraints->min_uA || constraints->min_uA != constraints->max_uA) { ret = _regulator_get_current_limit(rdev); if (ret > 0) count += scnprintf(buf + count, len - count, "at %d mA ", ret / 1000); } if (constraints->valid_modes_mask & REGULATOR_MODE_FAST) count += scnprintf(buf + count, len - count, "fast "); if (constraints->valid_modes_mask & REGULATOR_MODE_NORMAL) count += scnprintf(buf + count, len - count, "normal "); if (constraints->valid_modes_mask & REGULATOR_MODE_IDLE) count += scnprintf(buf + count, len - count, "idle "); if (constraints->valid_modes_mask & REGULATOR_MODE_STANDBY) count += scnprintf(buf + count, len - count, "standby "); if (!count) count = scnprintf(buf, len, "no parameters"); else --count; count += scnprintf(buf + count, len - count, ", %s", _regulator_is_enabled(rdev) ? "enabled" : "disabled"); rdev_dbg(rdev, "%s\n", buf); } #else /* !DEBUG && !CONFIG_DYNAMIC_DEBUG */ static inline void print_constraints_debug(struct regulator_dev *rdev) {} #endif /* !DEBUG && !CONFIG_DYNAMIC_DEBUG */ static void print_constraints(struct regulator_dev *rdev) { struct regulation_constraints *constraints = rdev->constraints; print_constraints_debug(rdev); if ((constraints->min_uV != constraints->max_uV) && !regulator_ops_is_valid(rdev, REGULATOR_CHANGE_VOLTAGE)) rdev_warn(rdev, "Voltage range but no REGULATOR_CHANGE_VOLTAGE\n"); } static int machine_constraints_voltage(struct regulator_dev *rdev, struct regulation_constraints *constraints) { const struct regulator_ops *ops = rdev->desc->ops; int ret; /* do we need to apply the constraint voltage */ if (rdev->constraints->apply_uV && rdev->constraints->min_uV && rdev->constraints->max_uV) { int target_min, target_max; int current_uV = regulator_get_voltage_rdev(rdev); if (current_uV == -ENOTRECOVERABLE) { /* This regulator can't be read and must be initialized */ rdev_info(rdev, "Setting %d-%duV\n", rdev->constraints->min_uV, rdev->constraints->max_uV); _regulator_do_set_voltage(rdev, rdev->constraints->min_uV, rdev->constraints->max_uV); current_uV = regulator_get_voltage_rdev(rdev); } if (current_uV < 0) { if (current_uV != -EPROBE_DEFER) rdev_err(rdev, "failed to get the current voltage: %pe\n", ERR_PTR(current_uV)); return current_uV; } /* * If we're below the minimum voltage move up to the * minimum voltage, if we're above the maximum voltage * then move down to the maximum. */ target_min = current_uV; target_max = current_uV; if (current_uV < rdev->constraints->min_uV) { target_min = rdev->constraints->min_uV; target_max = rdev->constraints->min_uV; } if (current_uV > rdev->constraints->max_uV) { target_min = rdev->constraints->max_uV; target_max = rdev->constraints->max_uV; } if (target_min != current_uV || target_max != current_uV) { rdev_info(rdev, "Bringing %duV into %d-%duV\n", current_uV, target_min, target_max); ret = _regulator_do_set_voltage( rdev, target_min, target_max); if (ret < 0) { rdev_err(rdev, "failed to apply %d-%duV constraint: %pe\n", target_min, target_max, ERR_PTR(ret)); return ret; } } } /* constrain machine-level voltage specs to fit * the actual range supported by this regulator. */ if (ops->list_voltage && rdev->desc->n_voltages) { int count = rdev->desc->n_voltages; int i; int min_uV = INT_MAX; int max_uV = INT_MIN; int cmin = constraints->min_uV; int cmax = constraints->max_uV; /* it's safe to autoconfigure fixed-voltage supplies * and the constraints are used by list_voltage. */ if (count == 1 && !cmin) { cmin = 1; cmax = INT_MAX; constraints->min_uV = cmin; constraints->max_uV = cmax; } /* voltage constraints are optional */ if ((cmin == 0) && (cmax == 0)) return 0; /* else require explicit machine-level constraints */ if (cmin <= 0 || cmax <= 0 || cmax < cmin) { rdev_err(rdev, "invalid voltage constraints\n"); return -EINVAL; } /* no need to loop voltages if range is continuous */ if (rdev->desc->continuous_voltage_range) return 0; /* initial: [cmin..cmax] valid, [min_uV..max_uV] not */ for (i = 0; i < count; i++) { int value; value = ops->list_voltage(rdev, i); if (value <= 0) continue; /* maybe adjust [min_uV..max_uV] */ if (value >= cmin && value < min_uV) min_uV = value; if (value <= cmax && value > max_uV) max_uV = value; } /* final: [min_uV..max_uV] valid iff constraints valid */ if (max_uV < min_uV) { rdev_err(rdev, "unsupportable voltage constraints %u-%uuV\n", min_uV, max_uV); return -EINVAL; } /* use regulator's subset of machine constraints */ if (constraints->min_uV < min_uV) { rdev_dbg(rdev, "override min_uV, %d -> %d\n", constraints->min_uV, min_uV); constraints->min_uV = min_uV; } if (constraints->max_uV > max_uV) { rdev_dbg(rdev, "override max_uV, %d -> %d\n", constraints->max_uV, max_uV); constraints->max_uV = max_uV; } } return 0; } static int machine_constraints_current(struct regulator_dev *rdev, struct regulation_constraints *constraints) { const struct regulator_ops *ops = rdev->desc->ops; int ret; if (!constraints->min_uA && !constraints->max_uA) return 0; if (constraints->min_uA > constraints->max_uA) { rdev_err(rdev, "Invalid current constraints\n"); return -EINVAL; } if (!ops->set_current_limit || !ops->get_current_limit) { rdev_warn(rdev, "Operation of current configuration missing\n"); return 0; } /* Set regulator current in constraints range */ ret = ops->set_current_limit(rdev, constraints->min_uA, constraints->max_uA); if (ret < 0) { rdev_err(rdev, "Failed to set current constraint, %d\n", ret); return ret; } return 0; } static int _regulator_do_enable(struct regulator_dev *rdev); static int notif_set_limit(struct regulator_dev *rdev, int (*set)(struct regulator_dev *, int, int, bool), int limit, int severity) { bool enable; if (limit == REGULATOR_NOTIF_LIMIT_DISABLE) { enable = false; limit = 0; } else { enable = true; } if (limit == REGULATOR_NOTIF_LIMIT_ENABLE) limit = 0; return set(rdev, limit, severity, enable); } static int handle_notify_limits(struct regulator_dev *rdev, int (*set)(struct regulator_dev *, int, int, bool), struct notification_limit *limits) { int ret = 0; if (!set) return -EOPNOTSUPP; if (limits->prot) ret = notif_set_limit(rdev, set, limits->prot, REGULATOR_SEVERITY_PROT); if (ret) return ret; if (limits->err) ret = notif_set_limit(rdev, set, limits->err, REGULATOR_SEVERITY_ERR); if (ret) return ret; if (limits->warn) ret = notif_set_limit(rdev, set, limits->warn, REGULATOR_SEVERITY_WARN); return ret; } /** * set_machine_constraints - sets regulator constraints * @rdev: regulator source * * Allows platform initialisation code to define and constrain * regulator circuits e.g. valid voltage/current ranges, etc. NOTE: * Constraints *must* be set by platform code in order for some * regulator operations to proceed i.e. set_voltage, set_current_limit, * set_mode. */ static int set_machine_constraints(struct regulator_dev *rdev) { int ret = 0; const struct regulator_ops *ops = rdev->desc->ops; ret = machine_constraints_voltage(rdev, rdev->constraints); if (ret != 0) return ret; ret = machine_constraints_current(rdev, rdev->constraints); if (ret != 0) return ret; if (rdev->constraints->ilim_uA && ops->set_input_current_limit) { ret = ops->set_input_current_limit(rdev, rdev->constraints->ilim_uA); if (ret < 0) { rdev_err(rdev, "failed to set input limit: %pe\n", ERR_PTR(ret)); return ret; } } /* do we need to setup our suspend state */ if (rdev->constraints->initial_state) { ret = suspend_set_initial_state(rdev); if (ret < 0) { rdev_err(rdev, "failed to set suspend state: %pe\n", ERR_PTR(ret)); return ret; } } if (rdev->constraints->initial_mode) { if (!ops->set_mode) { rdev_err(rdev, "no set_mode operation\n"); return -EINVAL; } ret = ops->set_mode(rdev, rdev->constraints->initial_mode); if (ret < 0) { rdev_err(rdev, "failed to set initial mode: %pe\n", ERR_PTR(ret)); return ret; } } else if (rdev->constraints->system_load) { /* * We'll only apply the initial system load if an * initial mode wasn't specified. */ drms_uA_update(rdev); } if ((rdev->constraints->ramp_delay || rdev->constraints->ramp_disable) && ops->set_ramp_delay) { ret = ops->set_ramp_delay(rdev, rdev->constraints->ramp_delay); if (ret < 0) { rdev_err(rdev, "failed to set ramp_delay: %pe\n", ERR_PTR(ret)); return ret; } } if (rdev->constraints->pull_down && ops->set_pull_down) { ret = ops->set_pull_down(rdev); if (ret < 0) { rdev_err(rdev, "failed to set pull down: %pe\n", ERR_PTR(ret)); return ret; } } if (rdev->constraints->soft_start && ops->set_soft_start) { ret = ops->set_soft_start(rdev); if (ret < 0) { rdev_err(rdev, "failed to set soft start: %pe\n", ERR_PTR(ret)); return ret; } } /* * Existing logic does not warn if over_current_protection is given as * a constraint but driver does not support that. I think we should * warn about this type of issues as it is possible someone changes * PMIC on board to another type - and the another PMIC's driver does * not support setting protection. Board composer may happily believe * the DT limits are respected - especially if the new PMIC HW also * supports protection but the driver does not. I won't change the logic * without hearing more experienced opinion on this though. * * If warning is seen as a good idea then we can merge handling the * over-curret protection and detection and get rid of this special * handling. */ if (rdev->constraints->over_current_protection && ops->set_over_current_protection) { int lim = rdev->constraints->over_curr_limits.prot; ret = ops->set_over_current_protection(rdev, lim, REGULATOR_SEVERITY_PROT, true); if (ret < 0) { rdev_err(rdev, "failed to set over current protection: %pe\n", ERR_PTR(ret)); return ret; } } if (rdev->constraints->over_current_detection) ret = handle_notify_limits(rdev, ops->set_over_current_protection, &rdev->constraints->over_curr_limits); if (ret) { if (ret != -EOPNOTSUPP) { rdev_err(rdev, "failed to set over current limits: %pe\n", ERR_PTR(ret)); return ret; } rdev_warn(rdev, "IC does not support requested over-current limits\n"); } if (rdev->constraints->over_voltage_detection) ret = handle_notify_limits(rdev, ops->set_over_voltage_protection, &rdev->constraints->over_voltage_limits); if (ret) { if (ret != -EOPNOTSUPP) { rdev_err(rdev, "failed to set over voltage limits %pe\n", ERR_PTR(ret)); return ret; } rdev_warn(rdev, "IC does not support requested over voltage limits\n"); } if (rdev->constraints->under_voltage_detection) ret = handle_notify_limits(rdev, ops->set_under_voltage_protection, &rdev->constraints->under_voltage_limits); if (ret) { if (ret != -EOPNOTSUPP) { rdev_err(rdev, "failed to set under voltage limits %pe\n", ERR_PTR(ret)); return ret; } rdev_warn(rdev, "IC does not support requested under voltage limits\n"); } if (rdev->constraints->over_temp_detection) ret = handle_notify_limits(rdev, ops->set_thermal_protection, &rdev->constraints->temp_limits); if (ret) { if (ret != -EOPNOTSUPP) { rdev_err(rdev, "failed to set temperature limits %pe\n", ERR_PTR(ret)); return ret; } rdev_warn(rdev, "IC does not support requested temperature limits\n"); } if (rdev->constraints->active_discharge && ops->set_active_discharge) { bool ad_state = (rdev->constraints->active_discharge == REGULATOR_ACTIVE_DISCHARGE_ENABLE) ? true : false; ret = ops->set_active_discharge(rdev, ad_state); if (ret < 0) { rdev_err(rdev, "failed to set active discharge: %pe\n", ERR_PTR(ret)); return ret; } } /* * If there is no mechanism for controlling the regulator then * flag it as always_on so we don't end up duplicating checks * for this so much. Note that we could control the state of * a supply to control the output on a regulator that has no * direct control. */ if (!rdev->ena_pin && !ops->enable) { if (rdev->supply_name && !rdev->supply) return -EPROBE_DEFER; if (rdev->supply) rdev->constraints->always_on = rdev->supply->rdev->constraints->always_on; else rdev->constraints->always_on = true; } /* If the constraints say the regulator should be on at this point * and we have control then make sure it is enabled. */ if (rdev->constraints->always_on || rdev->constraints->boot_on) { /* If we want to enable this regulator, make sure that we know * the supplying regulator. */ if (rdev->supply_name && !rdev->supply) return -EPROBE_DEFER; /* If supplying regulator has already been enabled, * it's not intended to have use_count increment * when rdev is only boot-on. */ if (rdev->supply && (rdev->constraints->always_on || !regulator_is_enabled(rdev->supply))) { ret = regulator_enable(rdev->supply); if (ret < 0) { _regulator_put(rdev->supply); rdev->supply = NULL; return ret; } } ret = _regulator_do_enable(rdev); if (ret < 0 && ret != -EINVAL) { rdev_err(rdev, "failed to enable: %pe\n", ERR_PTR(ret)); return ret; } if (rdev->constraints->always_on) rdev->use_count++; } else if (rdev->desc->off_on_delay) { rdev->last_off = ktime_get(); } print_constraints(rdev); return 0; } /** * set_supply - set regulator supply regulator * @rdev: regulator (locked) * @supply_rdev: supply regulator (locked)) * * Called by platform initialisation code to set the supply regulator for this * regulator. This ensures that a regulators supply will also be enabled by the * core if it's child is enabled. */ static int set_supply(struct regulator_dev *rdev, struct regulator_dev *supply_rdev) { int err; rdev_dbg(rdev, "supplied by %s\n", rdev_get_name(supply_rdev)); if (!try_module_get(supply_rdev->owner)) return -ENODEV; rdev->supply = create_regulator(supply_rdev, &rdev->dev, "SUPPLY"); if (rdev->supply == NULL) { module_put(supply_rdev->owner); err = -ENOMEM; return err; } supply_rdev->open_count++; return 0; } /** * set_consumer_device_supply - Bind a regulator to a symbolic supply * @rdev: regulator source * @consumer_dev_name: dev_name() string for device supply applies to * @supply: symbolic name for supply * * Allows platform initialisation code to map physical regulator * sources to symbolic names for supplies for use by devices. Devices * should use these symbolic names to request regulators, avoiding the * need to provide board-specific regulator names as platform data. */ static int set_consumer_device_supply(struct regulator_dev *rdev, const char *consumer_dev_name, const char *supply) { struct regulator_map *node, *new_node; int has_dev; if (supply == NULL) return -EINVAL; if (consumer_dev_name != NULL) has_dev = 1; else has_dev = 0; new_node = kzalloc(sizeof(struct regulator_map), GFP_KERNEL); if (new_node == NULL) return -ENOMEM; new_node->regulator = rdev; new_node->supply = supply; if (has_dev) { new_node->dev_name = kstrdup(consumer_dev_name, GFP_KERNEL); if (new_node->dev_name == NULL) { kfree(new_node); return -ENOMEM; } } mutex_lock(®ulator_list_mutex); list_for_each_entry(node, ®ulator_map_list, list) { if (node->dev_name && consumer_dev_name) { if (strcmp(node->dev_name, consumer_dev_name) != 0) continue; } else if (node->dev_name || consumer_dev_name) { continue; } if (strcmp(node->supply, supply) != 0) continue; pr_debug("%s: %s/%s is '%s' supply; fail %s/%s\n", consumer_dev_name, dev_name(&node->regulator->dev), node->regulator->desc->name, supply, dev_name(&rdev->dev), rdev_get_name(rdev)); goto fail; } list_add(&new_node->list, ®ulator_map_list); mutex_unlock(®ulator_list_mutex); return 0; fail: mutex_unlock(®ulator_list_mutex); kfree(new_node->dev_name); kfree(new_node); return -EBUSY; } static void unset_regulator_supplies(struct regulator_dev *rdev) { struct regulator_map *node, *n; list_for_each_entry_safe(node, n, ®ulator_map_list, list) { if (rdev == node->regulator) { list_del(&node->list); kfree(node->dev_name); kfree(node); } } } #ifdef CONFIG_DEBUG_FS static ssize_t constraint_flags_read_file(struct file *file, char __user *user_buf, size_t count, loff_t *ppos) { const struct regulator *regulator = file->private_data; const struct regulation_constraints *c = regulator->rdev->constraints; char *buf; ssize_t ret; if (!c) return 0; buf = kmalloc(PAGE_SIZE, GFP_KERNEL); if (!buf) return -ENOMEM; ret = snprintf(buf, PAGE_SIZE, "always_on: %u\n" "boot_on: %u\n" "apply_uV: %u\n" "ramp_disable: %u\n" "soft_start: %u\n" "pull_down: %u\n" "over_current_protection: %u\n", c->always_on, c->boot_on, c->apply_uV, c->ramp_disable, c->soft_start, c->pull_down, c->over_current_protection); ret = simple_read_from_buffer(user_buf, count, ppos, buf, ret); kfree(buf); return ret; } #endif static const struct file_operations constraint_flags_fops = { #ifdef CONFIG_DEBUG_FS .open = simple_open, .read = constraint_flags_read_file, .llseek = default_llseek, #endif }; #define REG_STR_SIZE 64 static struct regulator *create_regulator(struct regulator_dev *rdev, struct device *dev, const char *supply_name) { struct regulator *regulator; int err = 0; lockdep_assert_held_once(&rdev->mutex.base); if (dev) { char buf[REG_STR_SIZE]; int size; size = snprintf(buf, REG_STR_SIZE, "%s-%s", dev->kobj.name, supply_name); if (size >= REG_STR_SIZE) return NULL; supply_name = kstrdup(buf, GFP_KERNEL); if (supply_name == NULL) return NULL; } else { supply_name = kstrdup_const(supply_name, GFP_KERNEL); if (supply_name == NULL) return NULL; } regulator = kzalloc(sizeof(*regulator), GFP_KERNEL); if (regulator == NULL) { kfree_const(supply_name); return NULL; } regulator->rdev = rdev; regulator->supply_name = supply_name; list_add(®ulator->list, &rdev->consumer_list); if (dev) { regulator->dev = dev; /* Add a link to the device sysfs entry */ err = sysfs_create_link_nowarn(&rdev->dev.kobj, &dev->kobj, supply_name); if (err) { rdev_dbg(rdev, "could not add device link %s: %pe\n", dev->kobj.name, ERR_PTR(err)); /* non-fatal */ } } if (err != -EEXIST) regulator->debugfs = debugfs_create_dir(supply_name, rdev->debugfs); if (IS_ERR(regulator->debugfs)) rdev_dbg(rdev, "Failed to create debugfs directory\n"); debugfs_create_u32("uA_load", 0444, regulator->debugfs, ®ulator->uA_load); debugfs_create_u32("min_uV", 0444, regulator->debugfs, ®ulator->voltage[PM_SUSPEND_ON].min_uV); debugfs_create_u32("max_uV", 0444, regulator->debugfs, ®ulator->voltage[PM_SUSPEND_ON].max_uV); debugfs_create_file("constraint_flags", 0444, regulator->debugfs, regulator, &constraint_flags_fops); /* * Check now if the regulator is an always on regulator - if * it is then we don't need to do nearly so much work for * enable/disable calls. */ if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_STATUS) && _regulator_is_enabled(rdev)) regulator->always_on = true; return regulator; } static int _regulator_get_enable_time(struct regulator_dev *rdev) { if (rdev->constraints && rdev->constraints->enable_time) return rdev->constraints->enable_time; if (rdev->desc->ops->enable_time) return rdev->desc->ops->enable_time(rdev); return rdev->desc->enable_time; } static struct regulator_supply_alias *regulator_find_supply_alias( struct device *dev, const char *supply) { struct regulator_supply_alias *map; list_for_each_entry(map, ®ulator_supply_alias_list, list) if (map->src_dev == dev && strcmp(map->src_supply, supply) == 0) return map; return NULL; } static void regulator_supply_alias(struct device **dev, const char **supply) { struct regulator_supply_alias *map; map = regulator_find_supply_alias(*dev, *supply); if (map) { dev_dbg(*dev, "Mapping supply %s to %s,%s\n", *supply, map->alias_supply, dev_name(map->alias_dev)); *dev = map->alias_dev; *supply = map->alias_supply; } } static int regulator_match(struct device *dev, const void *data) { struct regulator_dev *r = dev_to_rdev(dev); return strcmp(rdev_get_name(r), data) == 0; } static struct regulator_dev *regulator_lookup_by_name(const char *name) { struct device *dev; dev = class_find_device(®ulator_class, NULL, name, regulator_match); return dev ? dev_to_rdev(dev) : NULL; } /** * regulator_dev_lookup - lookup a regulator device. * @dev: device for regulator "consumer". * @supply: Supply name or regulator ID. * * If successful, returns a struct regulator_dev that corresponds to the name * @supply and with the embedded struct device refcount incremented by one. * The refcount must be dropped by calling put_device(). * On failure one of the following ERR-PTR-encoded values is returned: * -ENODEV if lookup fails permanently, -EPROBE_DEFER if lookup could succeed * in the future. */ static struct regulator_dev *regulator_dev_lookup(struct device *dev, const char *supply) { struct regulator_dev *r = NULL; struct device_node *node; struct regulator_map *map; const char *devname = NULL; regulator_supply_alias(&dev, &supply); /* first do a dt based lookup */ if (dev && dev->of_node) { node = of_get_regulator(dev, supply); if (node) { r = of_find_regulator_by_node(node); of_node_put(node); if (r) return r; /* * We have a node, but there is no device. * assume it has not registered yet. */ return ERR_PTR(-EPROBE_DEFER); } } /* if not found, try doing it non-dt way */ if (dev) devname = dev_name(dev); mutex_lock(®ulator_list_mutex); list_for_each_entry(map, ®ulator_map_list, list) { /* If the mapping has a device set up it must match */ if (map->dev_name && (!devname || strcmp(map->dev_name, devname))) continue; if (strcmp(map->supply, supply) == 0 && get_device(&map->regulator->dev)) { r = map->regulator; break; } } mutex_unlock(®ulator_list_mutex); if (r) return r; r = regulator_lookup_by_name(supply); if (r) return r; return ERR_PTR(-ENODEV); } static int regulator_resolve_supply(struct regulator_dev *rdev) { struct regulator_dev *r; struct device *dev = rdev->dev.parent; struct ww_acquire_ctx ww_ctx; int ret = 0; /* No supply to resolve? */ if (!rdev->supply_name) return 0; /* Supply already resolved? (fast-path without locking contention) */ if (rdev->supply) return 0; r = regulator_dev_lookup(dev, rdev->supply_name); if (IS_ERR(r)) { ret = PTR_ERR(r); /* Did the lookup explicitly defer for us? */ if (ret == -EPROBE_DEFER) goto out; if (have_full_constraints()) { r = dummy_regulator_rdev; get_device(&r->dev); } else { dev_err(dev, "Failed to resolve %s-supply for %s\n", rdev->supply_name, rdev->desc->name); ret = -EPROBE_DEFER; goto out; } } if (r == rdev) { dev_err(dev, "Supply for %s (%s) resolved to itself\n", rdev->desc->name, rdev->supply_name); if (!have_full_constraints()) { ret = -EINVAL; goto out; } r = dummy_regulator_rdev; get_device(&r->dev); } /* * If the supply's parent device is not the same as the * regulator's parent device, then ensure the parent device * is bound before we resolve the supply, in case the parent * device get probe deferred and unregisters the supply. */ if (r->dev.parent && r->dev.parent != rdev->dev.parent) { if (!device_is_bound(r->dev.parent)) { put_device(&r->dev); ret = -EPROBE_DEFER; goto out; } } /* Recursively resolve the supply of the supply */ ret = regulator_resolve_supply(r); if (ret < 0) { put_device(&r->dev); goto out; } /* * Recheck rdev->supply with rdev->mutex lock held to avoid a race * between rdev->supply null check and setting rdev->supply in * set_supply() from concurrent tasks. */ regulator_lock_two(rdev, r, &ww_ctx); /* Supply just resolved by a concurrent task? */ if (rdev->supply) { regulator_unlock_two(rdev, r, &ww_ctx); put_device(&r->dev); goto out; } ret = set_supply(rdev, r); if (ret < 0) { regulator_unlock_two(rdev, r, &ww_ctx); put_device(&r->dev); goto out; } regulator_unlock_two(rdev, r, &ww_ctx); /* * In set_machine_constraints() we may have turned this regulator on * but we couldn't propagate to the supply if it hadn't been resolved * yet. Do it now. */ if (rdev->use_count) { ret = regulator_enable(rdev->supply); if (ret < 0) { _regulator_put(rdev->supply); rdev->supply = NULL; goto out; } } out: return ret; } /* Internal regulator request function */ struct regulator *_regulator_get(struct device *dev, const char *id, enum regulator_get_type get_type) { struct regulator_dev *rdev; struct regulator *regulator; struct device_link *link; int ret; if (get_type >= MAX_GET_TYPE) { dev_err(dev, "invalid type %d in %s\n", get_type, __func__); return ERR_PTR(-EINVAL); } if (id == NULL) { pr_err("get() with no identifier\n"); return ERR_PTR(-EINVAL); } rdev = regulator_dev_lookup(dev, id); if (IS_ERR(rdev)) { ret = PTR_ERR(rdev); /* * If regulator_dev_lookup() fails with error other * than -ENODEV our job here is done, we simply return it. */ if (ret != -ENODEV) return ERR_PTR(ret); if (!have_full_constraints()) { dev_warn(dev, "incomplete constraints, dummy supplies not allowed\n"); return ERR_PTR(-ENODEV); } switch (get_type) { case NORMAL_GET: /* * Assume that a regulator is physically present and * enabled, even if it isn't hooked up, and just * provide a dummy. */ dev_warn(dev, "supply %s not found, using dummy regulator\n", id); rdev = dummy_regulator_rdev; get_device(&rdev->dev); break; case EXCLUSIVE_GET: dev_warn(dev, "dummy supplies not allowed for exclusive requests\n"); fallthrough; default: return ERR_PTR(-ENODEV); } } if (rdev->exclusive) { regulator = ERR_PTR(-EPERM); put_device(&rdev->dev); return regulator; } if (get_type == EXCLUSIVE_GET && rdev->open_count) { regulator = ERR_PTR(-EBUSY); put_device(&rdev->dev); return regulator; } mutex_lock(®ulator_list_mutex); ret = (rdev->coupling_desc.n_resolved != rdev->coupling_desc.n_coupled); mutex_unlock(®ulator_list_mutex); if (ret != 0) { regulator = ERR_PTR(-EPROBE_DEFER); put_device(&rdev->dev); return regulator; } ret = regulator_resolve_supply(rdev); if (ret < 0) { regulator = ERR_PTR(ret); put_device(&rdev->dev); return regulator; } if (!try_module_get(rdev->owner)) { regulator = ERR_PTR(-EPROBE_DEFER); put_device(&rdev->dev); return regulator; } regulator_lock(rdev); regulator = create_regulator(rdev, dev, id); regulator_unlock(rdev); if (regulator == NULL) { regulator = ERR_PTR(-ENOMEM); module_put(rdev->owner); put_device(&rdev->dev); return regulator; } rdev->open_count++; if (get_type == EXCLUSIVE_GET) { rdev->exclusive = 1; ret = _regulator_is_enabled(rdev); if (ret > 0) { rdev->use_count = 1; regulator->enable_count = 1; } else { rdev->use_count = 0; regulator->enable_count = 0; } } link = device_link_add(dev, &rdev->dev, DL_FLAG_STATELESS); if (!IS_ERR_OR_NULL(link)) regulator->device_link = true; return regulator; } /** * regulator_get - lookup and obtain a reference to a regulator. * @dev: device for regulator "consumer" * @id: Supply name or regulator ID. * * Returns a struct regulator corresponding to the regulator producer, * or IS_ERR() condition containing errno. * * Use of supply names configured via set_consumer_device_supply() is * strongly encouraged. It is recommended that the supply name used * should match the name used for the supply and/or the relevant * device pins in the datasheet. */ struct regulator *regulator_get(struct device *dev, const char *id) { return _regulator_get(dev, id, NORMAL_GET); } EXPORT_SYMBOL_GPL(regulator_get); /** * regulator_get_exclusive - obtain exclusive access to a regulator. * @dev: device for regulator "consumer" * @id: Supply name or regulator ID. * * Returns a struct regulator corresponding to the regulator producer, * or IS_ERR() condition containing errno. Other consumers will be * unable to obtain this regulator while this reference is held and the * use count for the regulator will be initialised to reflect the current * state of the regulator. * * This is intended for use by consumers which cannot tolerate shared * use of the regulator such as those which need to force the * regulator off for correct operation of the hardware they are * controlling. * * Use of supply names configured via set_consumer_device_supply() is * strongly encouraged. It is recommended that the supply name used * should match the name used for the supply and/or the relevant * device pins in the datasheet. */ struct regulator *regulator_get_exclusive(struct device *dev, const char *id) { return _regulator_get(dev, id, EXCLUSIVE_GET); } EXPORT_SYMBOL_GPL(regulator_get_exclusive); /** * regulator_get_optional - obtain optional access to a regulator. * @dev: device for regulator "consumer" * @id: Supply name or regulator ID. * * Returns a struct regulator corresponding to the regulator producer, * or IS_ERR() condition containing errno. * * This is intended for use by consumers for devices which can have * some supplies unconnected in normal use, such as some MMC devices. * It can allow the regulator core to provide stub supplies for other * supplies requested using normal regulator_get() calls without * disrupting the operation of drivers that can handle absent * supplies. * * Use of supply names configured via set_consumer_device_supply() is * strongly encouraged. It is recommended that the supply name used * should match the name used for the supply and/or the relevant * device pins in the datasheet. */ struct regulator *regulator_get_optional(struct device *dev, const char *id) { return _regulator_get(dev, id, OPTIONAL_GET); } EXPORT_SYMBOL_GPL(regulator_get_optional); static void destroy_regulator(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; debugfs_remove_recursive(regulator->debugfs); if (regulator->dev) { if (regulator->device_link) device_link_remove(regulator->dev, &rdev->dev); /* remove any sysfs entries */ sysfs_remove_link(&rdev->dev.kobj, regulator->supply_name); } regulator_lock(rdev); list_del(®ulator->list); rdev->open_count--; rdev->exclusive = 0; regulator_unlock(rdev); kfree_const(regulator->supply_name); kfree(regulator); } /* regulator_list_mutex lock held by regulator_put() */ static void _regulator_put(struct regulator *regulator) { struct regulator_dev *rdev; if (IS_ERR_OR_NULL(regulator)) return; lockdep_assert_held_once(®ulator_list_mutex); /* Docs say you must disable before calling regulator_put() */ WARN_ON(regulator->enable_count); rdev = regulator->rdev; destroy_regulator(regulator); module_put(rdev->owner); put_device(&rdev->dev); } /** * regulator_put - "free" the regulator source * @regulator: regulator source * * Note: drivers must ensure that all regulator_enable calls made on this * regulator source are balanced by regulator_disable calls prior to calling * this function. */ void regulator_put(struct regulator *regulator) { mutex_lock(®ulator_list_mutex); _regulator_put(regulator); mutex_unlock(®ulator_list_mutex); } EXPORT_SYMBOL_GPL(regulator_put); /** * regulator_register_supply_alias - Provide device alias for supply lookup * * @dev: device that will be given as the regulator "consumer" * @id: Supply name or regulator ID * @alias_dev: device that should be used to lookup the supply * @alias_id: Supply name or regulator ID that should be used to lookup the * supply * * All lookups for id on dev will instead be conducted for alias_id on * alias_dev. */ int regulator_register_supply_alias(struct device *dev, const char *id, struct device *alias_dev, const char *alias_id) { struct regulator_supply_alias *map; map = regulator_find_supply_alias(dev, id); if (map) return -EEXIST; map = kzalloc(sizeof(struct regulator_supply_alias), GFP_KERNEL); if (!map) return -ENOMEM; map->src_dev = dev; map->src_supply = id; map->alias_dev = alias_dev; map->alias_supply = alias_id; list_add(&map->list, ®ulator_supply_alias_list); pr_info("Adding alias for supply %s,%s -> %s,%s\n", id, dev_name(dev), alias_id, dev_name(alias_dev)); return 0; } EXPORT_SYMBOL_GPL(regulator_register_supply_alias); /** * regulator_unregister_supply_alias - Remove device alias * * @dev: device that will be given as the regulator "consumer" * @id: Supply name or regulator ID * * Remove a lookup alias if one exists for id on dev. */ void regulator_unregister_supply_alias(struct device *dev, const char *id) { struct regulator_supply_alias *map; map = regulator_find_supply_alias(dev, id); if (map) { list_del(&map->list); kfree(map); } } EXPORT_SYMBOL_GPL(regulator_unregister_supply_alias); /** * regulator_bulk_register_supply_alias - register multiple aliases * * @dev: device that will be given as the regulator "consumer" * @id: List of supply names or regulator IDs * @alias_dev: device that should be used to lookup the supply * @alias_id: List of supply names or regulator IDs that should be used to * lookup the supply * @num_id: Number of aliases to register * * @return 0 on success, an errno on failure. * * This helper function allows drivers to register several supply * aliases in one operation. If any of the aliases cannot be * registered any aliases that were registered will be removed * before returning to the caller. */ int regulator_bulk_register_supply_alias(struct device *dev, const char *const *id, struct device *alias_dev, const char *const *alias_id, int num_id) { int i; int ret; for (i = 0; i < num_id; ++i) { ret = regulator_register_supply_alias(dev, id[i], alias_dev, alias_id[i]); if (ret < 0) goto err; } return 0; err: dev_err(dev, "Failed to create supply alias %s,%s -> %s,%s\n", id[i], dev_name(dev), alias_id[i], dev_name(alias_dev)); while (--i >= 0) regulator_unregister_supply_alias(dev, id[i]); return ret; } EXPORT_SYMBOL_GPL(regulator_bulk_register_supply_alias); /** * regulator_bulk_unregister_supply_alias - unregister multiple aliases * * @dev: device that will be given as the regulator "consumer" * @id: List of supply names or regulator IDs * @num_id: Number of aliases to unregister * * This helper function allows drivers to unregister several supply * aliases in one operation. */ void regulator_bulk_unregister_supply_alias(struct device *dev, const char *const *id, int num_id) { int i; for (i = 0; i < num_id; ++i) regulator_unregister_supply_alias(dev, id[i]); } EXPORT_SYMBOL_GPL(regulator_bulk_unregister_supply_alias); /* Manage enable GPIO list. Same GPIO pin can be shared among regulators */ static int regulator_ena_gpio_request(struct regulator_dev *rdev, const struct regulator_config *config) { struct regulator_enable_gpio *pin, *new_pin; struct gpio_desc *gpiod; gpiod = config->ena_gpiod; new_pin = kzalloc(sizeof(*new_pin), GFP_KERNEL); mutex_lock(®ulator_list_mutex); list_for_each_entry(pin, ®ulator_ena_gpio_list, list) { if (pin->gpiod == gpiod) { rdev_dbg(rdev, "GPIO is already used\n"); goto update_ena_gpio_to_rdev; } } if (new_pin == NULL) { mutex_unlock(®ulator_list_mutex); return -ENOMEM; } pin = new_pin; new_pin = NULL; pin->gpiod = gpiod; list_add(&pin->list, ®ulator_ena_gpio_list); update_ena_gpio_to_rdev: pin->request_count++; rdev->ena_pin = pin; mutex_unlock(®ulator_list_mutex); kfree(new_pin); return 0; } static void regulator_ena_gpio_free(struct regulator_dev *rdev) { struct regulator_enable_gpio *pin, *n; if (!rdev->ena_pin) return; /* Free the GPIO only in case of no use */ list_for_each_entry_safe(pin, n, ®ulator_ena_gpio_list, list) { if (pin != rdev->ena_pin) continue; if (--pin->request_count) break; gpiod_put(pin->gpiod); list_del(&pin->list); kfree(pin); break; } rdev->ena_pin = NULL; } /** * regulator_ena_gpio_ctrl - balance enable_count of each GPIO and actual GPIO pin control * @rdev: regulator_dev structure * @enable: enable GPIO at initial use? * * GPIO is enabled in case of initial use. (enable_count is 0) * GPIO is disabled when it is not shared any more. (enable_count <= 1) */ static int regulator_ena_gpio_ctrl(struct regulator_dev *rdev, bool enable) { struct regulator_enable_gpio *pin = rdev->ena_pin; if (!pin) return -EINVAL; if (enable) { /* Enable GPIO at initial use */ if (pin->enable_count == 0) gpiod_set_value_cansleep(pin->gpiod, 1); pin->enable_count++; } else { if (pin->enable_count > 1) { pin->enable_count--; return 0; } /* Disable GPIO if not used */ if (pin->enable_count <= 1) { gpiod_set_value_cansleep(pin->gpiod, 0); pin->enable_count = 0; } } return 0; } /** * _regulator_delay_helper - a delay helper function * @delay: time to delay in microseconds * * Delay for the requested amount of time as per the guidelines in: * * Documentation/timers/timers-howto.rst * * The assumption here is that these regulator operations will never used in * atomic context and therefore sleeping functions can be used. */ static void _regulator_delay_helper(unsigned int delay) { unsigned int ms = delay / 1000; unsigned int us = delay % 1000; if (ms > 0) { /* * For small enough values, handle super-millisecond * delays in the usleep_range() call below. */ if (ms < 20) us += ms * 1000; else msleep(ms); } /* * Give the scheduler some room to coalesce with any other * wakeup sources. For delays shorter than 10 us, don't even * bother setting up high-resolution timers and just busy- * loop. */ if (us >= 10) usleep_range(us, us + 100); else udelay(us); } /** * _regulator_check_status_enabled * * A helper function to check if the regulator status can be interpreted * as 'regulator is enabled'. * @rdev: the regulator device to check * * Return: * * 1 - if status shows regulator is in enabled state * * 0 - if not enabled state * * Error Value - as received from ops->get_status() */ static inline int _regulator_check_status_enabled(struct regulator_dev *rdev) { int ret = rdev->desc->ops->get_status(rdev); if (ret < 0) { rdev_info(rdev, "get_status returned error: %d\n", ret); return ret; } switch (ret) { case REGULATOR_STATUS_OFF: case REGULATOR_STATUS_ERROR: case REGULATOR_STATUS_UNDEFINED: return 0; default: return 1; } } static int _regulator_do_enable(struct regulator_dev *rdev) { int ret, delay; /* Query before enabling in case configuration dependent. */ ret = _regulator_get_enable_time(rdev); if (ret >= 0) { delay = ret; } else { rdev_warn(rdev, "enable_time() failed: %pe\n", ERR_PTR(ret)); delay = 0; } trace_regulator_enable(rdev_get_name(rdev)); if (rdev->desc->off_on_delay) { /* if needed, keep a distance of off_on_delay from last time * this regulator was disabled. */ ktime_t end = ktime_add_us(rdev->last_off, rdev->desc->off_on_delay); s64 remaining = ktime_us_delta(end, ktime_get_boottime()); if (remaining > 0) _regulator_delay_helper(remaining); } if (rdev->ena_pin) { if (!rdev->ena_gpio_state) { ret = regulator_ena_gpio_ctrl(rdev, true); if (ret < 0) return ret; rdev->ena_gpio_state = 1; } } else if (rdev->desc->ops->enable) { ret = rdev->desc->ops->enable(rdev); if (ret < 0) return ret; } else { return -EINVAL; } /* Allow the regulator to ramp; it would be useful to extend * this for bulk operations so that the regulators can ramp * together. */ trace_regulator_enable_delay(rdev_get_name(rdev)); /* If poll_enabled_time is set, poll upto the delay calculated * above, delaying poll_enabled_time uS to check if the regulator * actually got enabled. * If the regulator isn't enabled after our delay helper has expired, * return -ETIMEDOUT. */ if (rdev->desc->poll_enabled_time) { int time_remaining = delay; while (time_remaining > 0) { _regulator_delay_helper(rdev->desc->poll_enabled_time); if (rdev->desc->ops->get_status) { ret = _regulator_check_status_enabled(rdev); if (ret < 0) return ret; else if (ret) break; } else if (rdev->desc->ops->is_enabled(rdev)) break; time_remaining -= rdev->desc->poll_enabled_time; } if (time_remaining <= 0) { rdev_err(rdev, "Enabled check timed out\n"); return -ETIMEDOUT; } } else { _regulator_delay_helper(delay); } trace_regulator_enable_complete(rdev_get_name(rdev)); return 0; } /** * _regulator_handle_consumer_enable - handle that a consumer enabled * @regulator: regulator source * * Some things on a regulator consumer (like the contribution towards total * load on the regulator) only have an effect when the consumer wants the * regulator enabled. Explained in example with two consumers of the same * regulator: * consumer A: set_load(100); => total load = 0 * consumer A: regulator_enable(); => total load = 100 * consumer B: set_load(1000); => total load = 100 * consumer B: regulator_enable(); => total load = 1100 * consumer A: regulator_disable(); => total_load = 1000 * * This function (together with _regulator_handle_consumer_disable) is * responsible for keeping track of the refcount for a given regulator consumer * and applying / unapplying these things. * * Returns 0 upon no error; -error upon error. */ static int _regulator_handle_consumer_enable(struct regulator *regulator) { int ret; struct regulator_dev *rdev = regulator->rdev; lockdep_assert_held_once(&rdev->mutex.base); regulator->enable_count++; if (regulator->uA_load && regulator->enable_count == 1) { ret = drms_uA_update(rdev); if (ret) regulator->enable_count--; return ret; } return 0; } /** * _regulator_handle_consumer_disable - handle that a consumer disabled * @regulator: regulator source * * The opposite of _regulator_handle_consumer_enable(). * * Returns 0 upon no error; -error upon error. */ static int _regulator_handle_consumer_disable(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; lockdep_assert_held_once(&rdev->mutex.base); if (!regulator->enable_count) { rdev_err(rdev, "Underflow of regulator enable count\n"); return -EINVAL; } regulator->enable_count--; if (regulator->uA_load && regulator->enable_count == 0) return drms_uA_update(rdev); return 0; } /* locks held by regulator_enable() */ static int _regulator_enable(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; int ret; lockdep_assert_held_once(&rdev->mutex.base); if (rdev->use_count == 0 && rdev->supply) { ret = _regulator_enable(rdev->supply); if (ret < 0) return ret; } /* balance only if there are regulators coupled */ if (rdev->coupling_desc.n_coupled > 1) { ret = regulator_balance_voltage(rdev, PM_SUSPEND_ON); if (ret < 0) goto err_disable_supply; } ret = _regulator_handle_consumer_enable(regulator); if (ret < 0) goto err_disable_supply; if (rdev->use_count == 0) { /* * The regulator may already be enabled if it's not switchable * or was left on */ ret = _regulator_is_enabled(rdev); if (ret == -EINVAL || ret == 0) { if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_STATUS)) { ret = -EPERM; goto err_consumer_disable; } ret = _regulator_do_enable(rdev); if (ret < 0) goto err_consumer_disable; _notifier_call_chain(rdev, REGULATOR_EVENT_ENABLE, NULL); } else if (ret < 0) { rdev_err(rdev, "is_enabled() failed: %pe\n", ERR_PTR(ret)); goto err_consumer_disable; } /* Fallthrough on positive return values - already enabled */ } if (regulator->enable_count == 1) rdev->use_count++; return 0; err_consumer_disable: _regulator_handle_consumer_disable(regulator); err_disable_supply: if (rdev->use_count == 0 && rdev->supply) _regulator_disable(rdev->supply); return ret; } /** * regulator_enable - enable regulator output * @regulator: regulator source * * Request that the regulator be enabled with the regulator output at * the predefined voltage or current value. Calls to regulator_enable() * must be balanced with calls to regulator_disable(). * * NOTE: the output value can be set by other drivers, boot loader or may be * hardwired in the regulator. */ int regulator_enable(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; struct ww_acquire_ctx ww_ctx; int ret; regulator_lock_dependent(rdev, &ww_ctx); ret = _regulator_enable(regulator); regulator_unlock_dependent(rdev, &ww_ctx); return ret; } EXPORT_SYMBOL_GPL(regulator_enable); static int _regulator_do_disable(struct regulator_dev *rdev) { int ret; trace_regulator_disable(rdev_get_name(rdev)); if (rdev->ena_pin) { if (rdev->ena_gpio_state) { ret = regulator_ena_gpio_ctrl(rdev, false); if (ret < 0) return ret; rdev->ena_gpio_state = 0; } } else if (rdev->desc->ops->disable) { ret = rdev->desc->ops->disable(rdev); if (ret != 0) return ret; } if (rdev->desc->off_on_delay) rdev->last_off = ktime_get_boottime(); trace_regulator_disable_complete(rdev_get_name(rdev)); return 0; } /* locks held by regulator_disable() */ static int _regulator_disable(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; int ret = 0; lockdep_assert_held_once(&rdev->mutex.base); if (WARN(regulator->enable_count == 0, "unbalanced disables for %s\n", rdev_get_name(rdev))) return -EIO; if (regulator->enable_count == 1) { /* disabling last enable_count from this regulator */ /* are we the last user and permitted to disable ? */ if (rdev->use_count == 1 && (rdev->constraints && !rdev->constraints->always_on)) { /* we are last user */ if (regulator_ops_is_valid(rdev, REGULATOR_CHANGE_STATUS)) { ret = _notifier_call_chain(rdev, REGULATOR_EVENT_PRE_DISABLE, NULL); if (ret & NOTIFY_STOP_MASK) return -EINVAL; ret = _regulator_do_disable(rdev); if (ret < 0) { rdev_err(rdev, "failed to disable: %pe\n", ERR_PTR(ret)); _notifier_call_chain(rdev, REGULATOR_EVENT_ABORT_DISABLE, NULL); return ret; } _notifier_call_chain(rdev, REGULATOR_EVENT_DISABLE, NULL); } rdev->use_count = 0; } else if (rdev->use_count > 1) { rdev->use_count--; } } if (ret == 0) ret = _regulator_handle_consumer_disable(regulator); if (ret == 0 && rdev->coupling_desc.n_coupled > 1) ret = regulator_balance_voltage(rdev, PM_SUSPEND_ON); if (ret == 0 && rdev->use_count == 0 && rdev->supply) ret = _regulator_disable(rdev->supply); return ret; } /** * regulator_disable - disable regulator output * @regulator: regulator source * * Disable the regulator output voltage or current. Calls to * regulator_enable() must be balanced with calls to * regulator_disable(). * * NOTE: this will only disable the regulator output if no other consumer * devices have it enabled, the regulator device supports disabling and * machine constraints permit this operation. */ int regulator_disable(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; struct ww_acquire_ctx ww_ctx; int ret; regulator_lock_dependent(rdev, &ww_ctx); ret = _regulator_disable(regulator); regulator_unlock_dependent(rdev, &ww_ctx); return ret; } EXPORT_SYMBOL_GPL(regulator_disable); /* locks held by regulator_force_disable() */ static int _regulator_force_disable(struct regulator_dev *rdev) { int ret = 0; lockdep_assert_held_once(&rdev->mutex.base); ret = _notifier_call_chain(rdev, REGULATOR_EVENT_FORCE_DISABLE | REGULATOR_EVENT_PRE_DISABLE, NULL); if (ret & NOTIFY_STOP_MASK) return -EINVAL; ret = _regulator_do_disable(rdev); if (ret < 0) { rdev_err(rdev, "failed to force disable: %pe\n", ERR_PTR(ret)); _notifier_call_chain(rdev, REGULATOR_EVENT_FORCE_DISABLE | REGULATOR_EVENT_ABORT_DISABLE, NULL); return ret; } _notifier_call_chain(rdev, REGULATOR_EVENT_FORCE_DISABLE | REGULATOR_EVENT_DISABLE, NULL); return 0; } /** * regulator_force_disable - force disable regulator output * @regulator: regulator source * * Forcibly disable the regulator output voltage or current. * NOTE: this *will* disable the regulator output even if other consumer * devices have it enabled. This should be used for situations when device * damage will likely occur if the regulator is not disabled (e.g. over temp). */ int regulator_force_disable(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; struct ww_acquire_ctx ww_ctx; int ret; regulator_lock_dependent(rdev, &ww_ctx); ret = _regulator_force_disable(regulator->rdev); if (rdev->coupling_desc.n_coupled > 1) regulator_balance_voltage(rdev, PM_SUSPEND_ON); if (regulator->uA_load) { regulator->uA_load = 0; ret = drms_uA_update(rdev); } if (rdev->use_count != 0 && rdev->supply) _regulator_disable(rdev->supply); regulator_unlock_dependent(rdev, &ww_ctx); return ret; } EXPORT_SYMBOL_GPL(regulator_force_disable); static void regulator_disable_work(struct work_struct *work) { struct regulator_dev *rdev = container_of(work, struct regulator_dev, disable_work.work); struct ww_acquire_ctx ww_ctx; int count, i, ret; struct regulator *regulator; int total_count = 0; regulator_lock_dependent(rdev, &ww_ctx); /* * Workqueue functions queue the new work instance while the previous * work instance is being processed. Cancel the queued work instance * as the work instance under processing does the job of the queued * work instance. */ cancel_delayed_work(&rdev->disable_work); list_for_each_entry(regulator, &rdev->consumer_list, list) { count = regulator->deferred_disables; if (!count) continue; total_count += count; regulator->deferred_disables = 0; for (i = 0; i < count; i++) { ret = _regulator_disable(regulator); if (ret != 0) rdev_err(rdev, "Deferred disable failed: %pe\n", ERR_PTR(ret)); } } WARN_ON(!total_count); if (rdev->coupling_desc.n_coupled > 1) regulator_balance_voltage(rdev, PM_SUSPEND_ON); regulator_unlock_dependent(rdev, &ww_ctx); } /** * regulator_disable_deferred - disable regulator output with delay * @regulator: regulator source * @ms: milliseconds until the regulator is disabled * * Execute regulator_disable() on the regulator after a delay. This * is intended for use with devices that require some time to quiesce. * * NOTE: this will only disable the regulator output if no other consumer * devices have it enabled, the regulator device supports disabling and * machine constraints permit this operation. */ int regulator_disable_deferred(struct regulator *regulator, int ms) { struct regulator_dev *rdev = regulator->rdev; if (!ms) return regulator_disable(regulator); regulator_lock(rdev); regulator->deferred_disables++; mod_delayed_work(system_power_efficient_wq, &rdev->disable_work, msecs_to_jiffies(ms)); regulator_unlock(rdev); return 0; } EXPORT_SYMBOL_GPL(regulator_disable_deferred); static int _regulator_is_enabled(struct regulator_dev *rdev) { /* A GPIO control always takes precedence */ if (rdev->ena_pin) return rdev->ena_gpio_state; /* If we don't know then assume that the regulator is always on */ if (!rdev->desc->ops->is_enabled) return 1; return rdev->desc->ops->is_enabled(rdev); } static int _regulator_list_voltage(struct regulator_dev *rdev, unsigned selector, int lock) { const struct regulator_ops *ops = rdev->desc->ops; int ret; if (rdev->desc->fixed_uV && rdev->desc->n_voltages == 1 && !selector) return rdev->desc->fixed_uV; if (ops->list_voltage) { if (selector >= rdev->desc->n_voltages) return -EINVAL; if (selector < rdev->desc->linear_min_sel) return 0; if (lock) regulator_lock(rdev); ret = ops->list_voltage(rdev, selector); if (lock) regulator_unlock(rdev); } else if (rdev->is_switch && rdev->supply) { ret = _regulator_list_voltage(rdev->supply->rdev, selector, lock); } else { return -EINVAL; } if (ret > 0) { if (ret < rdev->constraints->min_uV) ret = 0; else if (ret > rdev->constraints->max_uV) ret = 0; } return ret; } /** * regulator_is_enabled - is the regulator output enabled * @regulator: regulator source * * Returns positive if the regulator driver backing the source/client * has requested that the device be enabled, zero if it hasn't, else a * negative errno code. * * Note that the device backing this regulator handle can have multiple * users, so it might be enabled even if regulator_enable() was never * called for this particular source. */ int regulator_is_enabled(struct regulator *regulator) { int ret; if (regulator->always_on) return 1; regulator_lock(regulator->rdev); ret = _regulator_is_enabled(regulator->rdev); regulator_unlock(regulator->rdev); return ret; } EXPORT_SYMBOL_GPL(regulator_is_enabled); /** * regulator_count_voltages - count regulator_list_voltage() selectors * @regulator: regulator source * * Returns number of selectors, or negative errno. Selectors are * numbered starting at zero, and typically correspond to bitfields * in hardware registers. */ int regulator_count_voltages(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; if (rdev->desc->n_voltages) return rdev->desc->n_voltages; if (!rdev->is_switch || !rdev->supply) return -EINVAL; return regulator_count_voltages(rdev->supply); } EXPORT_SYMBOL_GPL(regulator_count_voltages); /** * regulator_list_voltage - enumerate supported voltages * @regulator: regulator source * @selector: identify voltage to list * Context: can sleep * * Returns a voltage that can be passed to @regulator_set_voltage(), * zero if this selector code can't be used on this system, or a * negative errno. */ int regulator_list_voltage(struct regulator *regulator, unsigned selector) { return _regulator_list_voltage(regulator->rdev, selector, 1); } EXPORT_SYMBOL_GPL(regulator_list_voltage); /** * regulator_get_regmap - get the regulator's register map * @regulator: regulator source * * Returns the register map for the given regulator, or an ERR_PTR value * if the regulator doesn't use regmap. */ struct regmap *regulator_get_regmap(struct regulator *regulator) { struct regmap *map = regulator->rdev->regmap; return map ? map : ERR_PTR(-EOPNOTSUPP); } /** * regulator_get_hardware_vsel_register - get the HW voltage selector register * @regulator: regulator source * @vsel_reg: voltage selector register, output parameter * @vsel_mask: mask for voltage selector bitfield, output parameter * * Returns the hardware register offset and bitmask used for setting the * regulator voltage. This might be useful when configuring voltage-scaling * hardware or firmware that can make I2C requests behind the kernel's back, * for example. * * On success, the output parameters @vsel_reg and @vsel_mask are filled in * and 0 is returned, otherwise a negative errno is returned. */ int regulator_get_hardware_vsel_register(struct regulator *regulator, unsigned *vsel_reg, unsigned *vsel_mask) { struct regulator_dev *rdev = regulator->rdev; const struct regulator_ops *ops = rdev->desc->ops; if (ops->set_voltage_sel != regulator_set_voltage_sel_regmap) return -EOPNOTSUPP; *vsel_reg = rdev->desc->vsel_reg; *vsel_mask = rdev->desc->vsel_mask; return 0; } EXPORT_SYMBOL_GPL(regulator_get_hardware_vsel_register); /** * regulator_list_hardware_vsel - get the HW-specific register value for a selector * @regulator: regulator source * @selector: identify voltage to list * * Converts the selector to a hardware-specific voltage selector that can be * directly written to the regulator registers. The address of the voltage * register can be determined by calling @regulator_get_hardware_vsel_register. * * On error a negative errno is returned. */ int regulator_list_hardware_vsel(struct regulator *regulator, unsigned selector) { struct regulator_dev *rdev = regulator->rdev; const struct regulator_ops *ops = rdev->desc->ops; if (selector >= rdev->desc->n_voltages) return -EINVAL; if (selector < rdev->desc->linear_min_sel) return 0; if (ops->set_voltage_sel != regulator_set_voltage_sel_regmap) return -EOPNOTSUPP; return selector; } EXPORT_SYMBOL_GPL(regulator_list_hardware_vsel); /** * regulator_get_linear_step - return the voltage step size between VSEL values * @regulator: regulator source * * Returns the voltage step size between VSEL values for linear * regulators, or return 0 if the regulator isn't a linear regulator. */ unsigned int regulator_get_linear_step(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; return rdev->desc->uV_step; } EXPORT_SYMBOL_GPL(regulator_get_linear_step); /** * regulator_is_supported_voltage - check if a voltage range can be supported * * @regulator: Regulator to check. * @min_uV: Minimum required voltage in uV. * @max_uV: Maximum required voltage in uV. * * Returns a boolean. */ int regulator_is_supported_voltage(struct regulator *regulator, int min_uV, int max_uV) { struct regulator_dev *rdev = regulator->rdev; int i, voltages, ret; /* If we can't change voltage check the current voltage */ if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_VOLTAGE)) { ret = regulator_get_voltage(regulator); if (ret >= 0) return min_uV <= ret && ret <= max_uV; else return ret; } /* Any voltage within constrains range is fine? */ if (rdev->desc->continuous_voltage_range) return min_uV >= rdev->constraints->min_uV && max_uV <= rdev->constraints->max_uV; ret = regulator_count_voltages(regulator); if (ret < 0) return 0; voltages = ret; for (i = 0; i < voltages; i++) { ret = regulator_list_voltage(regulator, i); if (ret >= min_uV && ret <= max_uV) return 1; } return 0; } EXPORT_SYMBOL_GPL(regulator_is_supported_voltage); static int regulator_map_voltage(struct regulator_dev *rdev, int min_uV, int max_uV) { const struct regulator_desc *desc = rdev->desc; if (desc->ops->map_voltage) return desc->ops->map_voltage(rdev, min_uV, max_uV); if (desc->ops->list_voltage == regulator_list_voltage_linear) return regulator_map_voltage_linear(rdev, min_uV, max_uV); if (desc->ops->list_voltage == regulator_list_voltage_linear_range) return regulator_map_voltage_linear_range(rdev, min_uV, max_uV); if (desc->ops->list_voltage == regulator_list_voltage_pickable_linear_range) return regulator_map_voltage_pickable_linear_range(rdev, min_uV, max_uV); return regulator_map_voltage_iterate(rdev, min_uV, max_uV); } static int _regulator_call_set_voltage(struct regulator_dev *rdev, int min_uV, int max_uV, unsigned *selector) { struct pre_voltage_change_data data; int ret; data.old_uV = regulator_get_voltage_rdev(rdev); data.min_uV = min_uV; data.max_uV = max_uV; ret = _notifier_call_chain(rdev, REGULATOR_EVENT_PRE_VOLTAGE_CHANGE, &data); if (ret & NOTIFY_STOP_MASK) return -EINVAL; ret = rdev->desc->ops->set_voltage(rdev, min_uV, max_uV, selector); if (ret >= 0) return ret; _notifier_call_chain(rdev, REGULATOR_EVENT_ABORT_VOLTAGE_CHANGE, (void *)data.old_uV); return ret; } static int _regulator_call_set_voltage_sel(struct regulator_dev *rdev, int uV, unsigned selector) { struct pre_voltage_change_data data; int ret; data.old_uV = regulator_get_voltage_rdev(rdev); data.min_uV = uV; data.max_uV = uV; ret = _notifier_call_chain(rdev, REGULATOR_EVENT_PRE_VOLTAGE_CHANGE, &data); if (ret & NOTIFY_STOP_MASK) return -EINVAL; ret = rdev->desc->ops->set_voltage_sel(rdev, selector); if (ret >= 0) return ret; _notifier_call_chain(rdev, REGULATOR_EVENT_ABORT_VOLTAGE_CHANGE, (void *)data.old_uV); return ret; } static int _regulator_set_voltage_sel_step(struct regulator_dev *rdev, int uV, int new_selector) { const struct regulator_ops *ops = rdev->desc->ops; int diff, old_sel, curr_sel, ret; /* Stepping is only needed if the regulator is enabled. */ if (!_regulator_is_enabled(rdev)) goto final_set; if (!ops->get_voltage_sel) return -EINVAL; old_sel = ops->get_voltage_sel(rdev); if (old_sel < 0) return old_sel; diff = new_selector - old_sel; if (diff == 0) return 0; /* No change needed. */ if (diff > 0) { /* Stepping up. */ for (curr_sel = old_sel + rdev->desc->vsel_step; curr_sel < new_selector; curr_sel += rdev->desc->vsel_step) { /* * Call the callback directly instead of using * _regulator_call_set_voltage_sel() as we don't * want to notify anyone yet. Same in the branch * below. */ ret = ops->set_voltage_sel(rdev, curr_sel); if (ret) goto try_revert; } } else { /* Stepping down. */ for (curr_sel = old_sel - rdev->desc->vsel_step; curr_sel > new_selector; curr_sel -= rdev->desc->vsel_step) { ret = ops->set_voltage_sel(rdev, curr_sel); if (ret) goto try_revert; } } final_set: /* The final selector will trigger the notifiers. */ return _regulator_call_set_voltage_sel(rdev, uV, new_selector); try_revert: /* * At least try to return to the previous voltage if setting a new * one failed. */ (void)ops->set_voltage_sel(rdev, old_sel); return ret; } static int _regulator_set_voltage_time(struct regulator_dev *rdev, int old_uV, int new_uV) { unsigned int ramp_delay = 0; if (rdev->constraints->ramp_delay) ramp_delay = rdev->constraints->ramp_delay; else if (rdev->desc->ramp_delay) ramp_delay = rdev->desc->ramp_delay; else if (rdev->constraints->settling_time) return rdev->constraints->settling_time; else if (rdev->constraints->settling_time_up && (new_uV > old_uV)) return rdev->constraints->settling_time_up; else if (rdev->constraints->settling_time_down && (new_uV < old_uV)) return rdev->constraints->settling_time_down; if (ramp_delay == 0) return 0; return DIV_ROUND_UP(abs(new_uV - old_uV), ramp_delay); } static int _regulator_do_set_voltage(struct regulator_dev *rdev, int min_uV, int max_uV) { int ret; int delay = 0; int best_val = 0; unsigned int selector; int old_selector = -1; const struct regulator_ops *ops = rdev->desc->ops; int old_uV = regulator_get_voltage_rdev(rdev); trace_regulator_set_voltage(rdev_get_name(rdev), min_uV, max_uV); min_uV += rdev->constraints->uV_offset; max_uV += rdev->constraints->uV_offset; /* * If we can't obtain the old selector there is not enough * info to call set_voltage_time_sel(). */ if (_regulator_is_enabled(rdev) && ops->set_voltage_time_sel && ops->get_voltage_sel) { old_selector = ops->get_voltage_sel(rdev); if (old_selector < 0) return old_selector; } if (ops->set_voltage) { ret = _regulator_call_set_voltage(rdev, min_uV, max_uV, &selector); if (ret >= 0) { if (ops->list_voltage) best_val = ops->list_voltage(rdev, selector); else best_val = regulator_get_voltage_rdev(rdev); } } else if (ops->set_voltage_sel) { ret = regulator_map_voltage(rdev, min_uV, max_uV); if (ret >= 0) { best_val = ops->list_voltage(rdev, ret); if (min_uV <= best_val && max_uV >= best_val) { selector = ret; if (old_selector == selector) ret = 0; else if (rdev->desc->vsel_step) ret = _regulator_set_voltage_sel_step( rdev, best_val, selector); else ret = _regulator_call_set_voltage_sel( rdev, best_val, selector); } else { ret = -EINVAL; } } } else { ret = -EINVAL; } if (ret) goto out; if (ops->set_voltage_time_sel) { /* * Call set_voltage_time_sel if successfully obtained * old_selector */ if (old_selector >= 0 && old_selector != selector) delay = ops->set_voltage_time_sel(rdev, old_selector, selector); } else { if (old_uV != best_val) { if (ops->set_voltage_time) delay = ops->set_voltage_time(rdev, old_uV, best_val); else delay = _regulator_set_voltage_time(rdev, old_uV, best_val); } } if (delay < 0) { rdev_warn(rdev, "failed to get delay: %pe\n", ERR_PTR(delay)); delay = 0; } /* Insert any necessary delays */ _regulator_delay_helper(delay); if (best_val >= 0) { unsigned long data = best_val; _notifier_call_chain(rdev, REGULATOR_EVENT_VOLTAGE_CHANGE, (void *)data); } out: trace_regulator_set_voltage_complete(rdev_get_name(rdev), best_val); return ret; } static int _regulator_do_set_suspend_voltage(struct regulator_dev *rdev, int min_uV, int max_uV, suspend_state_t state) { struct regulator_state *rstate; int uV, sel; rstate = regulator_get_suspend_state(rdev, state); if (rstate == NULL) return -EINVAL; if (min_uV < rstate->min_uV) min_uV = rstate->min_uV; if (max_uV > rstate->max_uV) max_uV = rstate->max_uV; sel = regulator_map_voltage(rdev, min_uV, max_uV); if (sel < 0) return sel; uV = rdev->desc->ops->list_voltage(rdev, sel); if (uV >= min_uV && uV <= max_uV) rstate->uV = uV; return 0; } static int regulator_set_voltage_unlocked(struct regulator *regulator, int min_uV, int max_uV, suspend_state_t state) { struct regulator_dev *rdev = regulator->rdev; struct regulator_voltage *voltage = ®ulator->voltage[state]; int ret = 0; int old_min_uV, old_max_uV; int current_uV; /* If we're setting the same range as last time the change * should be a noop (some cpufreq implementations use the same * voltage for multiple frequencies, for example). */ if (voltage->min_uV == min_uV && voltage->max_uV == max_uV) goto out; /* If we're trying to set a range that overlaps the current voltage, * return successfully even though the regulator does not support * changing the voltage. */ if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_VOLTAGE)) { current_uV = regulator_get_voltage_rdev(rdev); if (min_uV <= current_uV && current_uV <= max_uV) { voltage->min_uV = min_uV; voltage->max_uV = max_uV; goto out; } } /* sanity check */ if (!rdev->desc->ops->set_voltage && !rdev->desc->ops->set_voltage_sel) { ret = -EINVAL; goto out; } /* constraints check */ ret = regulator_check_voltage(rdev, &min_uV, &max_uV); if (ret < 0) goto out; /* restore original values in case of error */ old_min_uV = voltage->min_uV; old_max_uV = voltage->max_uV; voltage->min_uV = min_uV; voltage->max_uV = max_uV; /* for not coupled regulators this will just set the voltage */ ret = regulator_balance_voltage(rdev, state); if (ret < 0) { voltage->min_uV = old_min_uV; voltage->max_uV = old_max_uV; } out: return ret; } int regulator_set_voltage_rdev(struct regulator_dev *rdev, int min_uV, int max_uV, suspend_state_t state) { int best_supply_uV = 0; int supply_change_uV = 0; int ret; if (rdev->supply && regulator_ops_is_valid(rdev->supply->rdev, REGULATOR_CHANGE_VOLTAGE) && (rdev->desc->min_dropout_uV || !(rdev->desc->ops->get_voltage || rdev->desc->ops->get_voltage_sel))) { int current_supply_uV; int selector; selector = regulator_map_voltage(rdev, min_uV, max_uV); if (selector < 0) { ret = selector; goto out; } best_supply_uV = _regulator_list_voltage(rdev, selector, 0); if (best_supply_uV < 0) { ret = best_supply_uV; goto out; } best_supply_uV += rdev->desc->min_dropout_uV; current_supply_uV = regulator_get_voltage_rdev(rdev->supply->rdev); if (current_supply_uV < 0) { ret = current_supply_uV; goto out; } supply_change_uV = best_supply_uV - current_supply_uV; } if (supply_change_uV > 0) { ret = regulator_set_voltage_unlocked(rdev->supply, best_supply_uV, INT_MAX, state); if (ret) { dev_err(&rdev->dev, "Failed to increase supply voltage: %pe\n", ERR_PTR(ret)); goto out; } } if (state == PM_SUSPEND_ON) ret = _regulator_do_set_voltage(rdev, min_uV, max_uV); else ret = _regulator_do_set_suspend_voltage(rdev, min_uV, max_uV, state); if (ret < 0) goto out; if (supply_change_uV < 0) { ret = regulator_set_voltage_unlocked(rdev->supply, best_supply_uV, INT_MAX, state); if (ret) dev_warn(&rdev->dev, "Failed to decrease supply voltage: %pe\n", ERR_PTR(ret)); /* No need to fail here */ ret = 0; } out: return ret; } EXPORT_SYMBOL_GPL(regulator_set_voltage_rdev); static int regulator_limit_voltage_step(struct regulator_dev *rdev, int *current_uV, int *min_uV) { struct regulation_constraints *constraints = rdev->constraints; /* Limit voltage change only if necessary */ if (!constraints->max_uV_step || !_regulator_is_enabled(rdev)) return 1; if (*current_uV < 0) { *current_uV = regulator_get_voltage_rdev(rdev); if (*current_uV < 0) return *current_uV; } if (abs(*current_uV - *min_uV) <= constraints->max_uV_step) return 1; /* Clamp target voltage within the given step */ if (*current_uV < *min_uV) *min_uV = min(*current_uV + constraints->max_uV_step, *min_uV); else *min_uV = max(*current_uV - constraints->max_uV_step, *min_uV); return 0; } static int regulator_get_optimal_voltage(struct regulator_dev *rdev, int *current_uV, int *min_uV, int *max_uV, suspend_state_t state, int n_coupled) { struct coupling_desc *c_desc = &rdev->coupling_desc; struct regulator_dev **c_rdevs = c_desc->coupled_rdevs; struct regulation_constraints *constraints = rdev->constraints; int desired_min_uV = 0, desired_max_uV = INT_MAX; int max_current_uV = 0, min_current_uV = INT_MAX; int highest_min_uV = 0, target_uV, possible_uV; int i, ret, max_spread; bool done; *current_uV = -1; /* * If there are no coupled regulators, simply set the voltage * demanded by consumers. */ if (n_coupled == 1) { /* * If consumers don't provide any demands, set voltage * to min_uV */ desired_min_uV = constraints->min_uV; desired_max_uV = constraints->max_uV; ret = regulator_check_consumers(rdev, &desired_min_uV, &desired_max_uV, state); if (ret < 0) return ret; done = true; goto finish; } /* Find highest min desired voltage */ for (i = 0; i < n_coupled; i++) { int tmp_min = 0; int tmp_max = INT_MAX; lockdep_assert_held_once(&c_rdevs[i]->mutex.base); ret = regulator_check_consumers(c_rdevs[i], &tmp_min, &tmp_max, state); if (ret < 0) return ret; ret = regulator_check_voltage(c_rdevs[i], &tmp_min, &tmp_max); if (ret < 0) return ret; highest_min_uV = max(highest_min_uV, tmp_min); if (i == 0) { desired_min_uV = tmp_min; desired_max_uV = tmp_max; } } max_spread = constraints->max_spread[0]; /* * Let target_uV be equal to the desired one if possible. * If not, set it to minimum voltage, allowed by other coupled * regulators. */ target_uV = max(desired_min_uV, highest_min_uV - max_spread); /* * Find min and max voltages, which currently aren't violating * max_spread. */ for (i = 1; i < n_coupled; i++) { int tmp_act; if (!_regulator_is_enabled(c_rdevs[i])) continue; tmp_act = regulator_get_voltage_rdev(c_rdevs[i]); if (tmp_act < 0) return tmp_act; min_current_uV = min(tmp_act, min_current_uV); max_current_uV = max(tmp_act, max_current_uV); } /* There aren't any other regulators enabled */ if (max_current_uV == 0) { possible_uV = target_uV; } else { /* * Correct target voltage, so as it currently isn't * violating max_spread */ possible_uV = max(target_uV, max_current_uV - max_spread); possible_uV = min(possible_uV, min_current_uV + max_spread); } if (possible_uV > desired_max_uV) return -EINVAL; done = (possible_uV == target_uV); desired_min_uV = possible_uV; finish: /* Apply max_uV_step constraint if necessary */ if (state == PM_SUSPEND_ON) { ret = regulator_limit_voltage_step(rdev, current_uV, &desired_min_uV); if (ret < 0) return ret; if (ret == 0) done = false; } /* Set current_uV if wasn't done earlier in the code and if necessary */ if (n_coupled > 1 && *current_uV == -1) { if (_regulator_is_enabled(rdev)) { ret = regulator_get_voltage_rdev(rdev); if (ret < 0) return ret; *current_uV = ret; } else { *current_uV = desired_min_uV; } } *min_uV = desired_min_uV; *max_uV = desired_max_uV; return done; } int regulator_do_balance_voltage(struct regulator_dev *rdev, suspend_state_t state, bool skip_coupled) { struct regulator_dev **c_rdevs; struct regulator_dev *best_rdev; struct coupling_desc *c_desc = &rdev->coupling_desc; int i, ret, n_coupled, best_min_uV, best_max_uV, best_c_rdev; unsigned int delta, best_delta; unsigned long c_rdev_done = 0; bool best_c_rdev_done; c_rdevs = c_desc->coupled_rdevs; n_coupled = skip_coupled ? 1 : c_desc->n_coupled; /* * Find the best possible voltage change on each loop. Leave the loop * if there isn't any possible change. */ do { best_c_rdev_done = false; best_delta = 0; best_min_uV = 0; best_max_uV = 0; best_c_rdev = 0; best_rdev = NULL; /* * Find highest difference between optimal voltage * and current voltage. */ for (i = 0; i < n_coupled; i++) { /* * optimal_uV is the best voltage that can be set for * i-th regulator at the moment without violating * max_spread constraint in order to balance * the coupled voltages. */ int optimal_uV = 0, optimal_max_uV = 0, current_uV = 0; if (test_bit(i, &c_rdev_done)) continue; ret = regulator_get_optimal_voltage(c_rdevs[i], ¤t_uV, &optimal_uV, &optimal_max_uV, state, n_coupled); if (ret < 0) goto out; delta = abs(optimal_uV - current_uV); if (delta && best_delta <= delta) { best_c_rdev_done = ret; best_delta = delta; best_rdev = c_rdevs[i]; best_min_uV = optimal_uV; best_max_uV = optimal_max_uV; best_c_rdev = i; } } /* Nothing to change, return successfully */ if (!best_rdev) { ret = 0; goto out; } ret = regulator_set_voltage_rdev(best_rdev, best_min_uV, best_max_uV, state); if (ret < 0) goto out; if (best_c_rdev_done) set_bit(best_c_rdev, &c_rdev_done); } while (n_coupled > 1); out: return ret; } static int regulator_balance_voltage(struct regulator_dev *rdev, suspend_state_t state) { struct coupling_desc *c_desc = &rdev->coupling_desc; struct regulator_coupler *coupler = c_desc->coupler; bool skip_coupled = false; /* * If system is in a state other than PM_SUSPEND_ON, don't check * other coupled regulators. */ if (state != PM_SUSPEND_ON) skip_coupled = true; if (c_desc->n_resolved < c_desc->n_coupled) { rdev_err(rdev, "Not all coupled regulators registered\n"); return -EPERM; } /* Invoke custom balancer for customized couplers */ if (coupler && coupler->balance_voltage) return coupler->balance_voltage(coupler, rdev, state); return regulator_do_balance_voltage(rdev, state, skip_coupled); } /** * regulator_set_voltage - set regulator output voltage * @regulator: regulator source * @min_uV: Minimum required voltage in uV * @max_uV: Maximum acceptable voltage in uV * * Sets a voltage regulator to the desired output voltage. This can be set * during any regulator state. IOW, regulator can be disabled or enabled. * * If the regulator is enabled then the voltage will change to the new value * immediately otherwise if the regulator is disabled the regulator will * output at the new voltage when enabled. * * NOTE: If the regulator is shared between several devices then the lowest * request voltage that meets the system constraints will be used. * Regulator system constraints must be set for this regulator before * calling this function otherwise this call will fail. */ int regulator_set_voltage(struct regulator *regulator, int min_uV, int max_uV) { struct ww_acquire_ctx ww_ctx; int ret; regulator_lock_dependent(regulator->rdev, &ww_ctx); ret = regulator_set_voltage_unlocked(regulator, min_uV, max_uV, PM_SUSPEND_ON); regulator_unlock_dependent(regulator->rdev, &ww_ctx); return ret; } EXPORT_SYMBOL_GPL(regulator_set_voltage); static inline int regulator_suspend_toggle(struct regulator_dev *rdev, suspend_state_t state, bool en) { struct regulator_state *rstate; rstate = regulator_get_suspend_state(rdev, state); if (rstate == NULL) return -EINVAL; if (!rstate->changeable) return -EPERM; rstate->enabled = (en) ? ENABLE_IN_SUSPEND : DISABLE_IN_SUSPEND; return 0; } int regulator_suspend_enable(struct regulator_dev *rdev, suspend_state_t state) { return regulator_suspend_toggle(rdev, state, true); } EXPORT_SYMBOL_GPL(regulator_suspend_enable); int regulator_suspend_disable(struct regulator_dev *rdev, suspend_state_t state) { struct regulator *regulator; struct regulator_voltage *voltage; /* * if any consumer wants this regulator device keeping on in * suspend states, don't set it as disabled. */ list_for_each_entry(regulator, &rdev->consumer_list, list) { voltage = ®ulator->voltage[state]; if (voltage->min_uV || voltage->max_uV) return 0; } return regulator_suspend_toggle(rdev, state, false); } EXPORT_SYMBOL_GPL(regulator_suspend_disable); static int _regulator_set_suspend_voltage(struct regulator *regulator, int min_uV, int max_uV, suspend_state_t state) { struct regulator_dev *rdev = regulator->rdev; struct regulator_state *rstate; rstate = regulator_get_suspend_state(rdev, state); if (rstate == NULL) return -EINVAL; if (rstate->min_uV == rstate->max_uV) { rdev_err(rdev, "The suspend voltage can't be changed!\n"); return -EPERM; } return regulator_set_voltage_unlocked(regulator, min_uV, max_uV, state); } int regulator_set_suspend_voltage(struct regulator *regulator, int min_uV, int max_uV, suspend_state_t state) { struct ww_acquire_ctx ww_ctx; int ret; /* PM_SUSPEND_ON is handled by regulator_set_voltage() */ if (regulator_check_states(state) || state == PM_SUSPEND_ON) return -EINVAL; regulator_lock_dependent(regulator->rdev, &ww_ctx); ret = _regulator_set_suspend_voltage(regulator, min_uV, max_uV, state); regulator_unlock_dependent(regulator->rdev, &ww_ctx); return ret; } EXPORT_SYMBOL_GPL(regulator_set_suspend_voltage); /** * regulator_set_voltage_time - get raise/fall time * @regulator: regulator source * @old_uV: starting voltage in microvolts * @new_uV: target voltage in microvolts * * Provided with the starting and ending voltage, this function attempts to * calculate the time in microseconds required to rise or fall to this new * voltage. */ int regulator_set_voltage_time(struct regulator *regulator, int old_uV, int new_uV) { struct regulator_dev *rdev = regulator->rdev; const struct regulator_ops *ops = rdev->desc->ops; int old_sel = -1; int new_sel = -1; int voltage; int i; if (ops->set_voltage_time) return ops->set_voltage_time(rdev, old_uV, new_uV); else if (!ops->set_voltage_time_sel) return _regulator_set_voltage_time(rdev, old_uV, new_uV); /* Currently requires operations to do this */ if (!ops->list_voltage || !rdev->desc->n_voltages) return -EINVAL; for (i = 0; i < rdev->desc->n_voltages; i++) { /* We only look for exact voltage matches here */ if (i < rdev->desc->linear_min_sel) continue; if (old_sel >= 0 && new_sel >= 0) break; voltage = regulator_list_voltage(regulator, i); if (voltage < 0) return -EINVAL; if (voltage == 0) continue; if (voltage == old_uV) old_sel = i; if (voltage == new_uV) new_sel = i; } if (old_sel < 0 || new_sel < 0) return -EINVAL; return ops->set_voltage_time_sel(rdev, old_sel, new_sel); } EXPORT_SYMBOL_GPL(regulator_set_voltage_time); /** * regulator_set_voltage_time_sel - get raise/fall time * @rdev: regulator source device * @old_selector: selector for starting voltage * @new_selector: selector for target voltage * * Provided with the starting and target voltage selectors, this function * returns time in microseconds required to rise or fall to this new voltage * * Drivers providing ramp_delay in regulation_constraints can use this as their * set_voltage_time_sel() operation. */ int regulator_set_voltage_time_sel(struct regulator_dev *rdev, unsigned int old_selector, unsigned int new_selector) { int old_volt, new_volt; /* sanity check */ if (!rdev->desc->ops->list_voltage) return -EINVAL; old_volt = rdev->desc->ops->list_voltage(rdev, old_selector); new_volt = rdev->desc->ops->list_voltage(rdev, new_selector); if (rdev->desc->ops->set_voltage_time) return rdev->desc->ops->set_voltage_time(rdev, old_volt, new_volt); else return _regulator_set_voltage_time(rdev, old_volt, new_volt); } EXPORT_SYMBOL_GPL(regulator_set_voltage_time_sel); int regulator_sync_voltage_rdev(struct regulator_dev *rdev) { int ret; regulator_lock(rdev); if (!rdev->desc->ops->set_voltage && !rdev->desc->ops->set_voltage_sel) { ret = -EINVAL; goto out; } /* balance only, if regulator is coupled */ if (rdev->coupling_desc.n_coupled > 1) ret = regulator_balance_voltage(rdev, PM_SUSPEND_ON); else ret = -EOPNOTSUPP; out: regulator_unlock(rdev); return ret; } /** * regulator_sync_voltage - re-apply last regulator output voltage * @regulator: regulator source * * Re-apply the last configured voltage. This is intended to be used * where some external control source the consumer is cooperating with * has caused the configured voltage to change. */ int regulator_sync_voltage(struct regulator *regulator) { struct regulator_dev *rdev = regulator->rdev; struct regulator_voltage *voltage = ®ulator->voltage[PM_SUSPEND_ON]; int ret, min_uV, max_uV; if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_VOLTAGE)) return 0; regulator_lock(rdev); if (!rdev->desc->ops->set_voltage && !rdev->desc->ops->set_voltage_sel) { ret = -EINVAL; goto out; } /* This is only going to work if we've had a voltage configured. */ if (!voltage->min_uV && !voltage->max_uV) { ret = -EINVAL; goto out; } min_uV = voltage->min_uV; max_uV = voltage->max_uV; /* This should be a paranoia check... */ ret = regulator_check_voltage(rdev, &min_uV, &max_uV); if (ret < 0) goto out; ret = regulator_check_consumers(rdev, &min_uV, &max_uV, 0); if (ret < 0) goto out; /* balance only, if regulator is coupled */ if (rdev->coupling_desc.n_coupled > 1) ret = regulator_balance_voltage(rdev, PM_SUSPEND_ON); else ret = _regulator_do_set_voltage(rdev, min_uV, max_uV); out: regulator_unlock(rdev); return ret; } EXPORT_SYMBOL_GPL(regulator_sync_voltage); int regulator_get_voltage_rdev(struct regulator_dev *rdev) { int sel, ret; bool bypassed; if (rdev->desc->ops->get_bypass) { ret = rdev->desc->ops->get_bypass(rdev, &bypassed); if (ret < 0) return ret; if (bypassed) { /* if bypassed the regulator must have a supply */ if (!rdev->supply) { rdev_err(rdev, "bypassed regulator has no supply!\n"); return -EPROBE_DEFER; } return regulator_get_voltage_rdev(rdev->supply->rdev); } } if (rdev->desc->ops->get_voltage_sel) { sel = rdev->desc->ops->get_voltage_sel(rdev); if (sel < 0) return sel; ret = rdev->desc->ops->list_voltage(rdev, sel); } else if (rdev->desc->ops->get_voltage) { ret = rdev->desc->ops->get_voltage(rdev); } else if (rdev->desc->ops->list_voltage) { ret = rdev->desc->ops->list_voltage(rdev, 0); } else if (rdev->desc->fixed_uV && (rdev->desc->n_voltages == 1)) { ret = rdev->desc->fixed_uV; } else if (rdev->supply) { ret = regulator_get_voltage_rdev(rdev->supply->rdev); } else if (rdev->supply_name) { return -EPROBE_DEFER; } else { return -EINVAL; } if (ret < 0) return ret; return ret - rdev->constraints->uV_offset; } EXPORT_SYMBOL_GPL(regulator_get_voltage_rdev); /** * regulator_get_voltage - get regulator output voltage * @regulator: regulator source * * This returns the current regulator voltage in uV. * * NOTE: If the regulator is disabled it will return the voltage value. This * function should not be used to determine regulator state. */ int regulator_get_voltage(struct regulator *regulator) { struct ww_acquire_ctx ww_ctx; int ret; regulator_lock_dependent(regulator->rdev, &ww_ctx); ret = regulator_get_voltage_rdev(regulator->rdev); regulator_unlock_dependent(regulator->rdev, &ww_ctx); return ret; } EXPORT_SYMBOL_GPL(regulator_get_voltage); /** * regulator_set_current_limit - set regulator output current limit * @regulator: regulator source * @min_uA: Minimum supported current in uA * @max_uA: Maximum supported current in uA * * Sets current sink to the desired output current. This can be set during * any regulator state. IOW, regulator can be disabled or enabled. * * If the regulator is enabled then the current will change to the new value * immediately otherwise if the regulator is disabled the regulator will * output at the new current when enabled. * * NOTE: Regulator system constraints must be set for this regulator before * calling this function otherwise this call will fail. */ int regulator_set_current_limit(struct regulator *regulator, int min_uA, int max_uA) { struct regulator_dev *rdev = regulator->rdev; int ret; regulator_lock(rdev); /* sanity check */ if (!rdev->desc->ops->set_current_limit) { ret = -EINVAL; goto out; } /* constraints check */ ret = regulator_check_current_limit(rdev, &min_uA, &max_uA); if (ret < 0) goto out; ret = rdev->desc->ops->set_current_limit(rdev, min_uA, max_uA); out: regulator_unlock(rdev); return ret; } EXPORT_SYMBOL_GPL(regulator_set_current_limit); static int _regulator_get_current_limit_unlocked(struct regulator_dev *rdev) { /* sanity check */ if (!rdev->desc->ops->get_current_limit) return -EINVAL; return rdev->desc->ops->get_current_limit(rdev); } static int _regulator_get_current_limit(struct regulator_dev *rdev) { int ret; regulator_lock(rdev); ret = _regulator_get_current_limit_unlocked(rdev); regulator_unlock(rdev); return ret; } /** * regulator_get_current_limit - get regulator output current * @regulator: regulator source * * This returns the current supplied by the specified current sink in uA. * * NOTE: If the regulator is disabled it will return the current value. This * function should not be used to determine regulator state. */ int regulator_get_current_limit(struct regulator *regulator) { return _regulator_get_current_limit(regulator->rdev); } EXPORT_SYMBOL_GPL(regulator_get_current_limit); /** * regulator_set_mode - set regulator operating mode * @regulator: regulator source * @mode: operating mode - one of the REGULATOR_MODE constants * * Set regulator operating mode to increase regulator efficiency or improve * regulation performance. * * NOTE: Regulator system constraints must be set for this regulator before * calling this function otherwise this call will fail. */ int regulator_set_mode(struct regulator *regulator, unsigned int mode) { struct regulator_dev *rdev = regulator->rdev; int ret; int regulator_curr_mode; regulator_lock(rdev); /* sanity check */ if (!rdev->desc->ops->set_mode) { ret = -EINVAL; goto out; } /* return if the same mode is requested */ if (rdev->desc->ops->get_mode) { regulator_curr_mode = rdev->desc->ops->get_mode(rdev); if (regulator_curr_mode == mode) { ret = 0; goto out; } } /* constraints check */ ret = regulator_mode_constrain(rdev, &mode); if (ret < 0) goto out; ret = rdev->desc->ops->set_mode(rdev, mode); out: regulator_unlock(rdev); return ret; } EXPORT_SYMBOL_GPL(regulator_set_mode); static unsigned int _regulator_get_mode_unlocked(struct regulator_dev *rdev) { /* sanity check */ if (!rdev->desc->ops->get_mode) return -EINVAL; return rdev->desc->ops->get_mode(rdev); } static unsigned int _regulator_get_mode(struct regulator_dev *rdev) { int ret; regulator_lock(rdev); ret = _regulator_get_mode_unlocked(rdev); regulator_unlock(rdev); return ret; } /** * regulator_get_mode - get regulator operating mode * @regulator: regulator source * * Get the current regulator operating mode. */ unsigned int regulator_get_mode(struct regulator *regulator) { return _regulator_get_mode(regulator->rdev); } EXPORT_SYMBOL_GPL(regulator_get_mode); static int rdev_get_cached_err_flags(struct regulator_dev *rdev) { int ret = 0; if (rdev->use_cached_err) { spin_lock(&rdev->err_lock); ret = rdev->cached_err; spin_unlock(&rdev->err_lock); } return ret; } static int _regulator_get_error_flags(struct regulator_dev *rdev, unsigned int *flags) { int cached_flags, ret = 0; regulator_lock(rdev); cached_flags = rdev_get_cached_err_flags(rdev); if (rdev->desc->ops->get_error_flags) ret = rdev->desc->ops->get_error_flags(rdev, flags); else if (!rdev->use_cached_err) ret = -EINVAL; *flags |= cached_flags; regulator_unlock(rdev); return ret; } /** * regulator_get_error_flags - get regulator error information * @regulator: regulator source * @flags: pointer to store error flags * * Get the current regulator error information. */ int regulator_get_error_flags(struct regulator *regulator, unsigned int *flags) { return _regulator_get_error_flags(regulator->rdev, flags); } EXPORT_SYMBOL_GPL(regulator_get_error_flags); /** * regulator_set_load - set regulator load * @regulator: regulator source * @uA_load: load current * * Notifies the regulator core of a new device load. This is then used by * DRMS (if enabled by constraints) to set the most efficient regulator * operating mode for the new regulator loading. * * Consumer devices notify their supply regulator of the maximum power * they will require (can be taken from device datasheet in the power * consumption tables) when they change operational status and hence power * state. Examples of operational state changes that can affect power * consumption are :- * * o Device is opened / closed. * o Device I/O is about to begin or has just finished. * o Device is idling in between work. * * This information is also exported via sysfs to userspace. * * DRMS will sum the total requested load on the regulator and change * to the most efficient operating mode if platform constraints allow. * * NOTE: when a regulator consumer requests to have a regulator * disabled then any load that consumer requested no longer counts * toward the total requested load. If the regulator is re-enabled * then the previously requested load will start counting again. * * If a regulator is an always-on regulator then an individual consumer's * load will still be removed if that consumer is fully disabled. * * On error a negative errno is returned. */ int regulator_set_load(struct regulator *regulator, int uA_load) { struct regulator_dev *rdev = regulator->rdev; int old_uA_load; int ret = 0; regulator_lock(rdev); old_uA_load = regulator->uA_load; regulator->uA_load = uA_load; if (regulator->enable_count && old_uA_load != uA_load) { ret = drms_uA_update(rdev); if (ret < 0) regulator->uA_load = old_uA_load; } regulator_unlock(rdev); return ret; } EXPORT_SYMBOL_GPL(regulator_set_load); /** * regulator_allow_bypass - allow the regulator to go into bypass mode * * @regulator: Regulator to configure * @enable: enable or disable bypass mode * * Allow the regulator to go into bypass mode if all other consumers * for the regulator also enable bypass mode and the machine * constraints allow this. Bypass mode means that the regulator is * simply passing the input directly to the output with no regulation. */ int regulator_allow_bypass(struct regulator *regulator, bool enable) { struct regulator_dev *rdev = regulator->rdev; const char *name = rdev_get_name(rdev); int ret = 0; if (!rdev->desc->ops->set_bypass) return 0; if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_BYPASS)) return 0; regulator_lock(rdev); if (enable && !regulator->bypass) { rdev->bypass_count++; if (rdev->bypass_count == rdev->open_count) { trace_regulator_bypass_enable(name); ret = rdev->desc->ops->set_bypass(rdev, enable); if (ret != 0) rdev->bypass_count--; else trace_regulator_bypass_enable_complete(name); } } else if (!enable && regulator->bypass) { rdev->bypass_count--; if (rdev->bypass_count != rdev->open_count) { trace_regulator_bypass_disable(name); ret = rdev->desc->ops->set_bypass(rdev, enable); if (ret != 0) rdev->bypass_count++; else trace_regulator_bypass_disable_complete(name); } } if (ret == 0) regulator->bypass = enable; regulator_unlock(rdev); return ret; } EXPORT_SYMBOL_GPL(regulator_allow_bypass); /** * regulator_register_notifier - register regulator event notifier * @regulator: regulator source * @nb: notifier block * * Register notifier block to receive regulator events. */ int regulator_register_notifier(struct regulator *regulator, struct notifier_block *nb) { return blocking_notifier_chain_register(®ulator->rdev->notifier, nb); } EXPORT_SYMBOL_GPL(regulator_register_notifier); /** * regulator_unregister_notifier - unregister regulator event notifier * @regulator: regulator source * @nb: notifier block * * Unregister regulator event notifier block. */ int regulator_unregister_notifier(struct regulator *regulator, struct notifier_block *nb) { return blocking_notifier_chain_unregister(®ulator->rdev->notifier, nb); } EXPORT_SYMBOL_GPL(regulator_unregister_notifier); /* notify regulator consumers and downstream regulator consumers. * Note mutex must be held by caller. */ static int _notifier_call_chain(struct regulator_dev *rdev, unsigned long event, void *data) { /* call rdev chain first */ int ret = blocking_notifier_call_chain(&rdev->notifier, event, data); if (IS_REACHABLE(CONFIG_REGULATOR_NETLINK_EVENTS)) { struct device *parent = rdev->dev.parent; const char *rname = rdev_get_name(rdev); char name[32]; /* Avoid duplicate debugfs directory names */ if (parent && rname == rdev->desc->name) { snprintf(name, sizeof(name), "%s-%s", dev_name(parent), rname); rname = name; } reg_generate_netlink_event(rname, event); } return ret; } int _regulator_bulk_get(struct device *dev, int num_consumers, struct regulator_bulk_data *consumers, enum regulator_get_type get_type) { int i; int ret; for (i = 0; i < num_consumers; i++) consumers[i].consumer = NULL; for (i = 0; i < num_consumers; i++) { consumers[i].consumer = _regulator_get(dev, consumers[i].supply, get_type); if (IS_ERR(consumers[i].consumer)) { ret = dev_err_probe(dev, PTR_ERR(consumers[i].consumer), "Failed to get supply '%s'", consumers[i].supply); consumers[i].consumer = NULL; goto err; } if (consumers[i].init_load_uA > 0) { ret = regulator_set_load(consumers[i].consumer, consumers[i].init_load_uA); if (ret) { i++; goto err; } } } return 0; err: while (--i >= 0) regulator_put(consumers[i].consumer); return ret; } /** * regulator_bulk_get - get multiple regulator consumers * * @dev: Device to supply * @num_consumers: Number of consumers to register * @consumers: Configuration of consumers; clients are stored here. * * @return 0 on success, an errno on failure. * * This helper function allows drivers to get several regulator * consumers in one operation. If any of the regulators cannot be * acquired then any regulators that were allocated will be freed * before returning to the caller. */ int regulator_bulk_get(struct device *dev, int num_consumers, struct regulator_bulk_data *consumers) { return _regulator_bulk_get(dev, num_consumers, consumers, NORMAL_GET); } EXPORT_SYMBOL_GPL(regulator_bulk_get); static void regulator_bulk_enable_async(void *data, async_cookie_t cookie) { struct regulator_bulk_data *bulk = data; bulk->ret = regulator_enable(bulk->consumer); } /** * regulator_bulk_enable - enable multiple regulator consumers * * @num_consumers: Number of consumers * @consumers: Consumer data; clients are stored here. * @return 0 on success, an errno on failure * * This convenience API allows consumers to enable multiple regulator * clients in a single API call. If any consumers cannot be enabled * then any others that were enabled will be disabled again prior to * return. */ int regulator_bulk_enable(int num_consumers, struct regulator_bulk_data *consumers) { ASYNC_DOMAIN_EXCLUSIVE(async_domain); int i; int ret = 0; for (i = 0; i < num_consumers; i++) { async_schedule_domain(regulator_bulk_enable_async, &consumers[i], &async_domain); } async_synchronize_full_domain(&async_domain); /* If any consumer failed we need to unwind any that succeeded */ for (i = 0; i < num_consumers; i++) { if (consumers[i].ret != 0) { ret = consumers[i].ret; goto err; } } return 0; err: for (i = 0; i < num_consumers; i++) { if (consumers[i].ret < 0) pr_err("Failed to enable %s: %pe\n", consumers[i].supply, ERR_PTR(consumers[i].ret)); else regulator_disable(consumers[i].consumer); } return ret; } EXPORT_SYMBOL_GPL(regulator_bulk_enable); /** * regulator_bulk_disable - disable multiple regulator consumers * * @num_consumers: Number of consumers * @consumers: Consumer data; clients are stored here. * @return 0 on success, an errno on failure * * This convenience API allows consumers to disable multiple regulator * clients in a single API call. If any consumers cannot be disabled * then any others that were disabled will be enabled again prior to * return. */ int regulator_bulk_disable(int num_consumers, struct regulator_bulk_data *consumers) { int i; int ret, r; for (i = num_consumers - 1; i >= 0; --i) { ret = regulator_disable(consumers[i].consumer); if (ret != 0) goto err; } return 0; err: pr_err("Failed to disable %s: %pe\n", consumers[i].supply, ERR_PTR(ret)); for (++i; i < num_consumers; ++i) { r = regulator_enable(consumers[i].consumer); if (r != 0) pr_err("Failed to re-enable %s: %pe\n", consumers[i].supply, ERR_PTR(r)); } return ret; } EXPORT_SYMBOL_GPL(regulator_bulk_disable); /** * regulator_bulk_force_disable - force disable multiple regulator consumers * * @num_consumers: Number of consumers * @consumers: Consumer data; clients are stored here. * @return 0 on success, an errno on failure * * This convenience API allows consumers to forcibly disable multiple regulator * clients in a single API call. * NOTE: This should be used for situations when device damage will * likely occur if the regulators are not disabled (e.g. over temp). * Although regulator_force_disable function call for some consumers can * return error numbers, the function is called for all consumers. */ int regulator_bulk_force_disable(int num_consumers, struct regulator_bulk_data *consumers) { int i; int ret = 0; for (i = 0; i < num_consumers; i++) { consumers[i].ret = regulator_force_disable(consumers[i].consumer); /* Store first error for reporting */ if (consumers[i].ret && !ret) ret = consumers[i].ret; } return ret; } EXPORT_SYMBOL_GPL(regulator_bulk_force_disable); /** * regulator_bulk_free - free multiple regulator consumers * * @num_consumers: Number of consumers * @consumers: Consumer data; clients are stored here. * * This convenience API allows consumers to free multiple regulator * clients in a single API call. */ void regulator_bulk_free(int num_consumers, struct regulator_bulk_data *consumers) { int i; for (i = 0; i < num_consumers; i++) { regulator_put(consumers[i].consumer); consumers[i].consumer = NULL; } } EXPORT_SYMBOL_GPL(regulator_bulk_free); /** * regulator_handle_critical - Handle events for system-critical regulators. * @rdev: The regulator device. * @event: The event being handled. * * This function handles critical events such as under-voltage, over-current, * and unknown errors for regulators deemed system-critical. On detecting such * events, it triggers a hardware protection shutdown with a defined timeout. */ static void regulator_handle_critical(struct regulator_dev *rdev, unsigned long event) { const char *reason = NULL; if (!rdev->constraints->system_critical) return; switch (event) { case REGULATOR_EVENT_UNDER_VOLTAGE: reason = "System critical regulator: voltage drop detected"; break; case REGULATOR_EVENT_OVER_CURRENT: reason = "System critical regulator: over-current detected"; break; case REGULATOR_EVENT_FAIL: reason = "System critical regulator: unknown error"; } if (!reason) return; hw_protection_shutdown(reason, rdev->constraints->uv_less_critical_window_ms); } /** * regulator_notifier_call_chain - call regulator event notifier * @rdev: regulator source * @event: notifier block * @data: callback-specific data. * * Called by regulator drivers to notify clients a regulator event has * occurred. */ int regulator_notifier_call_chain(struct regulator_dev *rdev, unsigned long event, void *data) { regulator_handle_critical(rdev, event); _notifier_call_chain(rdev, event, data); return NOTIFY_DONE; } EXPORT_SYMBOL_GPL(regulator_notifier_call_chain); /** * regulator_mode_to_status - convert a regulator mode into a status * * @mode: Mode to convert * * Convert a regulator mode into a status. */ int regulator_mode_to_status(unsigned int mode) { switch (mode) { case REGULATOR_MODE_FAST: return REGULATOR_STATUS_FAST; case REGULATOR_MODE_NORMAL: return REGULATOR_STATUS_NORMAL; case REGULATOR_MODE_IDLE: return REGULATOR_STATUS_IDLE; case REGULATOR_MODE_STANDBY: return REGULATOR_STATUS_STANDBY; default: return REGULATOR_STATUS_UNDEFINED; } } EXPORT_SYMBOL_GPL(regulator_mode_to_status); static struct attribute *regulator_dev_attrs[] = { &dev_attr_name.attr, &dev_attr_num_users.attr, &dev_attr_type.attr, &dev_attr_microvolts.attr, &dev_attr_microamps.attr, &dev_attr_opmode.attr, &dev_attr_state.attr, &dev_attr_status.attr, &dev_attr_bypass.attr, &dev_attr_requested_microamps.attr, &dev_attr_min_microvolts.attr, &dev_attr_max_microvolts.attr, &dev_attr_min_microamps.attr, &dev_attr_max_microamps.attr, &dev_attr_under_voltage.attr, &dev_attr_over_current.attr, &dev_attr_regulation_out.attr, &dev_attr_fail.attr, &dev_attr_over_temp.attr, &dev_attr_under_voltage_warn.attr, &dev_attr_over_current_warn.attr, &dev_attr_over_voltage_warn.attr, &dev_attr_over_temp_warn.attr, &dev_attr_suspend_standby_state.attr, &dev_attr_suspend_mem_state.attr, &dev_attr_suspend_disk_state.attr, &dev_attr_suspend_standby_microvolts.attr, &dev_attr_suspend_mem_microvolts.attr, &dev_attr_suspend_disk_microvolts.attr, &dev_attr_suspend_standby_mode.attr, &dev_attr_suspend_mem_mode.attr, &dev_attr_suspend_disk_mode.attr, NULL }; /* * To avoid cluttering sysfs (and memory) with useless state, only * create attributes that can be meaningfully displayed. */ static umode_t regulator_attr_is_visible(struct kobject *kobj, struct attribute *attr, int idx) { struct device *dev = kobj_to_dev(kobj); struct regulator_dev *rdev = dev_to_rdev(dev); const struct regulator_ops *ops = rdev->desc->ops; umode_t mode = attr->mode; /* these three are always present */ if (attr == &dev_attr_name.attr || attr == &dev_attr_num_users.attr || attr == &dev_attr_type.attr) return mode; /* some attributes need specific methods to be displayed */ if (attr == &dev_attr_microvolts.attr) { if ((ops->get_voltage && ops->get_voltage(rdev) >= 0) || (ops->get_voltage_sel && ops->get_voltage_sel(rdev) >= 0) || (ops->list_voltage && ops->list_voltage(rdev, 0) >= 0) || (rdev->desc->fixed_uV && rdev->desc->n_voltages == 1)) return mode; return 0; } if (attr == &dev_attr_microamps.attr) return ops->get_current_limit ? mode : 0; if (attr == &dev_attr_opmode.attr) return ops->get_mode ? mode : 0; if (attr == &dev_attr_state.attr) return (rdev->ena_pin || ops->is_enabled) ? mode : 0; if (attr == &dev_attr_status.attr) return ops->get_status ? mode : 0; if (attr == &dev_attr_bypass.attr) return ops->get_bypass ? mode : 0; if (attr == &dev_attr_under_voltage.attr || attr == &dev_attr_over_current.attr || attr == &dev_attr_regulation_out.attr || attr == &dev_attr_fail.attr || attr == &dev_attr_over_temp.attr || attr == &dev_attr_under_voltage_warn.attr || attr == &dev_attr_over_current_warn.attr || attr == &dev_attr_over_voltage_warn.attr || attr == &dev_attr_over_temp_warn.attr) return ops->get_error_flags ? mode : 0; /* constraints need specific supporting methods */ if (attr == &dev_attr_min_microvolts.attr || attr == &dev_attr_max_microvolts.attr) return (ops->set_voltage || ops->set_voltage_sel) ? mode : 0; if (attr == &dev_attr_min_microamps.attr || attr == &dev_attr_max_microamps.attr) return ops->set_current_limit ? mode : 0; if (attr == &dev_attr_suspend_standby_state.attr || attr == &dev_attr_suspend_mem_state.attr || attr == &dev_attr_suspend_disk_state.attr) return mode; if (attr == &dev_attr_suspend_standby_microvolts.attr || attr == &dev_attr_suspend_mem_microvolts.attr || attr == &dev_attr_suspend_disk_microvolts.attr) return ops->set_suspend_voltage ? mode : 0; if (attr == &dev_attr_suspend_standby_mode.attr || attr == &dev_attr_suspend_mem_mode.attr || attr == &dev_attr_suspend_disk_mode.attr) return ops->set_suspend_mode ? mode : 0; return mode; } static const struct attribute_group regulator_dev_group = { .attrs = regulator_dev_attrs, .is_visible = regulator_attr_is_visible, }; static const struct attribute_group *regulator_dev_groups[] = { ®ulator_dev_group, NULL }; static void regulator_dev_release(struct device *dev) { struct regulator_dev *rdev = dev_get_drvdata(dev); debugfs_remove_recursive(rdev->debugfs); kfree(rdev->constraints); of_node_put(rdev->dev.of_node); kfree(rdev); } static void rdev_init_debugfs(struct regulator_dev *rdev) { struct device *parent = rdev->dev.parent; const char *rname = rdev_get_name(rdev); char name[NAME_MAX]; /* Avoid duplicate debugfs directory names */ if (parent && rname == rdev->desc->name) { snprintf(name, sizeof(name), "%s-%s", dev_name(parent), rname); rname = name; } rdev->debugfs = debugfs_create_dir(rname, debugfs_root); if (IS_ERR(rdev->debugfs)) rdev_dbg(rdev, "Failed to create debugfs directory\n"); debugfs_create_u32("use_count", 0444, rdev->debugfs, &rdev->use_count); debugfs_create_u32("open_count", 0444, rdev->debugfs, &rdev->open_count); debugfs_create_u32("bypass_count", 0444, rdev->debugfs, &rdev->bypass_count); } static int regulator_register_resolve_supply(struct device *dev, void *data) { struct regulator_dev *rdev = dev_to_rdev(dev); if (regulator_resolve_supply(rdev)) rdev_dbg(rdev, "unable to resolve supply\n"); return 0; } int regulator_coupler_register(struct regulator_coupler *coupler) { mutex_lock(®ulator_list_mutex); list_add_tail(&coupler->list, ®ulator_coupler_list); mutex_unlock(®ulator_list_mutex); return 0; } static struct regulator_coupler * regulator_find_coupler(struct regulator_dev *rdev) { struct regulator_coupler *coupler; int err; /* * Note that regulators are appended to the list and the generic * coupler is registered first, hence it will be attached at last * if nobody cared. */ list_for_each_entry_reverse(coupler, ®ulator_coupler_list, list) { err = coupler->attach_regulator(coupler, rdev); if (!err) { if (!coupler->balance_voltage && rdev->coupling_desc.n_coupled > 2) goto err_unsupported; return coupler; } if (err < 0) return ERR_PTR(err); if (err == 1) continue; break; } return ERR_PTR(-EINVAL); err_unsupported: if (coupler->detach_regulator) coupler->detach_regulator(coupler, rdev); rdev_err(rdev, "Voltage balancing for multiple regulator couples is unimplemented\n"); return ERR_PTR(-EPERM); } static void regulator_resolve_coupling(struct regulator_dev *rdev) { struct regulator_coupler *coupler = rdev->coupling_desc.coupler; struct coupling_desc *c_desc = &rdev->coupling_desc; int n_coupled = c_desc->n_coupled; struct regulator_dev *c_rdev; int i; for (i = 1; i < n_coupled; i++) { /* already resolved */ if (c_desc->coupled_rdevs[i]) continue; c_rdev = of_parse_coupled_regulator(rdev, i - 1); if (!c_rdev) continue; if (c_rdev->coupling_desc.coupler != coupler) { rdev_err(rdev, "coupler mismatch with %s\n", rdev_get_name(c_rdev)); return; } c_desc->coupled_rdevs[i] = c_rdev; c_desc->n_resolved++; regulator_resolve_coupling(c_rdev); } } static void regulator_remove_coupling(struct regulator_dev *rdev) { struct regulator_coupler *coupler = rdev->coupling_desc.coupler; struct coupling_desc *__c_desc, *c_desc = &rdev->coupling_desc; struct regulator_dev *__c_rdev, *c_rdev; unsigned int __n_coupled, n_coupled; int i, k; int err; n_coupled = c_desc->n_coupled; for (i = 1; i < n_coupled; i++) { c_rdev = c_desc->coupled_rdevs[i]; if (!c_rdev) continue; regulator_lock(c_rdev); __c_desc = &c_rdev->coupling_desc; __n_coupled = __c_desc->n_coupled; for (k = 1; k < __n_coupled; k++) { __c_rdev = __c_desc->coupled_rdevs[k]; if (__c_rdev == rdev) { __c_desc->coupled_rdevs[k] = NULL; __c_desc->n_resolved--; break; } } regulator_unlock(c_rdev); c_desc->coupled_rdevs[i] = NULL; c_desc->n_resolved--; } if (coupler && coupler->detach_regulator) { err = coupler->detach_regulator(coupler, rdev); if (err) rdev_err(rdev, "failed to detach from coupler: %pe\n", ERR_PTR(err)); } kfree(rdev->coupling_desc.coupled_rdevs); rdev->coupling_desc.coupled_rdevs = NULL; } static int regulator_init_coupling(struct regulator_dev *rdev) { struct regulator_dev **coupled; int err, n_phandles; if (!IS_ENABLED(CONFIG_OF)) n_phandles = 0; else n_phandles = of_get_n_coupled(rdev); coupled = kcalloc(n_phandles + 1, sizeof(*coupled), GFP_KERNEL); if (!coupled) return -ENOMEM; rdev->coupling_desc.coupled_rdevs = coupled; /* * Every regulator should always have coupling descriptor filled with * at least pointer to itself. */ rdev->coupling_desc.coupled_rdevs[0] = rdev; rdev->coupling_desc.n_coupled = n_phandles + 1; rdev->coupling_desc.n_resolved++; /* regulator isn't coupled */ if (n_phandles == 0) return 0; if (!of_check_coupling_data(rdev)) return -EPERM; mutex_lock(®ulator_list_mutex); rdev->coupling_desc.coupler = regulator_find_coupler(rdev); mutex_unlock(®ulator_list_mutex); if (IS_ERR(rdev->coupling_desc.coupler)) { err = PTR_ERR(rdev->coupling_desc.coupler); rdev_err(rdev, "failed to get coupler: %pe\n", ERR_PTR(err)); return err; } return 0; } static int generic_coupler_attach(struct regulator_coupler *coupler, struct regulator_dev *rdev) { if (rdev->coupling_desc.n_coupled > 2) { rdev_err(rdev, "Voltage balancing for multiple regulator couples is unimplemented\n"); return -EPERM; } if (!rdev->constraints->always_on) { rdev_err(rdev, "Coupling of a non always-on regulator is unimplemented\n"); return -ENOTSUPP; } return 0; } static struct regulator_coupler generic_regulator_coupler = { .attach_regulator = generic_coupler_attach, }; /** * regulator_register - register regulator * @dev: the device that drive the regulator * @regulator_desc: regulator to register * @cfg: runtime configuration for regulator * * Called by regulator drivers to register a regulator. * Returns a valid pointer to struct regulator_dev on success * or an ERR_PTR() on error. */ struct regulator_dev * regulator_register(struct device *dev, const struct regulator_desc *regulator_desc, const struct regulator_config *cfg) { const struct regulator_init_data *init_data; struct regulator_config *config = NULL; static atomic_t regulator_no = ATOMIC_INIT(-1); struct regulator_dev *rdev; bool dangling_cfg_gpiod = false; bool dangling_of_gpiod = false; int ret, i; bool resolved_early = false; if (cfg == NULL) return ERR_PTR(-EINVAL); if (cfg->ena_gpiod) dangling_cfg_gpiod = true; if (regulator_desc == NULL) { ret = -EINVAL; goto rinse; } WARN_ON(!dev || !cfg->dev); if (regulator_desc->name == NULL || regulator_desc->ops == NULL) { ret = -EINVAL; goto rinse; } if (regulator_desc->type != REGULATOR_VOLTAGE && regulator_desc->type != REGULATOR_CURRENT) { ret = -EINVAL; goto rinse; } /* Only one of each should be implemented */ WARN_ON(regulator_desc->ops->get_voltage && regulator_desc->ops->get_voltage_sel); WARN_ON(regulator_desc->ops->set_voltage && regulator_desc->ops->set_voltage_sel); /* If we're using selectors we must implement list_voltage. */ if (regulator_desc->ops->get_voltage_sel && !regulator_desc->ops->list_voltage) { ret = -EINVAL; goto rinse; } if (regulator_desc->ops->set_voltage_sel && !regulator_desc->ops->list_voltage) { ret = -EINVAL; goto rinse; } rdev = kzalloc(sizeof(struct regulator_dev), GFP_KERNEL); if (rdev == NULL) { ret = -ENOMEM; goto rinse; } device_initialize(&rdev->dev); dev_set_drvdata(&rdev->dev, rdev); rdev->dev.class = ®ulator_class; spin_lock_init(&rdev->err_lock); /* * Duplicate the config so the driver could override it after * parsing init data. */ config = kmemdup(cfg, sizeof(*cfg), GFP_KERNEL); if (config == NULL) { ret = -ENOMEM; goto clean; } init_data = regulator_of_get_init_data(dev, regulator_desc, config, &rdev->dev.of_node); /* * Sometimes not all resources are probed already so we need to take * that into account. This happens most the time if the ena_gpiod comes * from a gpio extender or something else. */ if (PTR_ERR(init_data) == -EPROBE_DEFER) { ret = -EPROBE_DEFER; goto clean; } /* * We need to keep track of any GPIO descriptor coming from the * device tree until we have handled it over to the core. If the * config that was passed in to this function DOES NOT contain * a descriptor, and the config after this call DOES contain * a descriptor, we definitely got one from parsing the device * tree. */ if (!cfg->ena_gpiod && config->ena_gpiod) dangling_of_gpiod = true; if (!init_data) { init_data = config->init_data; rdev->dev.of_node = of_node_get(config->of_node); } ww_mutex_init(&rdev->mutex, ®ulator_ww_class); rdev->reg_data = config->driver_data; rdev->owner = regulator_desc->owner; rdev->desc = regulator_desc; if (config->regmap) rdev->regmap = config->regmap; else if (dev_get_regmap(dev, NULL)) rdev->regmap = dev_get_regmap(dev, NULL); else if (dev->parent) rdev->regmap = dev_get_regmap(dev->parent, NULL); INIT_LIST_HEAD(&rdev->consumer_list); INIT_LIST_HEAD(&rdev->list); BLOCKING_INIT_NOTIFIER_HEAD(&rdev->notifier); INIT_DELAYED_WORK(&rdev->disable_work, regulator_disable_work); if (init_data && init_data->supply_regulator) rdev->supply_name = init_data->supply_regulator; else if (regulator_desc->supply_name) rdev->supply_name = regulator_desc->supply_name; /* register with sysfs */ rdev->dev.parent = config->dev; dev_set_name(&rdev->dev, "regulator.%lu", (unsigned long) atomic_inc_return(®ulator_no)); /* set regulator constraints */ if (init_data) rdev->constraints = kmemdup(&init_data->constraints, sizeof(*rdev->constraints), GFP_KERNEL); else rdev->constraints = kzalloc(sizeof(*rdev->constraints), GFP_KERNEL); if (!rdev->constraints) { ret = -ENOMEM; goto wash; } if ((rdev->supply_name && !rdev->supply) && (rdev->constraints->always_on || rdev->constraints->boot_on)) { ret = regulator_resolve_supply(rdev); if (ret) rdev_dbg(rdev, "unable to resolve supply early: %pe\n", ERR_PTR(ret)); resolved_early = true; } /* perform any regulator specific init */ if (init_data && init_data->regulator_init) { ret = init_data->regulator_init(rdev->reg_data); if (ret < 0) goto wash; } if (config->ena_gpiod) { ret = regulator_ena_gpio_request(rdev, config); if (ret != 0) { rdev_err(rdev, "Failed to request enable GPIO: %pe\n", ERR_PTR(ret)); goto wash; } /* The regulator core took over the GPIO descriptor */ dangling_cfg_gpiod = false; dangling_of_gpiod = false; } ret = set_machine_constraints(rdev); if (ret == -EPROBE_DEFER && !resolved_early) { /* Regulator might be in bypass mode and so needs its supply * to set the constraints */ /* FIXME: this currently triggers a chicken-and-egg problem * when creating -SUPPLY symlink in sysfs to a regulator * that is just being created */ rdev_dbg(rdev, "will resolve supply early: %s\n", rdev->supply_name); ret = regulator_resolve_supply(rdev); if (!ret) ret = set_machine_constraints(rdev); else rdev_dbg(rdev, "unable to resolve supply early: %pe\n", ERR_PTR(ret)); } if (ret < 0) goto wash; ret = regulator_init_coupling(rdev); if (ret < 0) goto wash; /* add consumers devices */ if (init_data) { for (i = 0; i < init_data->num_consumer_supplies; i++) { ret = set_consumer_device_supply(rdev, init_data->consumer_supplies[i].dev_name, init_data->consumer_supplies[i].supply); if (ret < 0) { dev_err(dev, "Failed to set supply %s\n", init_data->consumer_supplies[i].supply); goto unset_supplies; } } } if (!rdev->desc->ops->get_voltage && !rdev->desc->ops->list_voltage && !rdev->desc->fixed_uV) rdev->is_switch = true; ret = device_add(&rdev->dev); if (ret != 0) goto unset_supplies; rdev_init_debugfs(rdev); /* try to resolve regulators coupling since a new one was registered */ mutex_lock(®ulator_list_mutex); regulator_resolve_coupling(rdev); mutex_unlock(®ulator_list_mutex); /* try to resolve regulators supply since a new one was registered */ class_for_each_device(®ulator_class, NULL, NULL, regulator_register_resolve_supply); kfree(config); return rdev; unset_supplies: mutex_lock(®ulator_list_mutex); unset_regulator_supplies(rdev); regulator_remove_coupling(rdev); mutex_unlock(®ulator_list_mutex); wash: regulator_put(rdev->supply); kfree(rdev->coupling_desc.coupled_rdevs); mutex_lock(®ulator_list_mutex); regulator_ena_gpio_free(rdev); mutex_unlock(®ulator_list_mutex); clean: if (dangling_of_gpiod) gpiod_put(config->ena_gpiod); kfree(config); put_device(&rdev->dev); rinse: if (dangling_cfg_gpiod) gpiod_put(cfg->ena_gpiod); return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(regulator_register); /** * regulator_unregister - unregister regulator * @rdev: regulator to unregister * * Called by regulator drivers to unregister a regulator. */ void regulator_unregister(struct regulator_dev *rdev) { if (rdev == NULL) return; if (rdev->supply) { while (rdev->use_count--) regulator_disable(rdev->supply); regulator_put(rdev->supply); } flush_work(&rdev->disable_work.work); mutex_lock(®ulator_list_mutex); WARN_ON(rdev->open_count); regulator_remove_coupling(rdev); unset_regulator_supplies(rdev); list_del(&rdev->list); regulator_ena_gpio_free(rdev); device_unregister(&rdev->dev); mutex_unlock(®ulator_list_mutex); } EXPORT_SYMBOL_GPL(regulator_unregister); #ifdef CONFIG_SUSPEND /** * regulator_suspend - prepare regulators for system wide suspend * @dev: ``&struct device`` pointer that is passed to _regulator_suspend() * * Configure each regulator with it's suspend operating parameters for state. */ static int regulator_suspend(struct device *dev) { struct regulator_dev *rdev = dev_to_rdev(dev); suspend_state_t state = pm_suspend_target_state; int ret; const struct regulator_state *rstate; rstate = regulator_get_suspend_state_check(rdev, state); if (!rstate) return 0; regulator_lock(rdev); ret = __suspend_set_state(rdev, rstate); regulator_unlock(rdev); return ret; } static int regulator_resume(struct device *dev) { suspend_state_t state = pm_suspend_target_state; struct regulator_dev *rdev = dev_to_rdev(dev); struct regulator_state *rstate; int ret = 0; rstate = regulator_get_suspend_state(rdev, state); if (rstate == NULL) return 0; /* Avoid grabbing the lock if we don't need to */ if (!rdev->desc->ops->resume) return 0; regulator_lock(rdev); if (rstate->enabled == ENABLE_IN_SUSPEND || rstate->enabled == DISABLE_IN_SUSPEND) ret = rdev->desc->ops->resume(rdev); regulator_unlock(rdev); return ret; } #else /* !CONFIG_SUSPEND */ #define regulator_suspend NULL #define regulator_resume NULL #endif /* !CONFIG_SUSPEND */ #ifdef CONFIG_PM static const struct dev_pm_ops __maybe_unused regulator_pm_ops = { .suspend = regulator_suspend, .resume = regulator_resume, }; #endif const struct class regulator_class = { .name = "regulator", .dev_release = regulator_dev_release, .dev_groups = regulator_dev_groups, #ifdef CONFIG_PM .pm = ®ulator_pm_ops, #endif }; /** * regulator_has_full_constraints - the system has fully specified constraints * * Calling this function will cause the regulator API to disable all * regulators which have a zero use count and don't have an always_on * constraint in a late_initcall. * * The intention is that this will become the default behaviour in a * future kernel release so users are encouraged to use this facility * now. */ void regulator_has_full_constraints(void) { has_full_constraints = 1; } EXPORT_SYMBOL_GPL(regulator_has_full_constraints); /** * rdev_get_drvdata - get rdev regulator driver data * @rdev: regulator * * Get rdev regulator driver private data. This call can be used in the * regulator driver context. */ void *rdev_get_drvdata(struct regulator_dev *rdev) { return rdev->reg_data; } EXPORT_SYMBOL_GPL(rdev_get_drvdata); /** * regulator_get_drvdata - get regulator driver data * @regulator: regulator * * Get regulator driver private data. This call can be used in the consumer * driver context when non API regulator specific functions need to be called. */ void *regulator_get_drvdata(struct regulator *regulator) { return regulator->rdev->reg_data; } EXPORT_SYMBOL_GPL(regulator_get_drvdata); /** * regulator_set_drvdata - set regulator driver data * @regulator: regulator * @data: data */ void regulator_set_drvdata(struct regulator *regulator, void *data) { regulator->rdev->reg_data = data; } EXPORT_SYMBOL_GPL(regulator_set_drvdata); /** * rdev_get_id - get regulator ID * @rdev: regulator */ int rdev_get_id(struct regulator_dev *rdev) { return rdev->desc->id; } EXPORT_SYMBOL_GPL(rdev_get_id); struct device *rdev_get_dev(struct regulator_dev *rdev) { return &rdev->dev; } EXPORT_SYMBOL_GPL(rdev_get_dev); struct regmap *rdev_get_regmap(struct regulator_dev *rdev) { return rdev->regmap; } EXPORT_SYMBOL_GPL(rdev_get_regmap); void *regulator_get_init_drvdata(struct regulator_init_data *reg_init_data) { return reg_init_data->driver_data; } EXPORT_SYMBOL_GPL(regulator_get_init_drvdata); #ifdef CONFIG_DEBUG_FS static int supply_map_show(struct seq_file *sf, void *data) { struct regulator_map *map; list_for_each_entry(map, ®ulator_map_list, list) { seq_printf(sf, "%s -> %s.%s\n", rdev_get_name(map->regulator), map->dev_name, map->supply); } return 0; } DEFINE_SHOW_ATTRIBUTE(supply_map); struct summary_data { struct seq_file *s; struct regulator_dev *parent; int level; }; static void regulator_summary_show_subtree(struct seq_file *s, struct regulator_dev *rdev, int level); static int regulator_summary_show_children(struct device *dev, void *data) { struct regulator_dev *rdev = dev_to_rdev(dev); struct summary_data *summary_data = data; if (rdev->supply && rdev->supply->rdev == summary_data->parent) regulator_summary_show_subtree(summary_data->s, rdev, summary_data->level + 1); return 0; } static void regulator_summary_show_subtree(struct seq_file *s, struct regulator_dev *rdev, int level) { struct regulation_constraints *c; struct regulator *consumer; struct summary_data summary_data; unsigned int opmode; if (!rdev) return; opmode = _regulator_get_mode_unlocked(rdev); seq_printf(s, "%*s%-*s %3d %4d %6d %7s ", level * 3 + 1, "", 30 - level * 3, rdev_get_name(rdev), rdev->use_count, rdev->open_count, rdev->bypass_count, regulator_opmode_to_str(opmode)); seq_printf(s, "%5dmV ", regulator_get_voltage_rdev(rdev) / 1000); seq_printf(s, "%5dmA ", _regulator_get_current_limit_unlocked(rdev) / 1000); c = rdev->constraints; if (c) { switch (rdev->desc->type) { case REGULATOR_VOLTAGE: seq_printf(s, "%5dmV %5dmV ", c->min_uV / 1000, c->max_uV / 1000); break; case REGULATOR_CURRENT: seq_printf(s, "%5dmA %5dmA ", c->min_uA / 1000, c->max_uA / 1000); break; } } seq_puts(s, "\n"); list_for_each_entry(consumer, &rdev->consumer_list, list) { if (consumer->dev && consumer->dev->class == ®ulator_class) continue; seq_printf(s, "%*s%-*s ", (level + 1) * 3 + 1, "", 30 - (level + 1) * 3, consumer->supply_name ? consumer->supply_name : consumer->dev ? dev_name(consumer->dev) : "deviceless"); switch (rdev->desc->type) { case REGULATOR_VOLTAGE: seq_printf(s, "%3d %33dmA%c%5dmV %5dmV", consumer->enable_count, consumer->uA_load / 1000, consumer->uA_load && !consumer->enable_count ? '*' : ' ', consumer->voltage[PM_SUSPEND_ON].min_uV / 1000, consumer->voltage[PM_SUSPEND_ON].max_uV / 1000); break; case REGULATOR_CURRENT: break; } seq_puts(s, "\n"); } summary_data.s = s; summary_data.level = level; summary_data.parent = rdev; class_for_each_device(®ulator_class, NULL, &summary_data, regulator_summary_show_children); } struct summary_lock_data { struct ww_acquire_ctx *ww_ctx; struct regulator_dev **new_contended_rdev; struct regulator_dev **old_contended_rdev; }; static int regulator_summary_lock_one(struct device *dev, void *data) { struct regulator_dev *rdev = dev_to_rdev(dev); struct summary_lock_data *lock_data = data; int ret = 0; if (rdev != *lock_data->old_contended_rdev) { ret = regulator_lock_nested(rdev, lock_data->ww_ctx); if (ret == -EDEADLK) *lock_data->new_contended_rdev = rdev; else WARN_ON_ONCE(ret); } else { *lock_data->old_contended_rdev = NULL; } return ret; } static int regulator_summary_unlock_one(struct device *dev, void *data) { struct regulator_dev *rdev = dev_to_rdev(dev); struct summary_lock_data *lock_data = data; if (lock_data) { if (rdev == *lock_data->new_contended_rdev) return -EDEADLK; } regulator_unlock(rdev); return 0; } static int regulator_summary_lock_all(struct ww_acquire_ctx *ww_ctx, struct regulator_dev **new_contended_rdev, struct regulator_dev **old_contended_rdev) { struct summary_lock_data lock_data; int ret; lock_data.ww_ctx = ww_ctx; lock_data.new_contended_rdev = new_contended_rdev; lock_data.old_contended_rdev = old_contended_rdev; ret = class_for_each_device(®ulator_class, NULL, &lock_data, regulator_summary_lock_one); if (ret) class_for_each_device(®ulator_class, NULL, &lock_data, regulator_summary_unlock_one); return ret; } static void regulator_summary_lock(struct ww_acquire_ctx *ww_ctx) { struct regulator_dev *new_contended_rdev = NULL; struct regulator_dev *old_contended_rdev = NULL; int err; mutex_lock(®ulator_list_mutex); ww_acquire_init(ww_ctx, ®ulator_ww_class); do { if (new_contended_rdev) { ww_mutex_lock_slow(&new_contended_rdev->mutex, ww_ctx); old_contended_rdev = new_contended_rdev; old_contended_rdev->ref_cnt++; old_contended_rdev->mutex_owner = current; } err = regulator_summary_lock_all(ww_ctx, &new_contended_rdev, &old_contended_rdev); if (old_contended_rdev) regulator_unlock(old_contended_rdev); } while (err == -EDEADLK); ww_acquire_done(ww_ctx); } static void regulator_summary_unlock(struct ww_acquire_ctx *ww_ctx) { class_for_each_device(®ulator_class, NULL, NULL, regulator_summary_unlock_one); ww_acquire_fini(ww_ctx); mutex_unlock(®ulator_list_mutex); } static int regulator_summary_show_roots(struct device *dev, void *data) { struct regulator_dev *rdev = dev_to_rdev(dev); struct seq_file *s = data; if (!rdev->supply) regulator_summary_show_subtree(s, rdev, 0); return 0; } static int regulator_summary_show(struct seq_file *s, void *data) { struct ww_acquire_ctx ww_ctx; seq_puts(s, " regulator use open bypass opmode voltage current min max\n"); seq_puts(s, "---------------------------------------------------------------------------------------\n"); regulator_summary_lock(&ww_ctx); class_for_each_device(®ulator_class, NULL, s, regulator_summary_show_roots); regulator_summary_unlock(&ww_ctx); return 0; } DEFINE_SHOW_ATTRIBUTE(regulator_summary); #endif /* CONFIG_DEBUG_FS */ static int __init regulator_init(void) { int ret; ret = class_register(®ulator_class); debugfs_root = debugfs_create_dir("regulator", NULL); if (IS_ERR(debugfs_root)) pr_debug("regulator: Failed to create debugfs directory\n"); #ifdef CONFIG_DEBUG_FS debugfs_create_file("supply_map", 0444, debugfs_root, NULL, &supply_map_fops); debugfs_create_file("regulator_summary", 0444, debugfs_root, NULL, ®ulator_summary_fops); #endif regulator_dummy_init(); regulator_coupler_register(&generic_regulator_coupler); return ret; } /* init early to allow our consumers to complete system booting */ core_initcall(regulator_init); static int regulator_late_cleanup(struct device *dev, void *data) { struct regulator_dev *rdev = dev_to_rdev(dev); struct regulation_constraints *c = rdev->constraints; int ret; if (c && c->always_on) return 0; if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_STATUS)) return 0; regulator_lock(rdev); if (rdev->use_count) goto unlock; /* If reading the status failed, assume that it's off. */ if (_regulator_is_enabled(rdev) <= 0) goto unlock; if (have_full_constraints()) { /* We log since this may kill the system if it goes * wrong. */ rdev_info(rdev, "disabling\n"); ret = _regulator_do_disable(rdev); if (ret != 0) rdev_err(rdev, "couldn't disable: %pe\n", ERR_PTR(ret)); } else { /* The intention is that in future we will * assume that full constraints are provided * so warn even if we aren't going to do * anything here. */ rdev_warn(rdev, "incomplete constraints, leaving on\n"); } unlock: regulator_unlock(rdev); return 0; } static bool regulator_ignore_unused; static int __init regulator_ignore_unused_setup(char *__unused) { regulator_ignore_unused = true; return 1; } __setup("regulator_ignore_unused", regulator_ignore_unused_setup); static void regulator_init_complete_work_function(struct work_struct *work) { /* * Regulators may had failed to resolve their input supplies * when were registered, either because the input supply was * not registered yet or because its parent device was not * bound yet. So attempt to resolve the input supplies for * pending regulators before trying to disable unused ones. */ class_for_each_device(®ulator_class, NULL, NULL, regulator_register_resolve_supply); /* * For debugging purposes, it may be useful to prevent unused * regulators from being disabled. */ if (regulator_ignore_unused) { pr_warn("regulator: Not disabling unused regulators\n"); return; } /* If we have a full configuration then disable any regulators * we have permission to change the status for and which are * not in use or always_on. This is effectively the default * for DT and ACPI as they have full constraints. */ class_for_each_device(®ulator_class, NULL, NULL, regulator_late_cleanup); } static DECLARE_DELAYED_WORK(regulator_init_complete_work, regulator_init_complete_work_function); static int __init regulator_init_complete(void) { /* * Since DT doesn't provide an idiomatic mechanism for * enabling full constraints and since it's much more natural * with DT to provide them just assume that a DT enabled * system has full constraints. */ if (of_have_populated_dt()) has_full_constraints = true; /* * We punt completion for an arbitrary amount of time since * systems like distros will load many drivers from userspace * so consumers might not always be ready yet, this is * particularly an issue with laptops where this might bounce * the display off then on. Ideally we'd get a notification * from userspace when this happens but we don't so just wait * a bit and hope we waited long enough. It'd be better if * we'd only do this on systems that need it, and a kernel * command line option might be useful. */ schedule_delayed_work(®ulator_init_complete_work, msecs_to_jiffies(30000)); return 0; } late_initcall_sync(regulator_init_complete); |
68 255 14 162 263 283 260 255 276 241 239 15 9 8 24 17 15 16 12 8 8 263 243 8 29 260 1 22 249 255 24 12 11 258 290 236 23 11 9 107 78 16 6 20 22 21 14 11 11 225 246 14 241 236 230 228 143 6 13 130 5 133 129 9 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Scatterlist Cryptographic API. * * Copyright (c) 2002 James Morris <jmorris@intercode.com.au> * Copyright (c) 2002 David S. Miller (davem@redhat.com) * Copyright (c) 2005 Herbert Xu <herbert@gondor.apana.org.au> * * Portions derived from Cryptoapi, by Alexander Kjeldaas <astor@fast.no> * and Nettle, by Niels Möller. */ #include <linux/err.h> #include <linux/errno.h> #include <linux/jump_label.h> #include <linux/kernel.h> #include <linux/kmod.h> #include <linux/module.h> #include <linux/param.h> #include <linux/sched/signal.h> #include <linux/slab.h> #include <linux/string.h> #include <linux/completion.h> #include "internal.h" LIST_HEAD(crypto_alg_list); EXPORT_SYMBOL_GPL(crypto_alg_list); DECLARE_RWSEM(crypto_alg_sem); EXPORT_SYMBOL_GPL(crypto_alg_sem); BLOCKING_NOTIFIER_HEAD(crypto_chain); EXPORT_SYMBOL_GPL(crypto_chain); #ifndef CONFIG_CRYPTO_MANAGER_DISABLE_TESTS DEFINE_STATIC_KEY_FALSE(__crypto_boot_test_finished); EXPORT_SYMBOL_GPL(__crypto_boot_test_finished); #endif static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg); struct crypto_alg *crypto_mod_get(struct crypto_alg *alg) { return try_module_get(alg->cra_module) ? crypto_alg_get(alg) : NULL; } EXPORT_SYMBOL_GPL(crypto_mod_get); void crypto_mod_put(struct crypto_alg *alg) { struct module *module = alg->cra_module; crypto_alg_put(alg); module_put(module); } EXPORT_SYMBOL_GPL(crypto_mod_put); static struct crypto_alg *__crypto_alg_lookup(const char *name, u32 type, u32 mask) { struct crypto_alg *q, *alg = NULL; int best = -2; list_for_each_entry(q, &crypto_alg_list, cra_list) { int exact, fuzzy; if (crypto_is_moribund(q)) continue; if ((q->cra_flags ^ type) & mask) continue; if (crypto_is_larval(q) && !crypto_is_test_larval((struct crypto_larval *)q) && ((struct crypto_larval *)q)->mask != mask) continue; exact = !strcmp(q->cra_driver_name, name); fuzzy = !strcmp(q->cra_name, name); if (!exact && !(fuzzy && q->cra_priority > best)) continue; if (unlikely(!crypto_mod_get(q))) continue; best = q->cra_priority; if (alg) crypto_mod_put(alg); alg = q; if (exact) break; } return alg; } static void crypto_larval_destroy(struct crypto_alg *alg) { struct crypto_larval *larval = (void *)alg; BUG_ON(!crypto_is_larval(alg)); if (!IS_ERR_OR_NULL(larval->adult)) crypto_mod_put(larval->adult); kfree(larval); } struct crypto_larval *crypto_larval_alloc(const char *name, u32 type, u32 mask) { struct crypto_larval *larval; larval = kzalloc(sizeof(*larval), GFP_KERNEL); if (!larval) return ERR_PTR(-ENOMEM); larval->mask = mask; larval->alg.cra_flags = CRYPTO_ALG_LARVAL | type; larval->alg.cra_priority = -1; larval->alg.cra_destroy = crypto_larval_destroy; strscpy(larval->alg.cra_name, name, CRYPTO_MAX_ALG_NAME); init_completion(&larval->completion); return larval; } EXPORT_SYMBOL_GPL(crypto_larval_alloc); static struct crypto_alg *crypto_larval_add(const char *name, u32 type, u32 mask) { struct crypto_alg *alg; struct crypto_larval *larval; larval = crypto_larval_alloc(name, type, mask); if (IS_ERR(larval)) return ERR_CAST(larval); refcount_set(&larval->alg.cra_refcnt, 2); down_write(&crypto_alg_sem); alg = __crypto_alg_lookup(name, type, mask); if (!alg) { alg = &larval->alg; list_add(&alg->cra_list, &crypto_alg_list); } up_write(&crypto_alg_sem); if (alg != &larval->alg) { kfree(larval); if (crypto_is_larval(alg)) alg = crypto_larval_wait(alg); } return alg; } void crypto_larval_kill(struct crypto_alg *alg) { struct crypto_larval *larval = (void *)alg; down_write(&crypto_alg_sem); list_del(&alg->cra_list); up_write(&crypto_alg_sem); complete_all(&larval->completion); crypto_alg_put(alg); } EXPORT_SYMBOL_GPL(crypto_larval_kill); void crypto_wait_for_test(struct crypto_larval *larval) { int err; err = crypto_probing_notify(CRYPTO_MSG_ALG_REGISTER, larval->adult); if (WARN_ON_ONCE(err != NOTIFY_STOP)) goto out; err = wait_for_completion_killable(&larval->completion); WARN_ON(err); out: crypto_larval_kill(&larval->alg); } EXPORT_SYMBOL_GPL(crypto_wait_for_test); static void crypto_start_test(struct crypto_larval *larval) { if (!crypto_is_test_larval(larval)) return; if (larval->test_started) return; down_write(&crypto_alg_sem); if (larval->test_started) { up_write(&crypto_alg_sem); return; } larval->test_started = true; up_write(&crypto_alg_sem); crypto_wait_for_test(larval); } static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg) { struct crypto_larval *larval = (void *)alg; long timeout; if (!crypto_boot_test_finished()) crypto_start_test(larval); timeout = wait_for_completion_killable_timeout( &larval->completion, 60 * HZ); alg = larval->adult; if (timeout < 0) alg = ERR_PTR(-EINTR); else if (!timeout) alg = ERR_PTR(-ETIMEDOUT); else if (!alg) alg = ERR_PTR(-ENOENT); else if (IS_ERR(alg)) ; else if (crypto_is_test_larval(larval) && !(alg->cra_flags & CRYPTO_ALG_TESTED)) alg = ERR_PTR(-EAGAIN); else if (alg->cra_flags & CRYPTO_ALG_FIPS_INTERNAL) alg = ERR_PTR(-EAGAIN); else if (!crypto_mod_get(alg)) alg = ERR_PTR(-EAGAIN); crypto_mod_put(&larval->alg); return alg; } static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type, u32 mask) { const u32 fips = CRYPTO_ALG_FIPS_INTERNAL; struct crypto_alg *alg; u32 test = 0; if (!((type | mask) & CRYPTO_ALG_TESTED)) test |= CRYPTO_ALG_TESTED; down_read(&crypto_alg_sem); alg = __crypto_alg_lookup(name, (type | test) & ~fips, (mask | test) & ~fips); if (alg) { if (((type | mask) ^ fips) & fips) mask |= fips; mask &= fips; if (!crypto_is_larval(alg) && ((type ^ alg->cra_flags) & mask)) { /* Algorithm is disallowed in FIPS mode. */ crypto_mod_put(alg); alg = ERR_PTR(-ENOENT); } } else if (test) { alg = __crypto_alg_lookup(name, type, mask); if (alg && !crypto_is_larval(alg)) { /* Test failed */ crypto_mod_put(alg); alg = ERR_PTR(-ELIBBAD); } } up_read(&crypto_alg_sem); return alg; } static struct crypto_alg *crypto_larval_lookup(const char *name, u32 type, u32 mask) { struct crypto_alg *alg; if (!name) return ERR_PTR(-ENOENT); type &= ~(CRYPTO_ALG_LARVAL | CRYPTO_ALG_DEAD); mask &= ~(CRYPTO_ALG_LARVAL | CRYPTO_ALG_DEAD); alg = crypto_alg_lookup(name, type, mask); if (!alg && !(mask & CRYPTO_NOLOAD)) { request_module("crypto-%s", name); if (!((type ^ CRYPTO_ALG_NEED_FALLBACK) & mask & CRYPTO_ALG_NEED_FALLBACK)) request_module("crypto-%s-all", name); alg = crypto_alg_lookup(name, type, mask); } if (!IS_ERR_OR_NULL(alg) && crypto_is_larval(alg)) alg = crypto_larval_wait(alg); else if (!alg) alg = crypto_larval_add(name, type, mask); return alg; } int crypto_probing_notify(unsigned long val, void *v) { int ok; ok = blocking_notifier_call_chain(&crypto_chain, val, v); if (ok == NOTIFY_DONE) { request_module("cryptomgr"); ok = blocking_notifier_call_chain(&crypto_chain, val, v); } return ok; } EXPORT_SYMBOL_GPL(crypto_probing_notify); struct crypto_alg *crypto_alg_mod_lookup(const char *name, u32 type, u32 mask) { struct crypto_alg *alg; struct crypto_alg *larval; int ok; /* * If the internal flag is set for a cipher, require a caller to * invoke the cipher with the internal flag to use that cipher. * Also, if a caller wants to allocate a cipher that may or may * not be an internal cipher, use type | CRYPTO_ALG_INTERNAL and * !(mask & CRYPTO_ALG_INTERNAL). */ if (!((type | mask) & CRYPTO_ALG_INTERNAL)) mask |= CRYPTO_ALG_INTERNAL; larval = crypto_larval_lookup(name, type, mask); if (IS_ERR(larval) || !crypto_is_larval(larval)) return larval; ok = crypto_probing_notify(CRYPTO_MSG_ALG_REQUEST, larval); if (ok == NOTIFY_STOP) alg = crypto_larval_wait(larval); else { crypto_mod_put(larval); alg = ERR_PTR(-ENOENT); } crypto_larval_kill(larval); return alg; } EXPORT_SYMBOL_GPL(crypto_alg_mod_lookup); static void crypto_exit_ops(struct crypto_tfm *tfm) { const struct crypto_type *type = tfm->__crt_alg->cra_type; if (type && tfm->exit) tfm->exit(tfm); } static unsigned int crypto_ctxsize(struct crypto_alg *alg, u32 type, u32 mask) { const struct crypto_type *type_obj = alg->cra_type; unsigned int len; len = alg->cra_alignmask & ~(crypto_tfm_ctx_alignment() - 1); if (type_obj) return len + type_obj->ctxsize(alg, type, mask); switch (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) { default: BUG(); case CRYPTO_ALG_TYPE_CIPHER: len += crypto_cipher_ctxsize(alg); break; case CRYPTO_ALG_TYPE_COMPRESS: len += crypto_compress_ctxsize(alg); break; } return len; } void crypto_shoot_alg(struct crypto_alg *alg) { down_write(&crypto_alg_sem); alg->cra_flags |= CRYPTO_ALG_DYING; up_write(&crypto_alg_sem); } EXPORT_SYMBOL_GPL(crypto_shoot_alg); struct crypto_tfm *__crypto_alloc_tfmgfp(struct crypto_alg *alg, u32 type, u32 mask, gfp_t gfp) { struct crypto_tfm *tfm; unsigned int tfm_size; int err = -ENOMEM; tfm_size = sizeof(*tfm) + crypto_ctxsize(alg, type, mask); tfm = kzalloc(tfm_size, gfp); if (tfm == NULL) goto out_err; tfm->__crt_alg = alg; refcount_set(&tfm->refcnt, 1); if (!tfm->exit && alg->cra_init && (err = alg->cra_init(tfm))) goto cra_init_failed; goto out; cra_init_failed: crypto_exit_ops(tfm); if (err == -EAGAIN) crypto_shoot_alg(alg); kfree(tfm); out_err: tfm = ERR_PTR(err); out: return tfm; } EXPORT_SYMBOL_GPL(__crypto_alloc_tfmgfp); struct crypto_tfm *__crypto_alloc_tfm(struct crypto_alg *alg, u32 type, u32 mask) { return __crypto_alloc_tfmgfp(alg, type, mask, GFP_KERNEL); } EXPORT_SYMBOL_GPL(__crypto_alloc_tfm); /* * crypto_alloc_base - Locate algorithm and allocate transform * @alg_name: Name of algorithm * @type: Type of algorithm * @mask: Mask for type comparison * * This function should not be used by new algorithm types. * Please use crypto_alloc_tfm instead. * * crypto_alloc_base() will first attempt to locate an already loaded * algorithm. If that fails and the kernel supports dynamically loadable * modules, it will then attempt to load a module of the same name or * alias. If that fails it will send a query to any loaded crypto manager * to construct an algorithm on the fly. A refcount is grabbed on the * algorithm which is then associated with the new transform. * * The returned transform is of a non-determinate type. Most people * should use one of the more specific allocation functions such as * crypto_alloc_skcipher(). * * In case of error the return value is an error pointer. */ struct crypto_tfm *crypto_alloc_base(const char *alg_name, u32 type, u32 mask) { struct crypto_tfm *tfm; int err; for (;;) { struct crypto_alg *alg; alg = crypto_alg_mod_lookup(alg_name, type, mask); if (IS_ERR(alg)) { err = PTR_ERR(alg); goto err; } tfm = __crypto_alloc_tfm(alg, type, mask); if (!IS_ERR(tfm)) return tfm; crypto_mod_put(alg); err = PTR_ERR(tfm); err: if (err != -EAGAIN) break; if (fatal_signal_pending(current)) { err = -EINTR; break; } } return ERR_PTR(err); } EXPORT_SYMBOL_GPL(crypto_alloc_base); static void *crypto_alloc_tfmmem(struct crypto_alg *alg, const struct crypto_type *frontend, int node, gfp_t gfp) { struct crypto_tfm *tfm; unsigned int tfmsize; unsigned int total; char *mem; tfmsize = frontend->tfmsize; total = tfmsize + sizeof(*tfm) + frontend->extsize(alg); mem = kzalloc_node(total, gfp, node); if (mem == NULL) return ERR_PTR(-ENOMEM); tfm = (struct crypto_tfm *)(mem + tfmsize); tfm->__crt_alg = alg; tfm->node = node; refcount_set(&tfm->refcnt, 1); return mem; } void *crypto_create_tfm_node(struct crypto_alg *alg, const struct crypto_type *frontend, int node) { struct crypto_tfm *tfm; char *mem; int err; mem = crypto_alloc_tfmmem(alg, frontend, node, GFP_KERNEL); if (IS_ERR(mem)) goto out; tfm = (struct crypto_tfm *)(mem + frontend->tfmsize); err = frontend->init_tfm(tfm); if (err) goto out_free_tfm; if (!tfm->exit && alg->cra_init && (err = alg->cra_init(tfm))) goto cra_init_failed; goto out; cra_init_failed: crypto_exit_ops(tfm); out_free_tfm: if (err == -EAGAIN) crypto_shoot_alg(alg); kfree(mem); mem = ERR_PTR(err); out: return mem; } EXPORT_SYMBOL_GPL(crypto_create_tfm_node); void *crypto_clone_tfm(const struct crypto_type *frontend, struct crypto_tfm *otfm) { struct crypto_alg *alg = otfm->__crt_alg; struct crypto_tfm *tfm; char *mem; mem = ERR_PTR(-ESTALE); if (unlikely(!crypto_mod_get(alg))) goto out; mem = crypto_alloc_tfmmem(alg, frontend, otfm->node, GFP_ATOMIC); if (IS_ERR(mem)) { crypto_mod_put(alg); goto out; } tfm = (struct crypto_tfm *)(mem + frontend->tfmsize); tfm->crt_flags = otfm->crt_flags; tfm->exit = otfm->exit; out: return mem; } EXPORT_SYMBOL_GPL(crypto_clone_tfm); struct crypto_alg *crypto_find_alg(const char *alg_name, const struct crypto_type *frontend, u32 type, u32 mask) { if (frontend) { type &= frontend->maskclear; mask &= frontend->maskclear; type |= frontend->type; mask |= frontend->maskset; } return crypto_alg_mod_lookup(alg_name, type, mask); } EXPORT_SYMBOL_GPL(crypto_find_alg); /* * crypto_alloc_tfm_node - Locate algorithm and allocate transform * @alg_name: Name of algorithm * @frontend: Frontend algorithm type * @type: Type of algorithm * @mask: Mask for type comparison * @node: NUMA node in which users desire to put requests, if node is * NUMA_NO_NODE, it means users have no special requirement. * * crypto_alloc_tfm() will first attempt to locate an already loaded * algorithm. If that fails and the kernel supports dynamically loadable * modules, it will then attempt to load a module of the same name or * alias. If that fails it will send a query to any loaded crypto manager * to construct an algorithm on the fly. A refcount is grabbed on the * algorithm which is then associated with the new transform. * * The returned transform is of a non-determinate type. Most people * should use one of the more specific allocation functions such as * crypto_alloc_skcipher(). * * In case of error the return value is an error pointer. */ void *crypto_alloc_tfm_node(const char *alg_name, const struct crypto_type *frontend, u32 type, u32 mask, int node) { void *tfm; int err; for (;;) { struct crypto_alg *alg; alg = crypto_find_alg(alg_name, frontend, type, mask); if (IS_ERR(alg)) { err = PTR_ERR(alg); goto err; } tfm = crypto_create_tfm_node(alg, frontend, node); if (!IS_ERR(tfm)) return tfm; crypto_mod_put(alg); err = PTR_ERR(tfm); err: if (err != -EAGAIN) break; if (fatal_signal_pending(current)) { err = -EINTR; break; } } return ERR_PTR(err); } EXPORT_SYMBOL_GPL(crypto_alloc_tfm_node); /* * crypto_destroy_tfm - Free crypto transform * @mem: Start of tfm slab * @tfm: Transform to free * * This function frees up the transform and any associated resources, * then drops the refcount on the associated algorithm. */ void crypto_destroy_tfm(void *mem, struct crypto_tfm *tfm) { struct crypto_alg *alg; if (IS_ERR_OR_NULL(mem)) return; if (!refcount_dec_and_test(&tfm->refcnt)) return; alg = tfm->__crt_alg; if (!tfm->exit && alg->cra_exit) alg->cra_exit(tfm); crypto_exit_ops(tfm); crypto_mod_put(alg); kfree_sensitive(mem); } EXPORT_SYMBOL_GPL(crypto_destroy_tfm); int crypto_has_alg(const char *name, u32 type, u32 mask) { int ret = 0; struct crypto_alg *alg = crypto_alg_mod_lookup(name, type, mask); if (!IS_ERR(alg)) { crypto_mod_put(alg); ret = 1; } return ret; } EXPORT_SYMBOL_GPL(crypto_has_alg); void crypto_req_done(void *data, int err) { struct crypto_wait *wait = data; if (err == -EINPROGRESS) return; wait->err = err; complete(&wait->completion); } EXPORT_SYMBOL_GPL(crypto_req_done); MODULE_DESCRIPTION("Cryptographic core API"); MODULE_LICENSE("GPL"); |
1 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 | // SPDX-License-Identifier: GPL-2.0-or-later /* * HID driver for some logitech "special" devices * * Copyright (c) 1999 Andreas Gal * Copyright (c) 2000-2005 Vojtech Pavlik <vojtech@suse.cz> * Copyright (c) 2005 Michael Haboustak <mike-@cinci.rr.com> for Concept2, Inc * Copyright (c) 2006-2007 Jiri Kosina * Copyright (c) 2008 Jiri Slaby * Copyright (c) 2010 Hendrik Iben */ /* */ #include <linux/device.h> #include <linux/hid.h> #include <linux/module.h> #include <linux/random.h> #include <linux/sched.h> #include <linux/usb.h> #include <linux/wait.h> #include "usbhid/usbhid.h" #include "hid-ids.h" #include "hid-lg.h" #include "hid-lg4ff.h" #define LG_RDESC 0x001 #define LG_BAD_RELATIVE_KEYS 0x002 #define LG_DUPLICATE_USAGES 0x004 #define LG_EXPANDED_KEYMAP 0x010 #define LG_IGNORE_DOUBLED_WHEEL 0x020 #define LG_WIRELESS 0x040 #define LG_INVERT_HWHEEL 0x080 #define LG_NOGET 0x100 #define LG_FF 0x200 #define LG_FF2 0x400 #define LG_RDESC_REL_ABS 0x800 #define LG_FF3 0x1000 #define LG_FF4 0x2000 /* Size of the original descriptors of the Driving Force (and Pro) wheels */ #define DF_RDESC_ORIG_SIZE 130 #define DFP_RDESC_ORIG_SIZE 97 #define FV_RDESC_ORIG_SIZE 130 #define MOMO_RDESC_ORIG_SIZE 87 #define MOMO2_RDESC_ORIG_SIZE 87 #define FFG_RDESC_ORIG_SIZE 85 #define FG_RDESC_ORIG_SIZE 82 /* Fixed report descriptors for Logitech Driving Force (and Pro) * wheel controllers * * The original descriptors hide the separate throttle and brake axes in * a custom vendor usage page, providing only a combined value as * GenericDesktop.Y. * These descriptors remove the combined Y axis and instead report * separate throttle (Y) and brake (RZ). */ static __u8 df_rdesc_fixed[] = { 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x04, /* Usage (Joystick), */ 0xA1, 0x01, /* Collection (Application), */ 0xA1, 0x02, /* Collection (Logical), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x0A, /* Report Size (10), */ 0x14, /* Logical Minimum (0), */ 0x26, 0xFF, 0x03, /* Logical Maximum (1023), */ 0x34, /* Physical Minimum (0), */ 0x46, 0xFF, 0x03, /* Physical Maximum (1023), */ 0x09, 0x30, /* Usage (X), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x0C, /* Report Count (12), */ 0x75, 0x01, /* Report Size (1), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x05, 0x09, /* Usage (Buttons), */ 0x19, 0x01, /* Usage Minimum (1), */ 0x29, 0x0c, /* Usage Maximum (12), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x02, /* Report Count (2), */ 0x06, 0x00, 0xFF, /* Usage Page (Vendor: 65280), */ 0x09, 0x01, /* Usage (?: 1), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x08, /* Report Size (8), */ 0x81, 0x02, /* Input (Variable), */ 0x25, 0x07, /* Logical Maximum (7), */ 0x46, 0x3B, 0x01, /* Physical Maximum (315), */ 0x75, 0x04, /* Report Size (4), */ 0x65, 0x14, /* Unit (Degrees), */ 0x09, 0x39, /* Usage (Hat Switch), */ 0x81, 0x42, /* Input (Variable, Null State), */ 0x75, 0x01, /* Report Size (1), */ 0x95, 0x04, /* Report Count (4), */ 0x65, 0x00, /* Unit (none), */ 0x06, 0x00, 0xFF, /* Usage Page (Vendor: 65280), */ 0x09, 0x01, /* Usage (?: 1), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x08, /* Report Size (8), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x09, 0x31, /* Usage (Y), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x35, /* Usage (Rz), */ 0x81, 0x02, /* Input (Variable), */ 0xC0, /* End Collection, */ 0xA1, 0x02, /* Collection (Logical), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x95, 0x07, /* Report Count (7), */ 0x75, 0x08, /* Report Size (8), */ 0x09, 0x03, /* Usage (?: 3), */ 0x91, 0x02, /* Output (Variable), */ 0xC0, /* End Collection, */ 0xC0 /* End Collection */ }; static __u8 dfp_rdesc_fixed[] = { 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x04, /* Usage (Joystick), */ 0xA1, 0x01, /* Collection (Application), */ 0xA1, 0x02, /* Collection (Logical), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x0E, /* Report Size (14), */ 0x14, /* Logical Minimum (0), */ 0x26, 0xFF, 0x3F, /* Logical Maximum (16383), */ 0x34, /* Physical Minimum (0), */ 0x46, 0xFF, 0x3F, /* Physical Maximum (16383), */ 0x09, 0x30, /* Usage (X), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x0E, /* Report Count (14), */ 0x75, 0x01, /* Report Size (1), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x05, 0x09, /* Usage Page (Button), */ 0x19, 0x01, /* Usage Minimum (01h), */ 0x29, 0x0E, /* Usage Maximum (0Eh), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x04, /* Report Size (4), */ 0x25, 0x07, /* Logical Maximum (7), */ 0x46, 0x3B, 0x01, /* Physical Maximum (315), */ 0x65, 0x14, /* Unit (Degrees), */ 0x09, 0x39, /* Usage (Hat Switch), */ 0x81, 0x42, /* Input (Variable, Nullstate), */ 0x65, 0x00, /* Unit, */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x75, 0x08, /* Report Size (8), */ 0x81, 0x01, /* Input (Constant), */ 0x09, 0x31, /* Usage (Y), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x35, /* Usage (Rz), */ 0x81, 0x02, /* Input (Variable), */ 0x81, 0x01, /* Input (Constant), */ 0xC0, /* End Collection, */ 0xA1, 0x02, /* Collection (Logical), */ 0x09, 0x02, /* Usage (02h), */ 0x95, 0x07, /* Report Count (7), */ 0x91, 0x02, /* Output (Variable), */ 0xC0, /* End Collection, */ 0xC0 /* End Collection */ }; static __u8 fv_rdesc_fixed[] = { 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x04, /* Usage (Joystick), */ 0xA1, 0x01, /* Collection (Application), */ 0xA1, 0x02, /* Collection (Logical), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x0A, /* Report Size (10), */ 0x15, 0x00, /* Logical Minimum (0), */ 0x26, 0xFF, 0x03, /* Logical Maximum (1023), */ 0x35, 0x00, /* Physical Minimum (0), */ 0x46, 0xFF, 0x03, /* Physical Maximum (1023), */ 0x09, 0x30, /* Usage (X), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x0C, /* Report Count (12), */ 0x75, 0x01, /* Report Size (1), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x05, 0x09, /* Usage Page (Button), */ 0x19, 0x01, /* Usage Minimum (01h), */ 0x29, 0x0C, /* Usage Maximum (0Ch), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x02, /* Report Count (2), */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x09, 0x01, /* Usage (01h), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x02, /* Usage (02h), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x08, /* Report Size (8), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x25, 0x07, /* Logical Maximum (7), */ 0x46, 0x3B, 0x01, /* Physical Maximum (315), */ 0x75, 0x04, /* Report Size (4), */ 0x65, 0x14, /* Unit (Degrees), */ 0x09, 0x39, /* Usage (Hat Switch), */ 0x81, 0x42, /* Input (Variable, Null State), */ 0x75, 0x01, /* Report Size (1), */ 0x95, 0x04, /* Report Count (4), */ 0x65, 0x00, /* Unit, */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x09, 0x01, /* Usage (01h), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x08, /* Report Size (8), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x09, 0x31, /* Usage (Y), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x32, /* Usage (Z), */ 0x81, 0x02, /* Input (Variable), */ 0xC0, /* End Collection, */ 0xA1, 0x02, /* Collection (Logical), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x95, 0x07, /* Report Count (7), */ 0x75, 0x08, /* Report Size (8), */ 0x09, 0x03, /* Usage (03h), */ 0x91, 0x02, /* Output (Variable), */ 0xC0, /* End Collection, */ 0xC0 /* End Collection */ }; static __u8 momo_rdesc_fixed[] = { 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x04, /* Usage (Joystick), */ 0xA1, 0x01, /* Collection (Application), */ 0xA1, 0x02, /* Collection (Logical), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x0A, /* Report Size (10), */ 0x15, 0x00, /* Logical Minimum (0), */ 0x26, 0xFF, 0x03, /* Logical Maximum (1023), */ 0x35, 0x00, /* Physical Minimum (0), */ 0x46, 0xFF, 0x03, /* Physical Maximum (1023), */ 0x09, 0x30, /* Usage (X), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x08, /* Report Count (8), */ 0x75, 0x01, /* Report Size (1), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x05, 0x09, /* Usage Page (Button), */ 0x19, 0x01, /* Usage Minimum (01h), */ 0x29, 0x08, /* Usage Maximum (08h), */ 0x81, 0x02, /* Input (Variable), */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x75, 0x0E, /* Report Size (14), */ 0x95, 0x01, /* Report Count (1), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x09, 0x00, /* Usage (00h), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x75, 0x08, /* Report Size (8), */ 0x09, 0x31, /* Usage (Y), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x32, /* Usage (Z), */ 0x81, 0x02, /* Input (Variable), */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x09, 0x01, /* Usage (01h), */ 0x81, 0x02, /* Input (Variable), */ 0xC0, /* End Collection, */ 0xA1, 0x02, /* Collection (Logical), */ 0x09, 0x02, /* Usage (02h), */ 0x95, 0x07, /* Report Count (7), */ 0x91, 0x02, /* Output (Variable), */ 0xC0, /* End Collection, */ 0xC0 /* End Collection */ }; static __u8 momo2_rdesc_fixed[] = { 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x04, /* Usage (Joystick), */ 0xA1, 0x01, /* Collection (Application), */ 0xA1, 0x02, /* Collection (Logical), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x0A, /* Report Size (10), */ 0x15, 0x00, /* Logical Minimum (0), */ 0x26, 0xFF, 0x03, /* Logical Maximum (1023), */ 0x35, 0x00, /* Physical Minimum (0), */ 0x46, 0xFF, 0x03, /* Physical Maximum (1023), */ 0x09, 0x30, /* Usage (X), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x0A, /* Report Count (10), */ 0x75, 0x01, /* Report Size (1), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x05, 0x09, /* Usage Page (Button), */ 0x19, 0x01, /* Usage Minimum (01h), */ 0x29, 0x0A, /* Usage Maximum (0Ah), */ 0x81, 0x02, /* Input (Variable), */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x09, 0x00, /* Usage (00h), */ 0x95, 0x04, /* Report Count (4), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x08, /* Report Size (8), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x09, 0x01, /* Usage (01h), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x31, /* Usage (Y), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x32, /* Usage (Z), */ 0x81, 0x02, /* Input (Variable), */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x09, 0x00, /* Usage (00h), */ 0x81, 0x02, /* Input (Variable), */ 0xC0, /* End Collection, */ 0xA1, 0x02, /* Collection (Logical), */ 0x09, 0x02, /* Usage (02h), */ 0x95, 0x07, /* Report Count (7), */ 0x91, 0x02, /* Output (Variable), */ 0xC0, /* End Collection, */ 0xC0 /* End Collection */ }; static __u8 ffg_rdesc_fixed[] = { 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x04, /* Usage (Joystik), */ 0xA1, 0x01, /* Collection (Application), */ 0xA1, 0x02, /* Collection (Logical), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x0A, /* Report Size (10), */ 0x15, 0x00, /* Logical Minimum (0), */ 0x26, 0xFF, 0x03, /* Logical Maximum (1023), */ 0x35, 0x00, /* Physical Minimum (0), */ 0x46, 0xFF, 0x03, /* Physical Maximum (1023), */ 0x09, 0x30, /* Usage (X), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x06, /* Report Count (6), */ 0x75, 0x01, /* Report Size (1), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x05, 0x09, /* Usage Page (Button), */ 0x19, 0x01, /* Usage Minimum (01h), */ 0x29, 0x06, /* Usage Maximum (06h), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x08, /* Report Size (8), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x09, 0x01, /* Usage (01h), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0x81, 0x01, /* Input (Constant), */ 0x09, 0x31, /* Usage (Y), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x32, /* Usage (Z), */ 0x81, 0x02, /* Input (Variable), */ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */ 0x09, 0x01, /* Usage (01h), */ 0x81, 0x02, /* Input (Variable), */ 0xC0, /* End Collection, */ 0xA1, 0x02, /* Collection (Logical), */ 0x09, 0x02, /* Usage (02h), */ 0x95, 0x07, /* Report Count (7), */ 0x91, 0x02, /* Output (Variable), */ 0xC0, /* End Collection, */ 0xC0 /* End Collection */ }; static __u8 fg_rdesc_fixed[] = { 0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x04, /* Usage (Joystik), */ 0xA1, 0x01, /* Collection (Application), */ 0xA1, 0x02, /* Collection (Logical), */ 0x15, 0x00, /* Logical Minimum (0), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x35, 0x00, /* Physical Minimum (0), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x75, 0x08, /* Report Size (8), */ 0x95, 0x01, /* Report Count (1), */ 0x09, 0x30, /* Usage (X), */ 0x81, 0x02, /* Input (Variable), */ 0xA4, /* Push, */ 0x25, 0x01, /* Logical Maximum (1), */ 0x45, 0x01, /* Physical Maximum (1), */ 0x75, 0x01, /* Report Size (1), */ 0x95, 0x02, /* Report Count (2), */ 0x81, 0x01, /* Input (Constant), */ 0x95, 0x06, /* Report Count (6), */ 0x05, 0x09, /* Usage Page (Button), */ 0x19, 0x01, /* Usage Minimum (01h), */ 0x29, 0x06, /* Usage Maximum (06h), */ 0x81, 0x02, /* Input (Variable), */ 0x05, 0x01, /* Usage Page (Desktop), */ 0xB4, /* Pop, */ 0x81, 0x02, /* Input (Constant), */ 0x09, 0x31, /* Usage (Y), */ 0x81, 0x02, /* Input (Variable), */ 0x09, 0x32, /* Usage (Z), */ 0x81, 0x02, /* Input (Variable), */ 0xC0, /* End Collection, */ 0xA1, 0x02, /* Collection (Logical), */ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */ 0x46, 0xFF, 0x00, /* Physical Maximum (255), */ 0x75, 0x08, /* Report Size (8), */ 0x95, 0x04, /* Report Count (4), */ 0x09, 0x02, /* Usage (02h), */ 0xB1, 0x02, /* Feature (Variable), */ 0xC0, /* End Collection, */ 0xC0 /* End Collection, */ }; /* * Certain Logitech keyboards send in report #3 keys which are far * above the logical maximum described in descriptor. This extends * the original value of 0x28c of logical maximum to 0x104d */ static __u8 *lg_report_fixup(struct hid_device *hdev, __u8 *rdesc, unsigned int *rsize) { struct lg_drv_data *drv_data = hid_get_drvdata(hdev); if ((drv_data->quirks & LG_RDESC) && *rsize >= 91 && rdesc[83] == 0x26 && rdesc[84] == 0x8c && rdesc[85] == 0x02) { hid_info(hdev, "fixing up Logitech keyboard report descriptor\n"); rdesc[84] = rdesc[89] = 0x4d; rdesc[85] = rdesc[90] = 0x10; } if ((drv_data->quirks & LG_RDESC_REL_ABS) && *rsize >= 51 && rdesc[32] == 0x81 && rdesc[33] == 0x06 && rdesc[49] == 0x81 && rdesc[50] == 0x06) { hid_info(hdev, "fixing up rel/abs in Logitech report descriptor\n"); rdesc[33] = rdesc[50] = 0x02; } switch (hdev->product) { case USB_DEVICE_ID_LOGITECH_WINGMAN_FG: if (*rsize == FG_RDESC_ORIG_SIZE) { hid_info(hdev, "fixing up Logitech Wingman Formula GP report descriptor\n"); rdesc = fg_rdesc_fixed; *rsize = sizeof(fg_rdesc_fixed); } else { hid_info(hdev, "rdesc size test failed for formula gp\n"); } break; case USB_DEVICE_ID_LOGITECH_WINGMAN_FFG: if (*rsize == FFG_RDESC_ORIG_SIZE) { hid_info(hdev, "fixing up Logitech Wingman Formula Force GP report descriptor\n"); rdesc = ffg_rdesc_fixed; *rsize = sizeof(ffg_rdesc_fixed); } break; /* Several wheels report as this id when operating in emulation mode. */ case USB_DEVICE_ID_LOGITECH_WHEEL: if (*rsize == DF_RDESC_ORIG_SIZE) { hid_info(hdev, "fixing up Logitech Driving Force report descriptor\n"); rdesc = df_rdesc_fixed; *rsize = sizeof(df_rdesc_fixed); } break; case USB_DEVICE_ID_LOGITECH_MOMO_WHEEL: if (*rsize == MOMO_RDESC_ORIG_SIZE) { hid_info(hdev, "fixing up Logitech Momo Force (Red) report descriptor\n"); rdesc = momo_rdesc_fixed; *rsize = sizeof(momo_rdesc_fixed); } break; case USB_DEVICE_ID_LOGITECH_MOMO_WHEEL2: if (*rsize == MOMO2_RDESC_ORIG_SIZE) { hid_info(hdev, "fixing up Logitech Momo Racing Force (Black) report descriptor\n"); rdesc = momo2_rdesc_fixed; *rsize = sizeof(momo2_rdesc_fixed); } break; case USB_DEVICE_ID_LOGITECH_VIBRATION_WHEEL: if (*rsize == FV_RDESC_ORIG_SIZE) { hid_info(hdev, "fixing up Logitech Formula Vibration report descriptor\n"); rdesc = fv_rdesc_fixed; *rsize = sizeof(fv_rdesc_fixed); } break; case USB_DEVICE_ID_LOGITECH_DFP_WHEEL: if (*rsize == DFP_RDESC_ORIG_SIZE) { hid_info(hdev, "fixing up Logitech Driving Force Pro report descriptor\n"); rdesc = dfp_rdesc_fixed; *rsize = sizeof(dfp_rdesc_fixed); } break; case USB_DEVICE_ID_LOGITECH_WII_WHEEL: if (*rsize >= 101 && rdesc[41] == 0x95 && rdesc[42] == 0x0B && rdesc[47] == 0x05 && rdesc[48] == 0x09) { hid_info(hdev, "fixing up Logitech Speed Force Wireless report descriptor\n"); rdesc[41] = 0x05; rdesc[42] = 0x09; rdesc[47] = 0x95; rdesc[48] = 0x0B; } break; } return rdesc; } #define lg_map_key_clear(c) hid_map_usage_clear(hi, usage, bit, max, \ EV_KEY, (c)) static int lg_ultrax_remote_mapping(struct hid_input *hi, struct hid_usage *usage, unsigned long **bit, int *max) { if ((usage->hid & HID_USAGE_PAGE) != HID_UP_LOGIVENDOR) return 0; set_bit(EV_REP, hi->input->evbit); switch (usage->hid & HID_USAGE) { /* Reported on Logitech Ultra X Media Remote */ case 0x004: lg_map_key_clear(KEY_AGAIN); break; case 0x00d: lg_map_key_clear(KEY_HOME); break; case 0x024: lg_map_key_clear(KEY_SHUFFLE); break; case 0x025: lg_map_key_clear(KEY_TV); break; case 0x026: lg_map_key_clear(KEY_MENU); break; case 0x031: lg_map_key_clear(KEY_AUDIO); break; case 0x032: lg_map_key_clear(KEY_TEXT); break; case 0x033: lg_map_key_clear(KEY_LAST); break; case 0x047: lg_map_key_clear(KEY_MP3); break; case 0x048: lg_map_key_clear(KEY_DVD); break; case 0x049: lg_map_key_clear(KEY_MEDIA); break; case 0x04a: lg_map_key_clear(KEY_VIDEO); break; case 0x04b: lg_map_key_clear(KEY_ANGLE); break; case 0x04c: lg_map_key_clear(KEY_LANGUAGE); break; case 0x04d: lg_map_key_clear(KEY_SUBTITLE); break; case 0x051: lg_map_key_clear(KEY_RED); break; case 0x052: lg_map_key_clear(KEY_CLOSE); break; default: return 0; } return 1; } static int lg_wireless_mapping(struct hid_input *hi, struct hid_usage *usage, unsigned long **bit, int *max) { if ((usage->hid & HID_USAGE_PAGE) != HID_UP_CONSUMER) return 0; switch (usage->hid & HID_USAGE) { case 0x1001: lg_map_key_clear(KEY_MESSENGER); break; case 0x1003: lg_map_key_clear(KEY_SOUND); break; case 0x1004: lg_map_key_clear(KEY_VIDEO); break; case 0x1005: lg_map_key_clear(KEY_AUDIO); break; case 0x100a: lg_map_key_clear(KEY_DOCUMENTS); break; /* The following two entries are Playlist 1 and 2 on the MX3200 */ case 0x100f: lg_map_key_clear(KEY_FN_1); break; case 0x1010: lg_map_key_clear(KEY_FN_2); break; case 0x1011: lg_map_key_clear(KEY_PREVIOUSSONG); break; case 0x1012: lg_map_key_clear(KEY_NEXTSONG); break; case 0x1013: lg_map_key_clear(KEY_CAMERA); break; case 0x1014: lg_map_key_clear(KEY_MESSENGER); break; case 0x1015: lg_map_key_clear(KEY_RECORD); break; case 0x1016: lg_map_key_clear(KEY_PLAYER); break; case 0x1017: lg_map_key_clear(KEY_EJECTCD); break; case 0x1018: lg_map_key_clear(KEY_MEDIA); break; case 0x1019: lg_map_key_clear(KEY_PROG1); break; case 0x101a: lg_map_key_clear(KEY_PROG2); break; case 0x101b: lg_map_key_clear(KEY_PROG3); break; case 0x101c: lg_map_key_clear(KEY_CYCLEWINDOWS); break; case 0x101f: lg_map_key_clear(KEY_ZOOMIN); break; case 0x1020: lg_map_key_clear(KEY_ZOOMOUT); break; case 0x1021: lg_map_key_clear(KEY_ZOOMRESET); break; case 0x1023: lg_map_key_clear(KEY_CLOSE); break; case 0x1027: lg_map_key_clear(KEY_MENU); break; /* this one is marked as 'Rotate' */ case 0x1028: lg_map_key_clear(KEY_ANGLE); break; case 0x1029: lg_map_key_clear(KEY_SHUFFLE); break; case 0x102a: lg_map_key_clear(KEY_BACK); break; case 0x102b: lg_map_key_clear(KEY_CYCLEWINDOWS); break; case 0x102d: lg_map_key_clear(KEY_WWW); break; /* The following two are 'Start/answer call' and 'End/reject call' on the MX3200 */ case 0x1031: lg_map_key_clear(KEY_OK); break; case 0x1032: lg_map_key_clear(KEY_CANCEL); break; case 0x1041: lg_map_key_clear(KEY_BATTERY); break; case 0x1042: lg_map_key_clear(KEY_WORDPROCESSOR); break; case 0x1043: lg_map_key_clear(KEY_SPREADSHEET); break; case 0x1044: lg_map_key_clear(KEY_PRESENTATION); break; case 0x1045: lg_map_key_clear(KEY_UNDO); break; case 0x1046: lg_map_key_clear(KEY_REDO); break; case 0x1047: lg_map_key_clear(KEY_PRINT); break; case 0x1048: lg_map_key_clear(KEY_SAVE); break; case 0x1049: lg_map_key_clear(KEY_PROG1); break; case 0x104a: lg_map_key_clear(KEY_PROG2); break; case 0x104b: lg_map_key_clear(KEY_PROG3); break; case 0x104c: lg_map_key_clear(KEY_PROG4); break; default: return 0; } return 1; } static int lg_input_mapping(struct hid_device *hdev, struct hid_input *hi, struct hid_field *field, struct hid_usage *usage, unsigned long **bit, int *max) { /* extended mapping for certain Logitech hardware (Logitech cordless desktop LX500) */ static const u8 e_keymap[] = { 0,216, 0,213,175,156, 0, 0, 0, 0, 144, 0, 0, 0, 0, 0, 0, 0, 0,212, 174,167,152,161,112, 0, 0, 0,154, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,183,184,185,186,187, 188,189,190,191,192,193,194, 0, 0, 0 }; struct lg_drv_data *drv_data = hid_get_drvdata(hdev); unsigned int hid = usage->hid; if (hdev->product == USB_DEVICE_ID_LOGITECH_RECEIVER && lg_ultrax_remote_mapping(hi, usage, bit, max)) return 1; if ((drv_data->quirks & LG_WIRELESS) && lg_wireless_mapping(hi, usage, bit, max)) return 1; if ((hid & HID_USAGE_PAGE) != HID_UP_BUTTON) return 0; hid &= HID_USAGE; /* Special handling for Logitech Cordless Desktop */ if (field->application == HID_GD_MOUSE) { if ((drv_data->quirks & LG_IGNORE_DOUBLED_WHEEL) && (hid == 7 || hid == 8)) return -1; } else { if ((drv_data->quirks & LG_EXPANDED_KEYMAP) && hid < ARRAY_SIZE(e_keymap) && e_keymap[hid] != 0) { hid_map_usage(hi, usage, bit, max, EV_KEY, e_keymap[hid]); return 1; } } return 0; } static int lg_input_mapped(struct hid_device *hdev, struct hid_input *hi, struct hid_field *field, struct hid_usage *usage, unsigned long **bit, int *max) { struct lg_drv_data *drv_data = hid_get_drvdata(hdev); if ((drv_data->quirks & LG_BAD_RELATIVE_KEYS) && usage->type == EV_KEY && (field->flags & HID_MAIN_ITEM_RELATIVE)) field->flags &= ~HID_MAIN_ITEM_RELATIVE; if ((drv_data->quirks & LG_DUPLICATE_USAGES) && (usage->type == EV_KEY || usage->type == EV_REL || usage->type == EV_ABS)) clear_bit(usage->code, *bit); /* Ensure that Logitech wheels are not given a default fuzz/flat value */ if (usage->type == EV_ABS && (usage->code == ABS_X || usage->code == ABS_Y || usage->code == ABS_Z || usage->code == ABS_RZ)) { switch (hdev->product) { case USB_DEVICE_ID_LOGITECH_G29_WHEEL: case USB_DEVICE_ID_LOGITECH_WINGMAN_FG: case USB_DEVICE_ID_LOGITECH_WINGMAN_FFG: case USB_DEVICE_ID_LOGITECH_WHEEL: case USB_DEVICE_ID_LOGITECH_MOMO_WHEEL: case USB_DEVICE_ID_LOGITECH_DFP_WHEEL: case USB_DEVICE_ID_LOGITECH_G25_WHEEL: case USB_DEVICE_ID_LOGITECH_DFGT_WHEEL: case USB_DEVICE_ID_LOGITECH_G27_WHEEL: case USB_DEVICE_ID_LOGITECH_WII_WHEEL: case USB_DEVICE_ID_LOGITECH_MOMO_WHEEL2: case USB_DEVICE_ID_LOGITECH_VIBRATION_WHEEL: field->application = HID_GD_MULTIAXIS; break; default: break; } } return 0; } static int lg_event(struct hid_device *hdev, struct hid_field *field, struct hid_usage *usage, __s32 value) { struct lg_drv_data *drv_data = hid_get_drvdata(hdev); if ((drv_data->quirks & LG_INVERT_HWHEEL) && usage->code == REL_HWHEEL) { input_event(field->hidinput->input, usage->type, usage->code, -value); return 1; } if (drv_data->quirks & LG_FF4) { return lg4ff_adjust_input_event(hdev, field, usage, value, drv_data); } return 0; } static int lg_raw_event(struct hid_device *hdev, struct hid_report *report, u8 *rd, int size) { struct lg_drv_data *drv_data = hid_get_drvdata(hdev); if (drv_data->quirks & LG_FF4) return lg4ff_raw_event(hdev, report, rd, size, drv_data); return 0; } static int lg_probe(struct hid_device *hdev, const struct hid_device_id *id) { struct usb_interface *iface; __u8 iface_num; unsigned int connect_mask = HID_CONNECT_DEFAULT; struct lg_drv_data *drv_data; int ret; if (!hid_is_usb(hdev)) return -EINVAL; iface = to_usb_interface(hdev->dev.parent); iface_num = iface->cur_altsetting->desc.bInterfaceNumber; /* G29 only work with the 1st interface */ if ((hdev->product == USB_DEVICE_ID_LOGITECH_G29_WHEEL) && (iface_num != 0)) { dbg_hid("%s: ignoring ifnum %d\n", __func__, iface_num); return -ENODEV; } drv_data = kzalloc(sizeof(struct lg_drv_data), GFP_KERNEL); if (!drv_data) { hid_err(hdev, "Insufficient memory, cannot allocate driver data\n"); return -ENOMEM; } drv_data->quirks = id->driver_data; hid_set_drvdata(hdev, (void *)drv_data); if (drv_data->quirks & LG_NOGET) hdev->quirks |= HID_QUIRK_NOGET; ret = hid_parse(hdev); if (ret) { hid_err(hdev, "parse failed\n"); goto err_free; } if (drv_data->quirks & (LG_FF | LG_FF2 | LG_FF3 | LG_FF4)) connect_mask &= ~HID_CONNECT_FF; ret = hid_hw_start(hdev, connect_mask); if (ret) { hid_err(hdev, "hw start failed\n"); goto err_free; } /* Setup wireless link with Logitech Wii wheel */ if (hdev->product == USB_DEVICE_ID_LOGITECH_WII_WHEEL) { static const unsigned char cbuf[] = { 0x00, 0xAF, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }; u8 *buf = kmemdup(cbuf, sizeof(cbuf), GFP_KERNEL); if (!buf) { ret = -ENOMEM; goto err_stop; } ret = hid_hw_raw_request(hdev, buf[0], buf, sizeof(cbuf), HID_FEATURE_REPORT, HID_REQ_SET_REPORT); if (ret >= 0) { /* insert a little delay of 10 jiffies ~ 40ms */ wait_queue_head_t wait; init_waitqueue_head (&wait); wait_event_interruptible_timeout(wait, 0, msecs_to_jiffies(40)); /* Select random Address */ buf[1] = 0xB2; get_random_bytes(&buf[2], 2); ret = hid_hw_raw_request(hdev, buf[0], buf, sizeof(cbuf), HID_FEATURE_REPORT, HID_REQ_SET_REPORT); } kfree(buf); } if (drv_data->quirks & LG_FF) ret = lgff_init(hdev); else if (drv_data->quirks & LG_FF2) ret = lg2ff_init(hdev); else if (drv_data->quirks & LG_FF3) ret = lg3ff_init(hdev); else if (drv_data->quirks & LG_FF4) ret = lg4ff_init(hdev); if (ret) goto err_stop; return 0; err_stop: hid_hw_stop(hdev); err_free: kfree(drv_data); return ret; } static void lg_remove(struct hid_device *hdev) { struct lg_drv_data *drv_data = hid_get_drvdata(hdev); if (drv_data->quirks & LG_FF4) lg4ff_deinit(hdev); hid_hw_stop(hdev); kfree(drv_data); } static const struct hid_device_id lg_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_S510_RECEIVER), .driver_data = LG_RDESC | LG_WIRELESS }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_RECEIVER), .driver_data = LG_BAD_RELATIVE_KEYS }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_DINOVO_DESKTOP), .driver_data = LG_DUPLICATE_USAGES }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_ELITE_KBD), .driver_data = LG_IGNORE_DOUBLED_WHEEL | LG_EXPANDED_KEYMAP }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_CORDLESS_DESKTOP_LX500), .driver_data = LG_IGNORE_DOUBLED_WHEEL | LG_EXPANDED_KEYMAP }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_EXTREME_3D), .driver_data = LG_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_DUAL_ACTION), .driver_data = LG_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_WHEEL), .driver_data = LG_NOGET | LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_RUMBLEPAD_CORD), .driver_data = LG_FF2 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_RUMBLEPAD), .driver_data = LG_FF }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_RUMBLEPAD2_2), .driver_data = LG_FF }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_G29_WHEEL), .driver_data = LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_WINGMAN_F3D), .driver_data = LG_FF }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_FORCE3D_PRO), .driver_data = LG_FF }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_MOMO_WHEEL), .driver_data = LG_NOGET | LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_MOMO_WHEEL2), .driver_data = LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_VIBRATION_WHEEL), .driver_data = LG_FF2 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_G25_WHEEL), .driver_data = LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_DFGT_WHEEL), .driver_data = LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_G27_WHEEL), .driver_data = LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_DFP_WHEEL), .driver_data = LG_NOGET | LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_WII_WHEEL), .driver_data = LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_WINGMAN_FG), .driver_data = LG_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_WINGMAN_FFG), .driver_data = LG_NOGET | LG_FF4 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_RUMBLEPAD2), .driver_data = LG_NOGET | LG_FF2 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_FLIGHT_SYSTEM_G940), .driver_data = LG_FF3 }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_SPACENAVIGATOR), .driver_data = LG_RDESC_REL_ABS }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_SPACETRAVELLER), .driver_data = LG_RDESC_REL_ABS }, { } }; MODULE_DEVICE_TABLE(hid, lg_devices); static struct hid_driver lg_driver = { .name = "logitech", .id_table = lg_devices, .report_fixup = lg_report_fixup, .input_mapping = lg_input_mapping, .input_mapped = lg_input_mapped, .event = lg_event, .raw_event = lg_raw_event, .probe = lg_probe, .remove = lg_remove, }; module_hid_driver(lg_driver); #ifdef CONFIG_LOGIWHEELS_FF int lg4ff_no_autoswitch = 0; module_param_named(lg4ff_no_autoswitch, lg4ff_no_autoswitch, int, S_IRUGO); MODULE_PARM_DESC(lg4ff_no_autoswitch, "Do not switch multimode wheels to their native mode automatically"); #endif MODULE_LICENSE("GPL"); |
17 1 1 4 1 5 4 2 1 1 2 7 1 2 3 2 3 2 4 1 5 5 5 5 3 5 1 5 2 2 7 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 | // SPDX-License-Identifier: GPL-2.0+ /* net/sched/act_ctinfo.c netfilter ctinfo connmark actions * * Copyright (c) 2019 Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> */ #include <linux/module.h> #include <linux/init.h> #include <linux/kernel.h> #include <linux/skbuff.h> #include <linux/rtnetlink.h> #include <linux/pkt_cls.h> #include <linux/ip.h> #include <linux/ipv6.h> #include <net/netlink.h> #include <net/pkt_sched.h> #include <net/act_api.h> #include <net/pkt_cls.h> #include <uapi/linux/tc_act/tc_ctinfo.h> #include <net/tc_act/tc_ctinfo.h> #include <net/tc_wrapper.h> #include <net/netfilter/nf_conntrack.h> #include <net/netfilter/nf_conntrack_core.h> #include <net/netfilter/nf_conntrack_ecache.h> #include <net/netfilter/nf_conntrack_zones.h> static struct tc_action_ops act_ctinfo_ops; static void tcf_ctinfo_dscp_set(struct nf_conn *ct, struct tcf_ctinfo *ca, struct tcf_ctinfo_params *cp, struct sk_buff *skb, int wlen, int proto) { u8 dscp, newdscp; newdscp = (((READ_ONCE(ct->mark) & cp->dscpmask) >> cp->dscpmaskshift) << 2) & ~INET_ECN_MASK; switch (proto) { case NFPROTO_IPV4: dscp = ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK; if (dscp != newdscp) { if (likely(!skb_try_make_writable(skb, wlen))) { ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, newdscp); ca->stats_dscp_set++; } else { ca->stats_dscp_error++; } } break; case NFPROTO_IPV6: dscp = ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK; if (dscp != newdscp) { if (likely(!skb_try_make_writable(skb, wlen))) { ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, newdscp); ca->stats_dscp_set++; } else { ca->stats_dscp_error++; } } break; default: break; } } static void tcf_ctinfo_cpmark_set(struct nf_conn *ct, struct tcf_ctinfo *ca, struct tcf_ctinfo_params *cp, struct sk_buff *skb) { ca->stats_cpmark_set++; skb->mark = READ_ONCE(ct->mark) & cp->cpmarkmask; } TC_INDIRECT_SCOPE int tcf_ctinfo_act(struct sk_buff *skb, const struct tc_action *a, struct tcf_result *res) { const struct nf_conntrack_tuple_hash *thash = NULL; struct tcf_ctinfo *ca = to_ctinfo(a); struct nf_conntrack_tuple tuple; struct nf_conntrack_zone zone; enum ip_conntrack_info ctinfo; struct tcf_ctinfo_params *cp; struct nf_conn *ct; int proto, wlen; int action; cp = rcu_dereference_bh(ca->params); tcf_lastuse_update(&ca->tcf_tm); tcf_action_update_bstats(&ca->common, skb); action = READ_ONCE(ca->tcf_action); wlen = skb_network_offset(skb); switch (skb_protocol(skb, true)) { case htons(ETH_P_IP): wlen += sizeof(struct iphdr); if (!pskb_may_pull(skb, wlen)) goto out; proto = NFPROTO_IPV4; break; case htons(ETH_P_IPV6): wlen += sizeof(struct ipv6hdr); if (!pskb_may_pull(skb, wlen)) goto out; proto = NFPROTO_IPV6; break; default: goto out; } ct = nf_ct_get(skb, &ctinfo); if (!ct) { /* look harder, usually ingress */ if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), proto, cp->net, &tuple)) goto out; zone.id = cp->zone; zone.dir = NF_CT_DEFAULT_ZONE_DIR; thash = nf_conntrack_find_get(cp->net, &zone, &tuple); if (!thash) goto out; ct = nf_ct_tuplehash_to_ctrack(thash); } if (cp->mode & CTINFO_MODE_DSCP) if (!cp->dscpstatemask || (READ_ONCE(ct->mark) & cp->dscpstatemask)) tcf_ctinfo_dscp_set(ct, ca, cp, skb, wlen, proto); if (cp->mode & CTINFO_MODE_CPMARK) tcf_ctinfo_cpmark_set(ct, ca, cp, skb); if (thash) nf_ct_put(ct); out: return action; } static const struct nla_policy ctinfo_policy[TCA_CTINFO_MAX + 1] = { [TCA_CTINFO_ACT] = NLA_POLICY_EXACT_LEN(sizeof(struct tc_ctinfo)), [TCA_CTINFO_ZONE] = { .type = NLA_U16 }, [TCA_CTINFO_PARMS_DSCP_MASK] = { .type = NLA_U32 }, [TCA_CTINFO_PARMS_DSCP_STATEMASK] = { .type = NLA_U32 }, [TCA_CTINFO_PARMS_CPMARK_MASK] = { .type = NLA_U32 }, }; static int tcf_ctinfo_init(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **a, struct tcf_proto *tp, u32 flags, struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, act_ctinfo_ops.net_id); bool bind = flags & TCA_ACT_FLAGS_BIND; u32 dscpmask = 0, dscpstatemask, index; struct nlattr *tb[TCA_CTINFO_MAX + 1]; struct tcf_ctinfo_params *cp_new; struct tcf_chain *goto_ch = NULL; struct tc_ctinfo *actparm; struct tcf_ctinfo *ci; u8 dscpmaskshift; int ret = 0, err; if (!nla) { NL_SET_ERR_MSG_MOD(extack, "ctinfo requires attributes to be passed"); return -EINVAL; } err = nla_parse_nested(tb, TCA_CTINFO_MAX, nla, ctinfo_policy, extack); if (err < 0) return err; if (!tb[TCA_CTINFO_ACT]) { NL_SET_ERR_MSG_MOD(extack, "Missing required TCA_CTINFO_ACT attribute"); return -EINVAL; } actparm = nla_data(tb[TCA_CTINFO_ACT]); /* do some basic validation here before dynamically allocating things */ /* that we would otherwise have to clean up. */ if (tb[TCA_CTINFO_PARMS_DSCP_MASK]) { dscpmask = nla_get_u32(tb[TCA_CTINFO_PARMS_DSCP_MASK]); /* need contiguous 6 bit mask */ dscpmaskshift = dscpmask ? __ffs(dscpmask) : 0; if ((~0 & (dscpmask >> dscpmaskshift)) != 0x3f) { NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CTINFO_PARMS_DSCP_MASK], "dscp mask must be 6 contiguous bits"); return -EINVAL; } dscpstatemask = tb[TCA_CTINFO_PARMS_DSCP_STATEMASK] ? nla_get_u32(tb[TCA_CTINFO_PARMS_DSCP_STATEMASK]) : 0; /* mask & statemask must not overlap */ if (dscpmask & dscpstatemask) { NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CTINFO_PARMS_DSCP_STATEMASK], "dscp statemask must not overlap dscp mask"); return -EINVAL; } } /* done the validation:now to the actual action allocation */ index = actparm->index; err = tcf_idr_check_alloc(tn, &index, a, bind); if (!err) { ret = tcf_idr_create_from_flags(tn, index, est, a, &act_ctinfo_ops, bind, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret; } ret = ACT_P_CREATED; } else if (err > 0) { if (bind) /* don't override defaults */ return ACT_P_BOUND; if (!(flags & TCA_ACT_FLAGS_REPLACE)) { tcf_idr_release(*a, bind); return -EEXIST; } } else { return err; } err = tcf_action_check_ctrlact(actparm->action, tp, &goto_ch, extack); if (err < 0) goto release_idr; ci = to_ctinfo(*a); cp_new = kzalloc(sizeof(*cp_new), GFP_KERNEL); if (unlikely(!cp_new)) { err = -ENOMEM; goto put_chain; } cp_new->net = net; cp_new->zone = tb[TCA_CTINFO_ZONE] ? nla_get_u16(tb[TCA_CTINFO_ZONE]) : 0; if (dscpmask) { cp_new->dscpmask = dscpmask; cp_new->dscpmaskshift = dscpmaskshift; cp_new->dscpstatemask = dscpstatemask; cp_new->mode |= CTINFO_MODE_DSCP; } if (tb[TCA_CTINFO_PARMS_CPMARK_MASK]) { cp_new->cpmarkmask = nla_get_u32(tb[TCA_CTINFO_PARMS_CPMARK_MASK]); cp_new->mode |= CTINFO_MODE_CPMARK; } spin_lock_bh(&ci->tcf_lock); goto_ch = tcf_action_set_ctrlact(*a, actparm->action, goto_ch); cp_new = rcu_replace_pointer(ci->params, cp_new, lockdep_is_held(&ci->tcf_lock)); spin_unlock_bh(&ci->tcf_lock); if (goto_ch) tcf_chain_put_by_act(goto_ch); if (cp_new) kfree_rcu(cp_new, rcu); return ret; put_chain: if (goto_ch) tcf_chain_put_by_act(goto_ch); release_idr: tcf_idr_release(*a, bind); return err; } static int tcf_ctinfo_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { struct tcf_ctinfo *ci = to_ctinfo(a); struct tc_ctinfo opt = { .index = ci->tcf_index, .refcnt = refcount_read(&ci->tcf_refcnt) - ref, .bindcnt = atomic_read(&ci->tcf_bindcnt) - bind, }; unsigned char *b = skb_tail_pointer(skb); struct tcf_ctinfo_params *cp; struct tcf_t t; spin_lock_bh(&ci->tcf_lock); cp = rcu_dereference_protected(ci->params, lockdep_is_held(&ci->tcf_lock)); tcf_tm_dump(&t, &ci->tcf_tm); if (nla_put_64bit(skb, TCA_CTINFO_TM, sizeof(t), &t, TCA_CTINFO_PAD)) goto nla_put_failure; opt.action = ci->tcf_action; if (nla_put(skb, TCA_CTINFO_ACT, sizeof(opt), &opt)) goto nla_put_failure; if (nla_put_u16(skb, TCA_CTINFO_ZONE, cp->zone)) goto nla_put_failure; if (cp->mode & CTINFO_MODE_DSCP) { if (nla_put_u32(skb, TCA_CTINFO_PARMS_DSCP_MASK, cp->dscpmask)) goto nla_put_failure; if (nla_put_u32(skb, TCA_CTINFO_PARMS_DSCP_STATEMASK, cp->dscpstatemask)) goto nla_put_failure; } if (cp->mode & CTINFO_MODE_CPMARK) { if (nla_put_u32(skb, TCA_CTINFO_PARMS_CPMARK_MASK, cp->cpmarkmask)) goto nla_put_failure; } if (nla_put_u64_64bit(skb, TCA_CTINFO_STATS_DSCP_SET, ci->stats_dscp_set, TCA_CTINFO_PAD)) goto nla_put_failure; if (nla_put_u64_64bit(skb, TCA_CTINFO_STATS_DSCP_ERROR, ci->stats_dscp_error, TCA_CTINFO_PAD)) goto nla_put_failure; if (nla_put_u64_64bit(skb, TCA_CTINFO_STATS_CPMARK_SET, ci->stats_cpmark_set, TCA_CTINFO_PAD)) goto nla_put_failure; spin_unlock_bh(&ci->tcf_lock); return skb->len; nla_put_failure: spin_unlock_bh(&ci->tcf_lock); nlmsg_trim(skb, b); return -1; } static void tcf_ctinfo_cleanup(struct tc_action *a) { struct tcf_ctinfo *ci = to_ctinfo(a); struct tcf_ctinfo_params *cp; cp = rcu_dereference_protected(ci->params, 1); if (cp) kfree_rcu(cp, rcu); } static struct tc_action_ops act_ctinfo_ops = { .kind = "ctinfo", .id = TCA_ID_CTINFO, .owner = THIS_MODULE, .act = tcf_ctinfo_act, .dump = tcf_ctinfo_dump, .init = tcf_ctinfo_init, .cleanup= tcf_ctinfo_cleanup, .size = sizeof(struct tcf_ctinfo), }; MODULE_ALIAS_NET_ACT("ctinfo"); static __net_init int ctinfo_init_net(struct net *net) { struct tc_action_net *tn = net_generic(net, act_ctinfo_ops.net_id); return tc_action_net_init(net, tn, &act_ctinfo_ops); } static void __net_exit ctinfo_exit_net(struct list_head *net_list) { tc_action_net_exit(net_list, act_ctinfo_ops.net_id); } static struct pernet_operations ctinfo_net_ops = { .init = ctinfo_init_net, .exit_batch = ctinfo_exit_net, .id = &act_ctinfo_ops.net_id, .size = sizeof(struct tc_action_net), }; static int __init ctinfo_init_module(void) { return tcf_register_action(&act_ctinfo_ops, &ctinfo_net_ops); } static void __exit ctinfo_cleanup_module(void) { tcf_unregister_action(&act_ctinfo_ops, &ctinfo_net_ops); } module_init(ctinfo_init_module); module_exit(ctinfo_cleanup_module); MODULE_AUTHOR("Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>"); MODULE_DESCRIPTION("Connection tracking mark actions"); MODULE_LICENSE("GPL"); |
73 16 175 173 88 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Spanning tree protocol; timer-related code * Linux ethernet bridge * * Authors: * Lennert Buytenhek <buytenh@gnu.org> */ #include <linux/kernel.h> #include <linux/times.h> #include "br_private.h" #include "br_private_stp.h" /* called under bridge lock */ static int br_is_designated_for_some_port(const struct net_bridge *br) { struct net_bridge_port *p; list_for_each_entry(p, &br->port_list, list) { if (p->state != BR_STATE_DISABLED && !memcmp(&p->designated_bridge, &br->bridge_id, 8)) return 1; } return 0; } static void br_hello_timer_expired(struct timer_list *t) { struct net_bridge *br = from_timer(br, t, hello_timer); br_debug(br, "hello timer expired\n"); spin_lock(&br->lock); if (br->dev->flags & IFF_UP) { br_config_bpdu_generation(br); if (br->stp_enabled == BR_KERNEL_STP) mod_timer(&br->hello_timer, round_jiffies(jiffies + br->hello_time)); } spin_unlock(&br->lock); } static void br_message_age_timer_expired(struct timer_list *t) { struct net_bridge_port *p = from_timer(p, t, message_age_timer); struct net_bridge *br = p->br; const bridge_id *id = &p->designated_bridge; int was_root; if (p->state == BR_STATE_DISABLED) return; br_info(br, "port %u(%s) neighbor %.2x%.2x.%pM lost\n", (unsigned int) p->port_no, p->dev->name, id->prio[0], id->prio[1], &id->addr); /* * According to the spec, the message age timer cannot be * running when we are the root bridge. So.. this was_root * check is redundant. I'm leaving it in for now, though. */ spin_lock(&br->lock); if (p->state == BR_STATE_DISABLED) goto unlock; was_root = br_is_root_bridge(br); br_become_designated_port(p); br_configuration_update(br); br_port_state_selection(br); if (br_is_root_bridge(br) && !was_root) br_become_root_bridge(br); unlock: spin_unlock(&br->lock); } static void br_forward_delay_timer_expired(struct timer_list *t) { struct net_bridge_port *p = from_timer(p, t, forward_delay_timer); struct net_bridge *br = p->br; br_debug(br, "port %u(%s) forward delay timer\n", (unsigned int) p->port_no, p->dev->name); spin_lock(&br->lock); if (p->state == BR_STATE_LISTENING) { br_set_state(p, BR_STATE_LEARNING); mod_timer(&p->forward_delay_timer, jiffies + br->forward_delay); } else if (p->state == BR_STATE_LEARNING) { br_set_state(p, BR_STATE_FORWARDING); if (br_is_designated_for_some_port(br)) br_topology_change_detection(br); netif_carrier_on(br->dev); } rcu_read_lock(); br_ifinfo_notify(RTM_NEWLINK, NULL, p); rcu_read_unlock(); spin_unlock(&br->lock); } static void br_tcn_timer_expired(struct timer_list *t) { struct net_bridge *br = from_timer(br, t, tcn_timer); br_debug(br, "tcn timer expired\n"); spin_lock(&br->lock); if (!br_is_root_bridge(br) && (br->dev->flags & IFF_UP)) { br_transmit_tcn(br); mod_timer(&br->tcn_timer, jiffies + br->bridge_hello_time); } spin_unlock(&br->lock); } static void br_topology_change_timer_expired(struct timer_list *t) { struct net_bridge *br = from_timer(br, t, topology_change_timer); br_debug(br, "topo change timer expired\n"); spin_lock(&br->lock); br->topology_change_detected = 0; __br_set_topology_change(br, 0); spin_unlock(&br->lock); } static void br_hold_timer_expired(struct timer_list *t) { struct net_bridge_port *p = from_timer(p, t, hold_timer); br_debug(p->br, "port %u(%s) hold timer expired\n", (unsigned int) p->port_no, p->dev->name); spin_lock(&p->br->lock); if (p->config_pending) br_transmit_config(p); spin_unlock(&p->br->lock); } void br_stp_timer_init(struct net_bridge *br) { timer_setup(&br->hello_timer, br_hello_timer_expired, 0); timer_setup(&br->tcn_timer, br_tcn_timer_expired, 0); timer_setup(&br->topology_change_timer, br_topology_change_timer_expired, 0); } void br_stp_port_timer_init(struct net_bridge_port *p) { timer_setup(&p->message_age_timer, br_message_age_timer_expired, 0); timer_setup(&p->forward_delay_timer, br_forward_delay_timer_expired, 0); timer_setup(&p->hold_timer, br_hold_timer_expired, 0); } /* Report ticks left (in USER_HZ) used for API */ unsigned long br_timer_value(const struct timer_list *timer) { return timer_pending(timer) ? jiffies_delta_to_clock_t(timer->expires - jiffies) : 0; } |
1 1 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 | // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (c) 2000-2001 Christoph Hellwig. * Copyright (c) 2016 Krzysztof Blaszkowski */ /* * Veritas filesystem driver - superblock related routines. */ #include <linux/init.h> #include <linux/module.h> #include <linux/blkdev.h> #include <linux/fs.h> #include <linux/buffer_head.h> #include <linux/kernel.h> #include <linux/slab.h> #include <linux/stat.h> #include <linux/vfs.h> #include <linux/mount.h> #include "vxfs.h" #include "vxfs_extern.h" #include "vxfs_dir.h" #include "vxfs_inode.h" MODULE_AUTHOR("Christoph Hellwig, Krzysztof Blaszkowski"); MODULE_DESCRIPTION("Veritas Filesystem (VxFS) driver"); MODULE_LICENSE("Dual BSD/GPL"); static struct kmem_cache *vxfs_inode_cachep; /** * vxfs_put_super - free superblock resources * @sbp: VFS superblock. * * Description: * vxfs_put_super frees all resources allocated for @sbp * after the last instance of the filesystem is unmounted. */ static void vxfs_put_super(struct super_block *sbp) { struct vxfs_sb_info *infp = VXFS_SBI(sbp); iput(infp->vsi_fship); iput(infp->vsi_ilist); iput(infp->vsi_stilist); brelse(infp->vsi_bp); kfree(infp); } /** * vxfs_statfs - get filesystem information * @dentry: VFS dentry to locate superblock * @bufp: output buffer * * Description: * vxfs_statfs fills the statfs buffer @bufp with information * about the filesystem described by @dentry. * * Returns: * Zero. * * Locking: * No locks held. * * Notes: * This is everything but complete... */ static int vxfs_statfs(struct dentry *dentry, struct kstatfs *bufp) { struct vxfs_sb_info *infp = VXFS_SBI(dentry->d_sb); struct vxfs_sb *raw_sb = infp->vsi_raw; u64 id = huge_encode_dev(dentry->d_sb->s_bdev->bd_dev); bufp->f_type = VXFS_SUPER_MAGIC; bufp->f_bsize = dentry->d_sb->s_blocksize; bufp->f_blocks = fs32_to_cpu(infp, raw_sb->vs_dsize); bufp->f_bfree = fs32_to_cpu(infp, raw_sb->vs_free); bufp->f_bavail = 0; bufp->f_files = 0; bufp->f_ffree = fs32_to_cpu(infp, raw_sb->vs_ifree); bufp->f_fsid = u64_to_fsid(id); bufp->f_namelen = VXFS_NAMELEN; return 0; } static int vxfs_remount(struct super_block *sb, int *flags, char *data) { sync_filesystem(sb); *flags |= SB_RDONLY; return 0; } static struct inode *vxfs_alloc_inode(struct super_block *sb) { struct vxfs_inode_info *vi; vi = alloc_inode_sb(sb, vxfs_inode_cachep, GFP_KERNEL); if (!vi) return NULL; inode_init_once(&vi->vfs_inode); return &vi->vfs_inode; } static void vxfs_free_inode(struct inode *inode) { kmem_cache_free(vxfs_inode_cachep, VXFS_INO(inode)); } static const struct super_operations vxfs_super_ops = { .alloc_inode = vxfs_alloc_inode, .free_inode = vxfs_free_inode, .evict_inode = vxfs_evict_inode, .put_super = vxfs_put_super, .statfs = vxfs_statfs, .remount_fs = vxfs_remount, }; static int vxfs_try_sb_magic(struct super_block *sbp, int silent, unsigned blk, __fs32 magic) { struct buffer_head *bp; struct vxfs_sb *rsbp; struct vxfs_sb_info *infp = VXFS_SBI(sbp); int rc = -ENOMEM; bp = sb_bread(sbp, blk); do { if (!bp || !buffer_mapped(bp)) { if (!silent) { printk(KERN_WARNING "vxfs: unable to read disk superblock at %u\n", blk); } break; } rc = -EINVAL; rsbp = (struct vxfs_sb *)bp->b_data; if (rsbp->vs_magic != magic) { if (!silent) printk(KERN_NOTICE "vxfs: WRONG superblock magic %08x at %u\n", rsbp->vs_magic, blk); break; } rc = 0; infp->vsi_raw = rsbp; infp->vsi_bp = bp; } while (0); if (rc) { infp->vsi_raw = NULL; infp->vsi_bp = NULL; brelse(bp); } return rc; } /** * vxfs_fill_super - read superblock into memory and initialize filesystem * @sbp: VFS superblock (to fill) * @dp: fs private mount data * @silent: do not complain loudly when sth is wrong * * Description: * We are called on the first mount of a filesystem to read the * superblock into memory and do some basic setup. * * Returns: * The superblock on success, else %NULL. * * Locking: * We are under @sbp->s_lock. */ static int vxfs_fill_super(struct super_block *sbp, void *dp, int silent) { struct vxfs_sb_info *infp; struct vxfs_sb *rsbp; u_long bsize; struct inode *root; int ret = -EINVAL; u32 j; sbp->s_flags |= SB_RDONLY; infp = kzalloc(sizeof(*infp), GFP_KERNEL); if (!infp) { printk(KERN_WARNING "vxfs: unable to allocate incore superblock\n"); return -ENOMEM; } bsize = sb_min_blocksize(sbp, BLOCK_SIZE); if (!bsize) { printk(KERN_WARNING "vxfs: unable to set blocksize\n"); goto out; } sbp->s_op = &vxfs_super_ops; sbp->s_fs_info = infp; sbp->s_time_min = 0; sbp->s_time_max = U32_MAX; if (!vxfs_try_sb_magic(sbp, silent, 1, (__force __fs32)cpu_to_le32(VXFS_SUPER_MAGIC))) { /* Unixware, x86 */ infp->byte_order = VXFS_BO_LE; } else if (!vxfs_try_sb_magic(sbp, silent, 8, (__force __fs32)cpu_to_be32(VXFS_SUPER_MAGIC))) { /* HP-UX, parisc */ infp->byte_order = VXFS_BO_BE; } else { if (!silent) printk(KERN_NOTICE "vxfs: can't find superblock.\n"); goto out; } rsbp = infp->vsi_raw; j = fs32_to_cpu(infp, rsbp->vs_version); if ((j < 2 || j > 4) && !silent) { printk(KERN_NOTICE "vxfs: unsupported VxFS version (%d)\n", j); goto out; } #ifdef DIAGNOSTIC printk(KERN_DEBUG "vxfs: supported VxFS version (%d)\n", j); printk(KERN_DEBUG "vxfs: blocksize: %d\n", fs32_to_cpu(infp, rsbp->vs_bsize)); #endif sbp->s_magic = fs32_to_cpu(infp, rsbp->vs_magic); infp->vsi_oltext = fs32_to_cpu(infp, rsbp->vs_oltext[0]); infp->vsi_oltsize = fs32_to_cpu(infp, rsbp->vs_oltsize); j = fs32_to_cpu(infp, rsbp->vs_bsize); if (!sb_set_blocksize(sbp, j)) { printk(KERN_WARNING "vxfs: unable to set final block size\n"); goto out; } if (vxfs_read_olt(sbp, bsize)) { printk(KERN_WARNING "vxfs: unable to read olt\n"); goto out; } if (vxfs_read_fshead(sbp)) { printk(KERN_WARNING "vxfs: unable to read fshead\n"); goto out; } root = vxfs_iget(sbp, VXFS_ROOT_INO); if (IS_ERR(root)) { ret = PTR_ERR(root); goto out; } sbp->s_root = d_make_root(root); if (!sbp->s_root) { printk(KERN_WARNING "vxfs: unable to get root dentry.\n"); goto out_free_ilist; } return 0; out_free_ilist: iput(infp->vsi_fship); iput(infp->vsi_ilist); iput(infp->vsi_stilist); out: brelse(infp->vsi_bp); kfree(infp); return ret; } /* * The usual module blurb. */ static struct dentry *vxfs_mount(struct file_system_type *fs_type, int flags, const char *dev_name, void *data) { return mount_bdev(fs_type, flags, dev_name, data, vxfs_fill_super); } static struct file_system_type vxfs_fs_type = { .owner = THIS_MODULE, .name = "vxfs", .mount = vxfs_mount, .kill_sb = kill_block_super, .fs_flags = FS_REQUIRES_DEV, }; MODULE_ALIAS_FS("vxfs"); /* makes mount -t vxfs autoload the module */ MODULE_ALIAS("vxfs"); static int __init vxfs_init(void) { int rv; vxfs_inode_cachep = kmem_cache_create_usercopy("vxfs_inode", sizeof(struct vxfs_inode_info), 0, SLAB_RECLAIM_ACCOUNT, offsetof(struct vxfs_inode_info, vii_immed.vi_immed), sizeof_field(struct vxfs_inode_info, vii_immed.vi_immed), NULL); if (!vxfs_inode_cachep) return -ENOMEM; rv = register_filesystem(&vxfs_fs_type); if (rv < 0) kmem_cache_destroy(vxfs_inode_cachep); return rv; } static void __exit vxfs_cleanup(void) { unregister_filesystem(&vxfs_fs_type); /* * Make sure all delayed rcu free inodes are flushed before we * destroy cache. */ rcu_barrier(); kmem_cache_destroy(vxfs_inode_cachep); } module_init(vxfs_init); module_exit(vxfs_cleanup); |
33 11 24 24 2 23 1 1 1 4 4 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 | /* vcan.c - Virtual CAN interface * * Copyright (c) 2002-2017 Volkswagen Group Electronic Research * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of Volkswagen nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * Alternatively, provided that this notice is retained in full, this * software may be distributed under the terms of the GNU General * Public License ("GPL") version 2, in which case the provisions of the * GPL apply INSTEAD OF those given above. * * The provided data structures and external interfaces from this code * are not restricted to be used by modules with a GPL compatible license. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH * DAMAGE. * */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/ethtool.h> #include <linux/module.h> #include <linux/init.h> #include <linux/netdevice.h> #include <linux/if_arp.h> #include <linux/if_ether.h> #include <linux/can.h> #include <linux/can/can-ml.h> #include <linux/can/dev.h> #include <linux/can/skb.h> #include <linux/slab.h> #include <net/rtnetlink.h> #define DRV_NAME "vcan" MODULE_DESCRIPTION("virtual CAN interface"); MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Urs Thuermann <urs.thuermann@volkswagen.de>"); MODULE_ALIAS_RTNL_LINK(DRV_NAME); /* CAN test feature: * Enable the echo on driver level for testing the CAN core echo modes. * See Documentation/networking/can.rst for details. */ static bool echo; /* echo testing. Default: 0 (Off) */ module_param(echo, bool, 0444); MODULE_PARM_DESC(echo, "Echo sent frames (for testing). Default: 0 (Off)"); static void vcan_rx(struct sk_buff *skb, struct net_device *dev) { struct net_device_stats *stats = &dev->stats; stats->rx_packets++; stats->rx_bytes += can_skb_get_data_len(skb); skb->pkt_type = PACKET_BROADCAST; skb->dev = dev; skb->ip_summed = CHECKSUM_UNNECESSARY; netif_rx(skb); } static netdev_tx_t vcan_tx(struct sk_buff *skb, struct net_device *dev) { struct net_device_stats *stats = &dev->stats; unsigned int len; int loop; if (can_dropped_invalid_skb(dev, skb)) return NETDEV_TX_OK; len = can_skb_get_data_len(skb); stats->tx_packets++; stats->tx_bytes += len; /* set flag whether this packet has to be looped back */ loop = skb->pkt_type == PACKET_LOOPBACK; skb_tx_timestamp(skb); if (!echo) { /* no echo handling available inside this driver */ if (loop) { /* only count the packets here, because the * CAN core already did the echo for us */ stats->rx_packets++; stats->rx_bytes += len; } consume_skb(skb); return NETDEV_TX_OK; } /* perform standard echo handling for CAN network interfaces */ if (loop) { skb = can_create_echo_skb(skb); if (!skb) return NETDEV_TX_OK; /* receive with packet counting */ vcan_rx(skb, dev); } else { /* no looped packets => no counting */ consume_skb(skb); } return NETDEV_TX_OK; } static int vcan_change_mtu(struct net_device *dev, int new_mtu) { /* Do not allow changing the MTU while running */ if (dev->flags & IFF_UP) return -EBUSY; if (new_mtu != CAN_MTU && new_mtu != CANFD_MTU && !can_is_canxl_dev_mtu(new_mtu)) return -EINVAL; dev->mtu = new_mtu; return 0; } static const struct net_device_ops vcan_netdev_ops = { .ndo_start_xmit = vcan_tx, .ndo_change_mtu = vcan_change_mtu, }; static const struct ethtool_ops vcan_ethtool_ops = { .get_ts_info = ethtool_op_get_ts_info, }; static void vcan_setup(struct net_device *dev) { dev->type = ARPHRD_CAN; dev->mtu = CANFD_MTU; dev->hard_header_len = 0; dev->addr_len = 0; dev->tx_queue_len = 0; dev->flags = IFF_NOARP; can_set_ml_priv(dev, netdev_priv(dev)); /* set flags according to driver capabilities */ if (echo) dev->flags |= IFF_ECHO; dev->netdev_ops = &vcan_netdev_ops; dev->ethtool_ops = &vcan_ethtool_ops; dev->needs_free_netdev = true; } static struct rtnl_link_ops vcan_link_ops __read_mostly = { .kind = DRV_NAME, .priv_size = sizeof(struct can_ml_priv), .setup = vcan_setup, }; static __init int vcan_init_module(void) { pr_info("Virtual CAN interface driver\n"); if (echo) pr_info("enabled echo on driver level.\n"); return rtnl_link_register(&vcan_link_ops); } static __exit void vcan_cleanup_module(void) { rtnl_link_unregister(&vcan_link_ops); } module_init(vcan_init_module); module_exit(vcan_cleanup_module); |
12 14 13 4 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 | // SPDX-License-Identifier: GPL-2.0-only /* * Scalar fixed time AES core transform * * Copyright (C) 2017 Linaro Ltd <ard.biesheuvel@linaro.org> */ #include <crypto/aes.h> #include <crypto/algapi.h> #include <linux/module.h> static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key, unsigned int key_len) { struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); return aes_expandkey(ctx, in_key, key_len); } static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in) { const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); unsigned long flags; /* * Temporarily disable interrupts to avoid races where cachelines are * evicted when the CPU is interrupted to do something else. */ local_irq_save(flags); aes_encrypt(ctx, out, in); local_irq_restore(flags); } static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in) { const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); unsigned long flags; /* * Temporarily disable interrupts to avoid races where cachelines are * evicted when the CPU is interrupted to do something else. */ local_irq_save(flags); aes_decrypt(ctx, out, in); local_irq_restore(flags); } static struct crypto_alg aes_alg = { .cra_name = "aes", .cra_driver_name = "aes-fixed-time", .cra_priority = 100 + 1, .cra_flags = CRYPTO_ALG_TYPE_CIPHER, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), .cra_module = THIS_MODULE, .cra_cipher.cia_min_keysize = AES_MIN_KEY_SIZE, .cra_cipher.cia_max_keysize = AES_MAX_KEY_SIZE, .cra_cipher.cia_setkey = aesti_set_key, .cra_cipher.cia_encrypt = aesti_encrypt, .cra_cipher.cia_decrypt = aesti_decrypt }; static int __init aes_init(void) { return crypto_register_alg(&aes_alg); } static void __exit aes_fini(void) { crypto_unregister_alg(&aes_alg); } module_init(aes_init); module_exit(aes_fini); MODULE_DESCRIPTION("Generic fixed time AES"); MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>"); MODULE_LICENSE("GPL v2"); |
80 16 2 2 3 3 3 1 3 3 3 4 2 5 3 2 1 4 4 2 3 3 2 3 2 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _BPF_CGROUP_H #define _BPF_CGROUP_H #include <linux/bpf.h> #include <linux/bpf-cgroup-defs.h> #include <linux/errno.h> #include <linux/jump_label.h> #include <linux/percpu.h> #include <linux/rbtree.h> #include <net/sock.h> #include <uapi/linux/bpf.h> struct sock; struct sockaddr; struct cgroup; struct sk_buff; struct bpf_map; struct bpf_prog; struct bpf_sock_ops_kern; struct bpf_cgroup_storage; struct ctl_table; struct ctl_table_header; struct task_struct; unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx, const struct bpf_insn *insn); unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx, const struct bpf_insn *insn); unsigned int __cgroup_bpf_run_lsm_current(const void *ctx, const struct bpf_insn *insn); #ifdef CONFIG_CGROUP_BPF #define CGROUP_ATYPE(type) \ case BPF_##type: return type static inline enum cgroup_bpf_attach_type to_cgroup_bpf_attach_type(enum bpf_attach_type attach_type) { switch (attach_type) { CGROUP_ATYPE(CGROUP_INET_INGRESS); CGROUP_ATYPE(CGROUP_INET_EGRESS); CGROUP_ATYPE(CGROUP_INET_SOCK_CREATE); CGROUP_ATYPE(CGROUP_SOCK_OPS); CGROUP_ATYPE(CGROUP_DEVICE); CGROUP_ATYPE(CGROUP_INET4_BIND); CGROUP_ATYPE(CGROUP_INET6_BIND); CGROUP_ATYPE(CGROUP_INET4_CONNECT); CGROUP_ATYPE(CGROUP_INET6_CONNECT); CGROUP_ATYPE(CGROUP_UNIX_CONNECT); CGROUP_ATYPE(CGROUP_INET4_POST_BIND); CGROUP_ATYPE(CGROUP_INET6_POST_BIND); CGROUP_ATYPE(CGROUP_UDP4_SENDMSG); CGROUP_ATYPE(CGROUP_UDP6_SENDMSG); CGROUP_ATYPE(CGROUP_UNIX_SENDMSG); CGROUP_ATYPE(CGROUP_SYSCTL); CGROUP_ATYPE(CGROUP_UDP4_RECVMSG); CGROUP_ATYPE(CGROUP_UDP6_RECVMSG); CGROUP_ATYPE(CGROUP_UNIX_RECVMSG); CGROUP_ATYPE(CGROUP_GETSOCKOPT); CGROUP_ATYPE(CGROUP_SETSOCKOPT); CGROUP_ATYPE(CGROUP_INET4_GETPEERNAME); CGROUP_ATYPE(CGROUP_INET6_GETPEERNAME); CGROUP_ATYPE(CGROUP_UNIX_GETPEERNAME); CGROUP_ATYPE(CGROUP_INET4_GETSOCKNAME); CGROUP_ATYPE(CGROUP_INET6_GETSOCKNAME); CGROUP_ATYPE(CGROUP_UNIX_GETSOCKNAME); CGROUP_ATYPE(CGROUP_INET_SOCK_RELEASE); default: return CGROUP_BPF_ATTACH_TYPE_INVALID; } } #undef CGROUP_ATYPE extern struct static_key_false cgroup_bpf_enabled_key[MAX_CGROUP_BPF_ATTACH_TYPE]; #define cgroup_bpf_enabled(atype) static_branch_unlikely(&cgroup_bpf_enabled_key[atype]) #define for_each_cgroup_storage_type(stype) \ for (stype = 0; stype < MAX_BPF_CGROUP_STORAGE_TYPE; stype++) struct bpf_cgroup_storage_map; struct bpf_storage_buffer { struct rcu_head rcu; char data[]; }; struct bpf_cgroup_storage { union { struct bpf_storage_buffer *buf; void __percpu *percpu_buf; }; struct bpf_cgroup_storage_map *map; struct bpf_cgroup_storage_key key; struct list_head list_map; struct list_head list_cg; struct rb_node node; struct rcu_head rcu; }; struct bpf_cgroup_link { struct bpf_link link; struct cgroup *cgroup; enum bpf_attach_type type; }; struct bpf_prog_list { struct hlist_node node; struct bpf_prog *prog; struct bpf_cgroup_link *link; struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE]; }; int cgroup_bpf_inherit(struct cgroup *cgrp); void cgroup_bpf_offline(struct cgroup *cgrp); int __cgroup_bpf_run_filter_skb(struct sock *sk, struct sk_buff *skb, enum cgroup_bpf_attach_type atype); int __cgroup_bpf_run_filter_sk(struct sock *sk, enum cgroup_bpf_attach_type atype); int __cgroup_bpf_run_filter_sock_addr(struct sock *sk, struct sockaddr *uaddr, int *uaddrlen, enum cgroup_bpf_attach_type atype, void *t_ctx, u32 *flags); int __cgroup_bpf_run_filter_sock_ops(struct sock *sk, struct bpf_sock_ops_kern *sock_ops, enum cgroup_bpf_attach_type atype); int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor, short access, enum cgroup_bpf_attach_type atype); int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head, struct ctl_table *table, int write, char **buf, size_t *pcount, loff_t *ppos, enum cgroup_bpf_attach_type atype); int __cgroup_bpf_run_filter_setsockopt(struct sock *sock, int *level, int *optname, sockptr_t optval, int *optlen, char **kernel_optval); int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level, int optname, sockptr_t optval, sockptr_t optlen, int max_optlen, int retval); int __cgroup_bpf_run_filter_getsockopt_kern(struct sock *sk, int level, int optname, void *optval, int *optlen, int retval); static inline enum bpf_cgroup_storage_type cgroup_storage_type( struct bpf_map *map) { if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) return BPF_CGROUP_STORAGE_PERCPU; return BPF_CGROUP_STORAGE_SHARED; } struct bpf_cgroup_storage * cgroup_storage_lookup(struct bpf_cgroup_storage_map *map, void *key, bool locked); struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog, enum bpf_cgroup_storage_type stype); void bpf_cgroup_storage_free(struct bpf_cgroup_storage *storage); void bpf_cgroup_storage_link(struct bpf_cgroup_storage *storage, struct cgroup *cgroup, enum bpf_attach_type type); void bpf_cgroup_storage_unlink(struct bpf_cgroup_storage *storage); int bpf_cgroup_storage_assign(struct bpf_prog_aux *aux, struct bpf_map *map); int bpf_percpu_cgroup_storage_copy(struct bpf_map *map, void *key, void *value); int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key, void *value, u64 flags); /* Opportunistic check to see whether we have any BPF program attached*/ static inline bool cgroup_bpf_sock_enabled(struct sock *sk, enum cgroup_bpf_attach_type type) { struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data); struct bpf_prog_array *array; array = rcu_access_pointer(cgrp->bpf.effective[type]); return array != &bpf_empty_prog_array.hdr; } /* Wrappers for __cgroup_bpf_run_filter_skb() guarded by cgroup_bpf_enabled. */ #define BPF_CGROUP_RUN_PROG_INET_INGRESS(sk, skb) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_INET_INGRESS) && \ cgroup_bpf_sock_enabled(sk, CGROUP_INET_INGRESS) && sk && \ sk_fullsock(sk)) \ __ret = __cgroup_bpf_run_filter_skb(sk, skb, \ CGROUP_INET_INGRESS); \ \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_INET_EGRESS(sk, skb) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_INET_EGRESS) && sk) { \ typeof(sk) __sk = sk_to_full_sk(sk); \ if (sk_fullsock(__sk) && __sk == skb_to_full_sk(skb) && \ cgroup_bpf_sock_enabled(__sk, CGROUP_INET_EGRESS)) \ __ret = __cgroup_bpf_run_filter_skb(__sk, skb, \ CGROUP_INET_EGRESS); \ } \ __ret; \ }) #define BPF_CGROUP_RUN_SK_PROG(sk, atype) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(atype)) { \ __ret = __cgroup_bpf_run_filter_sk(sk, atype); \ } \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_INET_SOCK(sk) \ BPF_CGROUP_RUN_SK_PROG(sk, CGROUP_INET_SOCK_CREATE) #define BPF_CGROUP_RUN_PROG_INET_SOCK_RELEASE(sk) \ BPF_CGROUP_RUN_SK_PROG(sk, CGROUP_INET_SOCK_RELEASE) #define BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk) \ BPF_CGROUP_RUN_SK_PROG(sk, CGROUP_INET4_POST_BIND) #define BPF_CGROUP_RUN_PROG_INET6_POST_BIND(sk) \ BPF_CGROUP_RUN_SK_PROG(sk, CGROUP_INET6_POST_BIND) #define BPF_CGROUP_RUN_SA_PROG(sk, uaddr, uaddrlen, atype) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(atype)) \ __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, uaddrlen, \ atype, NULL, NULL); \ __ret; \ }) #define BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, atype, t_ctx) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(atype)) { \ lock_sock(sk); \ __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, uaddrlen, \ atype, t_ctx, NULL); \ release_sock(sk); \ } \ __ret; \ }) /* BPF_CGROUP_INET4_BIND and BPF_CGROUP_INET6_BIND can return extra flags * via upper bits of return code. The only flag that is supported * (at bit position 0) is to indicate CAP_NET_BIND_SERVICE capability check * should be bypassed (BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE). */ #define BPF_CGROUP_RUN_PROG_INET_BIND_LOCK(sk, uaddr, uaddrlen, atype, bind_flags) \ ({ \ u32 __flags = 0; \ int __ret = 0; \ if (cgroup_bpf_enabled(atype)) { \ lock_sock(sk); \ __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, uaddrlen, \ atype, NULL, &__flags); \ release_sock(sk); \ if (__flags & BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE) \ *bind_flags |= BIND_NO_CAP_NET_BIND_SERVICE; \ } \ __ret; \ }) #define BPF_CGROUP_PRE_CONNECT_ENABLED(sk) \ ((cgroup_bpf_enabled(CGROUP_INET4_CONNECT) || \ cgroup_bpf_enabled(CGROUP_INET6_CONNECT)) && \ (sk)->sk_prot->pre_connect) #define BPF_CGROUP_RUN_PROG_INET4_CONNECT(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG(sk, uaddr, uaddrlen, CGROUP_INET4_CONNECT) #define BPF_CGROUP_RUN_PROG_INET6_CONNECT(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG(sk, uaddr, uaddrlen, CGROUP_INET6_CONNECT) #define BPF_CGROUP_RUN_PROG_INET4_CONNECT_LOCK(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_INET4_CONNECT, NULL) #define BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_INET6_CONNECT, NULL) #define BPF_CGROUP_RUN_PROG_UNIX_CONNECT_LOCK(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_UNIX_CONNECT, NULL) #define BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk, uaddr, uaddrlen, t_ctx) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_UDP4_SENDMSG, t_ctx) #define BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk, uaddr, uaddrlen, t_ctx) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_UDP6_SENDMSG, t_ctx) #define BPF_CGROUP_RUN_PROG_UNIX_SENDMSG_LOCK(sk, uaddr, uaddrlen, t_ctx) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_UNIX_SENDMSG, t_ctx) #define BPF_CGROUP_RUN_PROG_UDP4_RECVMSG_LOCK(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_UDP4_RECVMSG, NULL) #define BPF_CGROUP_RUN_PROG_UDP6_RECVMSG_LOCK(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_UDP6_RECVMSG, NULL) #define BPF_CGROUP_RUN_PROG_UNIX_RECVMSG_LOCK(sk, uaddr, uaddrlen) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, CGROUP_UNIX_RECVMSG, NULL) /* The SOCK_OPS"_SK" macro should be used when sock_ops->sk is not a * fullsock and its parent fullsock cannot be traced by * sk_to_full_sk(). * * e.g. sock_ops->sk is a request_sock and it is under syncookie mode. * Its listener-sk is not attached to the rsk_listener. * In this case, the caller holds the listener-sk (unlocked), * set its sock_ops->sk to req_sk, and call this SOCK_OPS"_SK" with * the listener-sk such that the cgroup-bpf-progs of the * listener-sk will be run. * * Regardless of syncookie mode or not, * calling bpf_setsockopt on listener-sk will not make sense anyway, * so passing 'sock_ops->sk == req_sk' to the bpf prog is appropriate here. */ #define BPF_CGROUP_RUN_PROG_SOCK_OPS_SK(sock_ops, sk) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_SOCK_OPS)) \ __ret = __cgroup_bpf_run_filter_sock_ops(sk, \ sock_ops, \ CGROUP_SOCK_OPS); \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) && (sock_ops)->sk) { \ typeof(sk) __sk = sk_to_full_sk((sock_ops)->sk); \ if (__sk && sk_fullsock(__sk)) \ __ret = __cgroup_bpf_run_filter_sock_ops(__sk, \ sock_ops, \ CGROUP_SOCK_OPS); \ } \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(atype, major, minor, access) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_DEVICE)) \ __ret = __cgroup_bpf_check_dev_permission(atype, major, minor, \ access, \ CGROUP_DEVICE); \ \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_SYSCTL(head, table, write, buf, count, pos) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_SYSCTL)) \ __ret = __cgroup_bpf_run_filter_sysctl(head, table, write, \ buf, count, pos, \ CGROUP_SYSCTL); \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock, level, optname, optval, optlen, \ kernel_optval) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_SETSOCKOPT) && \ cgroup_bpf_sock_enabled(sock, CGROUP_SETSOCKOPT)) \ __ret = __cgroup_bpf_run_filter_setsockopt(sock, level, \ optname, optval, \ optlen, \ kernel_optval); \ __ret; \ }) #define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen) \ ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT)) \ copy_from_sockptr(&__ret, optlen, sizeof(int)); \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, optlen, \ max_optlen, retval) \ ({ \ int __ret = retval; \ if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) && \ cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT)) \ if (!(sock)->sk_prot->bpf_bypass_getsockopt || \ !INDIRECT_CALL_INET_1((sock)->sk_prot->bpf_bypass_getsockopt, \ tcp_bpf_bypass_getsockopt, \ level, optname)) \ __ret = __cgroup_bpf_run_filter_getsockopt( \ sock, level, optname, optval, optlen, \ max_optlen, retval); \ __ret; \ }) #define BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sock, level, optname, optval, \ optlen, retval) \ ({ \ int __ret = retval; \ if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT)) \ __ret = __cgroup_bpf_run_filter_getsockopt_kern( \ sock, level, optname, optval, optlen, retval); \ __ret; \ }) int cgroup_bpf_prog_attach(const union bpf_attr *attr, enum bpf_prog_type ptype, struct bpf_prog *prog); int cgroup_bpf_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype); int cgroup_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog); int cgroup_bpf_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr); const struct bpf_func_proto * cgroup_common_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog); const struct bpf_func_proto * cgroup_current_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog); #else static inline int cgroup_bpf_inherit(struct cgroup *cgrp) { return 0; } static inline void cgroup_bpf_offline(struct cgroup *cgrp) {} static inline int cgroup_bpf_prog_attach(const union bpf_attr *attr, enum bpf_prog_type ptype, struct bpf_prog *prog) { return -EINVAL; } static inline int cgroup_bpf_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype) { return -EINVAL; } static inline int cgroup_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) { return -EINVAL; } static inline int cgroup_bpf_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr) { return -EINVAL; } static inline const struct bpf_func_proto * cgroup_common_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { return NULL; } static inline const struct bpf_func_proto * cgroup_current_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { return NULL; } static inline int bpf_cgroup_storage_assign(struct bpf_prog_aux *aux, struct bpf_map *map) { return 0; } static inline struct bpf_cgroup_storage *bpf_cgroup_storage_alloc( struct bpf_prog *prog, enum bpf_cgroup_storage_type stype) { return NULL; } static inline void bpf_cgroup_storage_free( struct bpf_cgroup_storage *storage) {} static inline int bpf_percpu_cgroup_storage_copy(struct bpf_map *map, void *key, void *value) { return 0; } static inline int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key, void *value, u64 flags) { return 0; } #define cgroup_bpf_enabled(atype) (0) #define BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, uaddrlen, atype, t_ctx) ({ 0; }) #define BPF_CGROUP_RUN_SA_PROG(sk, uaddr, uaddrlen, atype) ({ 0; }) #define BPF_CGROUP_PRE_CONNECT_ENABLED(sk) (0) #define BPF_CGROUP_RUN_PROG_INET_INGRESS(sk,skb) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET_EGRESS(sk,skb) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET_SOCK(sk) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET_SOCK_RELEASE(sk) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET_BIND_LOCK(sk, uaddr, uaddrlen, atype, flags) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET6_POST_BIND(sk) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET4_CONNECT(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET4_CONNECT_LOCK(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET6_CONNECT(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_UNIX_CONNECT_LOCK(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk, uaddr, uaddrlen, t_ctx) ({ 0; }) #define BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk, uaddr, uaddrlen, t_ctx) ({ 0; }) #define BPF_CGROUP_RUN_PROG_UNIX_SENDMSG_LOCK(sk, uaddr, uaddrlen, t_ctx) ({ 0; }) #define BPF_CGROUP_RUN_PROG_UDP4_RECVMSG_LOCK(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_UDP6_RECVMSG_LOCK(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_UNIX_RECVMSG_LOCK(sk, uaddr, uaddrlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) ({ 0; }) #define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(atype, major, minor, access) ({ 0; }) #define BPF_CGROUP_RUN_PROG_SYSCTL(head,table,write,buf,count,pos) ({ 0; }) #define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen) ({ 0; }) #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, \ optlen, max_optlen, retval) ({ retval; }) #define BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sock, level, optname, optval, \ optlen, retval) ({ retval; }) #define BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock, level, optname, optval, optlen, \ kernel_optval) ({ 0; }) #define for_each_cgroup_storage_type(stype) for (; false; ) #endif /* CONFIG_CGROUP_BPF */ #endif /* _BPF_CGROUP_H */ |
25 27 3 10 34 16 50 2 34 3 62 2 58 61 34 29 34 31 5 22 6 7 6 24 31 1 1 6 6 6 1 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 | // SPDX-License-Identifier: GPL-2.0-only /* * DCCP connection tracking protocol helper * * Copyright (c) 2005, 2006, 2008 Patrick McHardy <kaber@trash.net> */ #include <linux/kernel.h> #include <linux/init.h> #include <linux/sysctl.h> #include <linux/spinlock.h> #include <linux/skbuff.h> #include <linux/dccp.h> #include <linux/slab.h> #include <net/net_namespace.h> #include <net/netns/generic.h> #include <linux/netfilter/nfnetlink_conntrack.h> #include <net/netfilter/nf_conntrack.h> #include <net/netfilter/nf_conntrack_l4proto.h> #include <net/netfilter/nf_conntrack_ecache.h> #include <net/netfilter/nf_conntrack_timeout.h> #include <net/netfilter/nf_log.h> /* Timeouts are based on values from RFC4340: * * - REQUEST: * * 8.1.2. Client Request * * A client MAY give up on its DCCP-Requests after some time * (3 minutes, for example). * * - RESPOND: * * 8.1.3. Server Response * * It MAY also leave the RESPOND state for CLOSED after a timeout of * not less than 4MSL (8 minutes); * * - PARTOPEN: * * 8.1.5. Handshake Completion * * If the client remains in PARTOPEN for more than 4MSL (8 minutes), * it SHOULD reset the connection with Reset Code 2, "Aborted". * * - OPEN: * * The DCCP timestamp overflows after 11.9 hours. If the connection * stays idle this long the sequence number won't be recognized * as valid anymore. * * - CLOSEREQ/CLOSING: * * 8.3. Termination * * The retransmission timer should initially be set to go off in two * round-trip times and should back off to not less than once every * 64 seconds ... * * - TIMEWAIT: * * 4.3. States * * A server or client socket remains in this state for 2MSL (4 minutes) * after the connection has been town down, ... */ #define DCCP_MSL (2 * 60 * HZ) #ifdef CONFIG_NF_CONNTRACK_PROCFS static const char * const dccp_state_names[] = { [CT_DCCP_NONE] = "NONE", [CT_DCCP_REQUEST] = "REQUEST", [CT_DCCP_RESPOND] = "RESPOND", [CT_DCCP_PARTOPEN] = "PARTOPEN", [CT_DCCP_OPEN] = "OPEN", [CT_DCCP_CLOSEREQ] = "CLOSEREQ", [CT_DCCP_CLOSING] = "CLOSING", [CT_DCCP_TIMEWAIT] = "TIMEWAIT", [CT_DCCP_IGNORE] = "IGNORE", [CT_DCCP_INVALID] = "INVALID", }; #endif #define sNO CT_DCCP_NONE #define sRQ CT_DCCP_REQUEST #define sRS CT_DCCP_RESPOND #define sPO CT_DCCP_PARTOPEN #define sOP CT_DCCP_OPEN #define sCR CT_DCCP_CLOSEREQ #define sCG CT_DCCP_CLOSING #define sTW CT_DCCP_TIMEWAIT #define sIG CT_DCCP_IGNORE #define sIV CT_DCCP_INVALID /* * DCCP state transition table * * The assumption is the same as for TCP tracking: * * We are the man in the middle. All the packets go through us but might * get lost in transit to the destination. It is assumed that the destination * can't receive segments we haven't seen. * * The following states exist: * * NONE: Initial state, expecting Request * REQUEST: Request seen, waiting for Response from server * RESPOND: Response from server seen, waiting for Ack from client * PARTOPEN: Ack after Response seen, waiting for packet other than Response, * Reset or Sync from server * OPEN: Packet other than Response, Reset or Sync seen * CLOSEREQ: CloseReq from server seen, expecting Close from client * CLOSING: Close seen, expecting Reset * TIMEWAIT: Reset seen * IGNORE: Not determinable whether packet is valid * * Some states exist only on one side of the connection: REQUEST, RESPOND, * PARTOPEN, CLOSEREQ. For the other side these states are equivalent to * the one it was in before. * * Packets are marked as ignored (sIG) if we don't know if they're valid * (for example a reincarnation of a connection we didn't notice is dead * already) and the server may send back a connection closing Reset or a * Response. They're also used for Sync/SyncAck packets, which we don't * care about. */ static const u_int8_t dccp_state_table[CT_DCCP_ROLE_MAX + 1][DCCP_PKT_SYNCACK + 1][CT_DCCP_MAX + 1] = { [CT_DCCP_ROLE_CLIENT] = { [DCCP_PKT_REQUEST] = { /* * sNO -> sRQ Regular Request * sRQ -> sRQ Retransmitted Request or reincarnation * sRS -> sRS Retransmitted Request (apparently Response * got lost after we saw it) or reincarnation * sPO -> sIG Ignore, conntrack might be out of sync * sOP -> sIG Ignore, conntrack might be out of sync * sCR -> sIG Ignore, conntrack might be out of sync * sCG -> sIG Ignore, conntrack might be out of sync * sTW -> sRQ Reincarnation * * sNO, sRQ, sRS, sPO. sOP, sCR, sCG, sTW, */ sRQ, sRQ, sRS, sIG, sIG, sIG, sIG, sRQ, }, [DCCP_PKT_RESPONSE] = { /* * sNO -> sIV Invalid * sRQ -> sIG Ignore, might be response to ignored Request * sRS -> sIG Ignore, might be response to ignored Request * sPO -> sIG Ignore, might be response to ignored Request * sOP -> sIG Ignore, might be response to ignored Request * sCR -> sIG Ignore, might be response to ignored Request * sCG -> sIG Ignore, might be response to ignored Request * sTW -> sIV Invalid, reincarnation in reverse direction * goes through sRQ * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIG, sIG, sIG, sIG, sIG, sIG, sIV, }, [DCCP_PKT_ACK] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sPO Ack for Response, move to PARTOPEN (8.1.5.) * sPO -> sPO Retransmitted Ack for Response, remain in PARTOPEN * sOP -> sOP Regular ACK, remain in OPEN * sCR -> sCR Ack in CLOSEREQ MAY be processed (8.3.) * sCG -> sCG Ack in CLOSING MAY be processed (8.3.) * sTW -> sIV * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sPO, sPO, sOP, sCR, sCG, sIV }, [DCCP_PKT_DATA] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sIV No connection * sPO -> sIV MUST use DataAck in PARTOPEN state (8.1.5.) * sOP -> sOP Regular Data packet * sCR -> sCR Data in CLOSEREQ MAY be processed (8.3.) * sCG -> sCG Data in CLOSING MAY be processed (8.3.) * sTW -> sIV * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sIV, sOP, sCR, sCG, sIV, }, [DCCP_PKT_DATAACK] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sPO Ack for Response, move to PARTOPEN (8.1.5.) * sPO -> sPO Remain in PARTOPEN state * sOP -> sOP Regular DataAck packet in OPEN state * sCR -> sCR DataAck in CLOSEREQ MAY be processed (8.3.) * sCG -> sCG DataAck in CLOSING MAY be processed (8.3.) * sTW -> sIV * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sPO, sPO, sOP, sCR, sCG, sIV }, [DCCP_PKT_CLOSEREQ] = { /* * CLOSEREQ may only be sent by the server. * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV }, [DCCP_PKT_CLOSE] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sIV No connection * sPO -> sCG Client-initiated close * sOP -> sCG Client-initiated close * sCR -> sCG Close in response to CloseReq (8.3.) * sCG -> sCG Retransmit * sTW -> sIV Late retransmit, already in TIME_WAIT * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sCG, sCG, sCG, sIV, sIV }, [DCCP_PKT_RESET] = { /* * sNO -> sIV No connection * sRQ -> sTW Sync received or timeout, SHOULD send Reset (8.1.1.) * sRS -> sTW Response received without Request * sPO -> sTW Timeout, SHOULD send Reset (8.1.5.) * sOP -> sTW Connection reset * sCR -> sTW Connection reset * sCG -> sTW Connection reset * sTW -> sIG Ignore (don't refresh timer) * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sTW, sTW, sTW, sTW, sTW, sTW, sIG }, [DCCP_PKT_SYNC] = { /* * We currently ignore Sync packets * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIG, sIG, sIG, sIG, sIG, sIG, sIG, }, [DCCP_PKT_SYNCACK] = { /* * We currently ignore SyncAck packets * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIG, sIG, sIG, sIG, sIG, sIG, sIG, }, }, [CT_DCCP_ROLE_SERVER] = { [DCCP_PKT_REQUEST] = { /* * sNO -> sIV Invalid * sRQ -> sIG Ignore, conntrack might be out of sync * sRS -> sIG Ignore, conntrack might be out of sync * sPO -> sIG Ignore, conntrack might be out of sync * sOP -> sIG Ignore, conntrack might be out of sync * sCR -> sIG Ignore, conntrack might be out of sync * sCG -> sIG Ignore, conntrack might be out of sync * sTW -> sRQ Reincarnation, must reverse roles * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIG, sIG, sIG, sIG, sIG, sIG, sRQ }, [DCCP_PKT_RESPONSE] = { /* * sNO -> sIV Response without Request * sRQ -> sRS Response to clients Request * sRS -> sRS Retransmitted Response (8.1.3. SHOULD NOT) * sPO -> sIG Response to an ignored Request or late retransmit * sOP -> sIG Ignore, might be response to ignored Request * sCR -> sIG Ignore, might be response to ignored Request * sCG -> sIG Ignore, might be response to ignored Request * sTW -> sIV Invalid, Request from client in sTW moves to sRQ * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sRS, sRS, sIG, sIG, sIG, sIG, sIV }, [DCCP_PKT_ACK] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sIV No connection * sPO -> sOP Enter OPEN state (8.1.5.) * sOP -> sOP Regular Ack in OPEN state * sCR -> sIV Waiting for Close from client * sCG -> sCG Ack in CLOSING MAY be processed (8.3.) * sTW -> sIV * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sOP, sOP, sIV, sCG, sIV }, [DCCP_PKT_DATA] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sIV No connection * sPO -> sOP Enter OPEN state (8.1.5.) * sOP -> sOP Regular Data packet in OPEN state * sCR -> sIV Waiting for Close from client * sCG -> sCG Data in CLOSING MAY be processed (8.3.) * sTW -> sIV * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sOP, sOP, sIV, sCG, sIV }, [DCCP_PKT_DATAACK] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sIV No connection * sPO -> sOP Enter OPEN state (8.1.5.) * sOP -> sOP Regular DataAck in OPEN state * sCR -> sIV Waiting for Close from client * sCG -> sCG Data in CLOSING MAY be processed (8.3.) * sTW -> sIV * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sOP, sOP, sIV, sCG, sIV }, [DCCP_PKT_CLOSEREQ] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sIV No connection * sPO -> sOP -> sCR Move directly to CLOSEREQ (8.1.5.) * sOP -> sCR CloseReq in OPEN state * sCR -> sCR Retransmit * sCG -> sCR Simultaneous close, client sends another Close * sTW -> sIV Already closed * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sCR, sCR, sCR, sCR, sIV }, [DCCP_PKT_CLOSE] = { /* * sNO -> sIV No connection * sRQ -> sIV No connection * sRS -> sIV No connection * sPO -> sOP -> sCG Move direcly to CLOSING * sOP -> sCG Move to CLOSING * sCR -> sIV Close after CloseReq is invalid * sCG -> sCG Retransmit * sTW -> sIV Already closed * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIV, sIV, sCG, sCG, sIV, sCG, sIV }, [DCCP_PKT_RESET] = { /* * sNO -> sIV No connection * sRQ -> sTW Reset in response to Request * sRS -> sTW Timeout, SHOULD send Reset (8.1.3.) * sPO -> sTW Timeout, SHOULD send Reset (8.1.3.) * sOP -> sTW * sCR -> sTW * sCG -> sTW * sTW -> sIG Ignore (don't refresh timer) * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW, sTW */ sIV, sTW, sTW, sTW, sTW, sTW, sTW, sTW, sIG }, [DCCP_PKT_SYNC] = { /* * We currently ignore Sync packets * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIG, sIG, sIG, sIG, sIG, sIG, sIG, }, [DCCP_PKT_SYNCACK] = { /* * We currently ignore SyncAck packets * * sNO, sRQ, sRS, sPO, sOP, sCR, sCG, sTW */ sIV, sIG, sIG, sIG, sIG, sIG, sIG, sIG, }, }, }; static noinline bool dccp_new(struct nf_conn *ct, const struct sk_buff *skb, const struct dccp_hdr *dh, const struct nf_hook_state *hook_state) { struct net *net = nf_ct_net(ct); struct nf_dccp_net *dn; const char *msg; u_int8_t state; state = dccp_state_table[CT_DCCP_ROLE_CLIENT][dh->dccph_type][CT_DCCP_NONE]; switch (state) { default: dn = nf_dccp_pernet(net); if (dn->dccp_loose == 0) { msg = "not picking up existing connection "; goto out_invalid; } break; case CT_DCCP_REQUEST: break; case CT_DCCP_INVALID: msg = "invalid state transition "; goto out_invalid; } ct->proto.dccp.role[IP_CT_DIR_ORIGINAL] = CT_DCCP_ROLE_CLIENT; ct->proto.dccp.role[IP_CT_DIR_REPLY] = CT_DCCP_ROLE_SERVER; ct->proto.dccp.state = CT_DCCP_NONE; ct->proto.dccp.last_pkt = DCCP_PKT_REQUEST; ct->proto.dccp.last_dir = IP_CT_DIR_ORIGINAL; ct->proto.dccp.handshake_seq = 0; return true; out_invalid: nf_ct_l4proto_log_invalid(skb, ct, hook_state, "%s", msg); return false; } static u64 dccp_ack_seq(const struct dccp_hdr *dh) { const struct dccp_hdr_ack_bits *dhack; dhack = (void *)dh + __dccp_basic_hdr_len(dh); return ((u64)ntohs(dhack->dccph_ack_nr_high) << 32) + ntohl(dhack->dccph_ack_nr_low); } static bool dccp_error(const struct dccp_hdr *dh, struct sk_buff *skb, unsigned int dataoff, const struct nf_hook_state *state) { static const unsigned long require_seq48 = 1 << DCCP_PKT_REQUEST | 1 << DCCP_PKT_RESPONSE | 1 << DCCP_PKT_CLOSEREQ | 1 << DCCP_PKT_CLOSE | 1 << DCCP_PKT_RESET | 1 << DCCP_PKT_SYNC | 1 << DCCP_PKT_SYNCACK; unsigned int dccp_len = skb->len - dataoff; unsigned int cscov; const char *msg; u8 type; BUILD_BUG_ON(DCCP_PKT_INVALID >= BITS_PER_LONG); if (dh->dccph_doff * 4 < sizeof(struct dccp_hdr) || dh->dccph_doff * 4 > dccp_len) { msg = "nf_ct_dccp: truncated/malformed packet "; goto out_invalid; } cscov = dccp_len; if (dh->dccph_cscov) { cscov = (dh->dccph_cscov - 1) * 4; if (cscov > dccp_len) { msg = "nf_ct_dccp: bad checksum coverage "; goto out_invalid; } } if (state->hook == NF_INET_PRE_ROUTING && state->net->ct.sysctl_checksum && nf_checksum_partial(skb, state->hook, dataoff, cscov, IPPROTO_DCCP, state->pf)) { msg = "nf_ct_dccp: bad checksum "; goto out_invalid; } type = dh->dccph_type; if (type >= DCCP_PKT_INVALID) { msg = "nf_ct_dccp: reserved packet type "; goto out_invalid; } if (test_bit(type, &require_seq48) && !dh->dccph_x) { msg = "nf_ct_dccp: type lacks 48bit sequence numbers"; goto out_invalid; } return false; out_invalid: nf_l4proto_log_invalid(skb, state, IPPROTO_DCCP, "%s", msg); return true; } struct nf_conntrack_dccp_buf { struct dccp_hdr dh; /* generic header part */ struct dccp_hdr_ext ext; /* optional depending dh->dccph_x */ union { /* depends on header type */ struct dccp_hdr_ack_bits ack; struct dccp_hdr_request req; struct dccp_hdr_response response; struct dccp_hdr_reset rst; } u; }; static struct dccp_hdr * dccp_header_pointer(const struct sk_buff *skb, int offset, const struct dccp_hdr *dh, struct nf_conntrack_dccp_buf *buf) { unsigned int hdrlen = __dccp_hdr_len(dh); if (hdrlen > sizeof(*buf)) return NULL; return skb_header_pointer(skb, offset, hdrlen, buf); } int nf_conntrack_dccp_packet(struct nf_conn *ct, struct sk_buff *skb, unsigned int dataoff, enum ip_conntrack_info ctinfo, const struct nf_hook_state *state) { enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo); struct nf_conntrack_dccp_buf _dh; u_int8_t type, old_state, new_state; enum ct_dccp_roles role; unsigned int *timeouts; struct dccp_hdr *dh; dh = skb_header_pointer(skb, dataoff, sizeof(*dh), &_dh.dh); if (!dh) return NF_DROP; if (dccp_error(dh, skb, dataoff, state)) return -NF_ACCEPT; /* pull again, including possible 48 bit sequences and subtype header */ dh = dccp_header_pointer(skb, dataoff, dh, &_dh); if (!dh) return NF_DROP; type = dh->dccph_type; if (!nf_ct_is_confirmed(ct) && !dccp_new(ct, skb, dh, state)) return -NF_ACCEPT; if (type == DCCP_PKT_RESET && !test_bit(IPS_SEEN_REPLY_BIT, &ct->status)) { /* Tear down connection immediately if only reply is a RESET */ nf_ct_kill_acct(ct, ctinfo, skb); return NF_ACCEPT; } spin_lock_bh(&ct->lock); role = ct->proto.dccp.role[dir]; old_state = ct->proto.dccp.state; new_state = dccp_state_table[role][type][old_state]; switch (new_state) { case CT_DCCP_REQUEST: if (old_state == CT_DCCP_TIMEWAIT && role == CT_DCCP_ROLE_SERVER) { /* Reincarnation in the reverse direction: reopen and * reverse client/server roles. */ ct->proto.dccp.role[dir] = CT_DCCP_ROLE_CLIENT; ct->proto.dccp.role[!dir] = CT_DCCP_ROLE_SERVER; } break; case CT_DCCP_RESPOND: if (old_state == CT_DCCP_REQUEST) ct->proto.dccp.handshake_seq = dccp_hdr_seq(dh); break; case CT_DCCP_PARTOPEN: if (old_state == CT_DCCP_RESPOND && type == DCCP_PKT_ACK && dccp_ack_seq(dh) == ct->proto.dccp.handshake_seq) set_bit(IPS_ASSURED_BIT, &ct->status); break; case CT_DCCP_IGNORE: /* * Connection tracking might be out of sync, so we ignore * packets that might establish a new connection and resync * if the server responds with a valid Response. */ if (ct->proto.dccp.last_dir == !dir && ct->proto.dccp.last_pkt == DCCP_PKT_REQUEST && type == DCCP_PKT_RESPONSE) { ct->proto.dccp.role[!dir] = CT_DCCP_ROLE_CLIENT; ct->proto.dccp.role[dir] = CT_DCCP_ROLE_SERVER; ct->proto.dccp.handshake_seq = dccp_hdr_seq(dh); new_state = CT_DCCP_RESPOND; break; } ct->proto.dccp.last_dir = dir; ct->proto.dccp.last_pkt = type; spin_unlock_bh(&ct->lock); nf_ct_l4proto_log_invalid(skb, ct, state, "%s", "invalid packet"); return NF_ACCEPT; case CT_DCCP_INVALID: spin_unlock_bh(&ct->lock); nf_ct_l4proto_log_invalid(skb, ct, state, "%s", "invalid state transition"); return -NF_ACCEPT; } ct->proto.dccp.last_dir = dir; ct->proto.dccp.last_pkt = type; ct->proto.dccp.state = new_state; spin_unlock_bh(&ct->lock); if (new_state != old_state) nf_conntrack_event_cache(IPCT_PROTOINFO, ct); timeouts = nf_ct_timeout_lookup(ct); if (!timeouts) timeouts = nf_dccp_pernet(nf_ct_net(ct))->dccp_timeout; nf_ct_refresh_acct(ct, ctinfo, skb, timeouts[new_state]); return NF_ACCEPT; } static bool dccp_can_early_drop(const struct nf_conn *ct) { switch (ct->proto.dccp.state) { case CT_DCCP_CLOSEREQ: case CT_DCCP_CLOSING: case CT_DCCP_TIMEWAIT: return true; default: break; } return false; } #ifdef CONFIG_NF_CONNTRACK_PROCFS static void dccp_print_conntrack(struct seq_file *s, struct nf_conn *ct) { seq_printf(s, "%s ", dccp_state_names[ct->proto.dccp.state]); } #endif #if IS_ENABLED(CONFIG_NF_CT_NETLINK) static int dccp_to_nlattr(struct sk_buff *skb, struct nlattr *nla, struct nf_conn *ct, bool destroy) { struct nlattr *nest_parms; spin_lock_bh(&ct->lock); nest_parms = nla_nest_start(skb, CTA_PROTOINFO_DCCP); if (!nest_parms) goto nla_put_failure; if (nla_put_u8(skb, CTA_PROTOINFO_DCCP_STATE, ct->proto.dccp.state)) goto nla_put_failure; if (destroy) goto skip_state; if (nla_put_u8(skb, CTA_PROTOINFO_DCCP_ROLE, ct->proto.dccp.role[IP_CT_DIR_ORIGINAL]) || nla_put_be64(skb, CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ, cpu_to_be64(ct->proto.dccp.handshake_seq), CTA_PROTOINFO_DCCP_PAD)) goto nla_put_failure; skip_state: nla_nest_end(skb, nest_parms); spin_unlock_bh(&ct->lock); return 0; nla_put_failure: spin_unlock_bh(&ct->lock); return -1; } static const struct nla_policy dccp_nla_policy[CTA_PROTOINFO_DCCP_MAX + 1] = { [CTA_PROTOINFO_DCCP_STATE] = { .type = NLA_U8 }, [CTA_PROTOINFO_DCCP_ROLE] = { .type = NLA_U8 }, [CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ] = { .type = NLA_U64 }, [CTA_PROTOINFO_DCCP_PAD] = { .type = NLA_UNSPEC }, }; #define DCCP_NLATTR_SIZE ( \ NLA_ALIGN(NLA_HDRLEN + 1) + \ NLA_ALIGN(NLA_HDRLEN + 1) + \ NLA_ALIGN(NLA_HDRLEN + sizeof(u64)) + \ NLA_ALIGN(NLA_HDRLEN + 0)) static int nlattr_to_dccp(struct nlattr *cda[], struct nf_conn *ct) { struct nlattr *attr = cda[CTA_PROTOINFO_DCCP]; struct nlattr *tb[CTA_PROTOINFO_DCCP_MAX + 1]; int err; if (!attr) return 0; err = nla_parse_nested_deprecated(tb, CTA_PROTOINFO_DCCP_MAX, attr, dccp_nla_policy, NULL); if (err < 0) return err; if (!tb[CTA_PROTOINFO_DCCP_STATE] || !tb[CTA_PROTOINFO_DCCP_ROLE] || nla_get_u8(tb[CTA_PROTOINFO_DCCP_ROLE]) > CT_DCCP_ROLE_MAX || nla_get_u8(tb[CTA_PROTOINFO_DCCP_STATE]) >= CT_DCCP_IGNORE) { return -EINVAL; } spin_lock_bh(&ct->lock); ct->proto.dccp.state = nla_get_u8(tb[CTA_PROTOINFO_DCCP_STATE]); if (nla_get_u8(tb[CTA_PROTOINFO_DCCP_ROLE]) == CT_DCCP_ROLE_CLIENT) { ct->proto.dccp.role[IP_CT_DIR_ORIGINAL] = CT_DCCP_ROLE_CLIENT; ct->proto.dccp.role[IP_CT_DIR_REPLY] = CT_DCCP_ROLE_SERVER; } else { ct->proto.dccp.role[IP_CT_DIR_ORIGINAL] = CT_DCCP_ROLE_SERVER; ct->proto.dccp.role[IP_CT_DIR_REPLY] = CT_DCCP_ROLE_CLIENT; } if (tb[CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ]) { ct->proto.dccp.handshake_seq = be64_to_cpu(nla_get_be64(tb[CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ])); } spin_unlock_bh(&ct->lock); return 0; } #endif #ifdef CONFIG_NF_CONNTRACK_TIMEOUT #include <linux/netfilter/nfnetlink.h> #include <linux/netfilter/nfnetlink_cttimeout.h> static int dccp_timeout_nlattr_to_obj(struct nlattr *tb[], struct net *net, void *data) { struct nf_dccp_net *dn = nf_dccp_pernet(net); unsigned int *timeouts = data; int i; if (!timeouts) timeouts = dn->dccp_timeout; /* set default DCCP timeouts. */ for (i=0; i<CT_DCCP_MAX; i++) timeouts[i] = dn->dccp_timeout[i]; /* there's a 1:1 mapping between attributes and protocol states. */ for (i=CTA_TIMEOUT_DCCP_UNSPEC+1; i<CTA_TIMEOUT_DCCP_MAX+1; i++) { if (tb[i]) { timeouts[i] = ntohl(nla_get_be32(tb[i])) * HZ; } } timeouts[CTA_TIMEOUT_DCCP_UNSPEC] = timeouts[CTA_TIMEOUT_DCCP_REQUEST]; return 0; } static int dccp_timeout_obj_to_nlattr(struct sk_buff *skb, const void *data) { const unsigned int *timeouts = data; int i; for (i=CTA_TIMEOUT_DCCP_UNSPEC+1; i<CTA_TIMEOUT_DCCP_MAX+1; i++) { if (nla_put_be32(skb, i, htonl(timeouts[i] / HZ))) goto nla_put_failure; } return 0; nla_put_failure: return -ENOSPC; } static const struct nla_policy dccp_timeout_nla_policy[CTA_TIMEOUT_DCCP_MAX+1] = { [CTA_TIMEOUT_DCCP_REQUEST] = { .type = NLA_U32 }, [CTA_TIMEOUT_DCCP_RESPOND] = { .type = NLA_U32 }, [CTA_TIMEOUT_DCCP_PARTOPEN] = { .type = NLA_U32 }, [CTA_TIMEOUT_DCCP_OPEN] = { .type = NLA_U32 }, [CTA_TIMEOUT_DCCP_CLOSEREQ] = { .type = NLA_U32 }, [CTA_TIMEOUT_DCCP_CLOSING] = { .type = NLA_U32 }, [CTA_TIMEOUT_DCCP_TIMEWAIT] = { .type = NLA_U32 }, }; #endif /* CONFIG_NF_CONNTRACK_TIMEOUT */ void nf_conntrack_dccp_init_net(struct net *net) { struct nf_dccp_net *dn = nf_dccp_pernet(net); /* default values */ dn->dccp_loose = 1; dn->dccp_timeout[CT_DCCP_REQUEST] = 2 * DCCP_MSL; dn->dccp_timeout[CT_DCCP_RESPOND] = 4 * DCCP_MSL; dn->dccp_timeout[CT_DCCP_PARTOPEN] = 4 * DCCP_MSL; dn->dccp_timeout[CT_DCCP_OPEN] = 12 * 3600 * HZ; dn->dccp_timeout[CT_DCCP_CLOSEREQ] = 64 * HZ; dn->dccp_timeout[CT_DCCP_CLOSING] = 64 * HZ; dn->dccp_timeout[CT_DCCP_TIMEWAIT] = 2 * DCCP_MSL; /* timeouts[0] is unused, make it same as SYN_SENT so * ->timeouts[0] contains 'new' timeout, like udp or icmp. */ dn->dccp_timeout[CT_DCCP_NONE] = dn->dccp_timeout[CT_DCCP_REQUEST]; } const struct nf_conntrack_l4proto nf_conntrack_l4proto_dccp = { .l4proto = IPPROTO_DCCP, .can_early_drop = dccp_can_early_drop, #ifdef CONFIG_NF_CONNTRACK_PROCFS .print_conntrack = dccp_print_conntrack, #endif #if IS_ENABLED(CONFIG_NF_CT_NETLINK) .nlattr_size = DCCP_NLATTR_SIZE, .to_nlattr = dccp_to_nlattr, .from_nlattr = nlattr_to_dccp, .tuple_to_nlattr = nf_ct_port_tuple_to_nlattr, .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size, .nlattr_to_tuple = nf_ct_port_nlattr_to_tuple, .nla_policy = nf_ct_port_nla_policy, #endif #ifdef CONFIG_NF_CONNTRACK_TIMEOUT .ctnl_timeout = { .nlattr_to_obj = dccp_timeout_nlattr_to_obj, .obj_to_nlattr = dccp_timeout_obj_to_nlattr, .nlattr_max = CTA_TIMEOUT_DCCP_MAX, .obj_size = sizeof(unsigned int) * CT_DCCP_MAX, .nla_policy = dccp_timeout_nla_policy, }, #endif /* CONFIG_NF_CONNTRACK_TIMEOUT */ }; |
9 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 | /* * net/tipc/core.h: Include file for TIPC global declarations * * Copyright (c) 2005-2006, 2013-2018 Ericsson AB * Copyright (c) 2005-2007, 2010-2013, Wind River Systems * Copyright (c) 2020, Red Hat Inc * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the names of the copyright holders nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * Alternatively, this software may be distributed under the terms of the * GNU General Public License ("GPL") version 2 as published by the Free * Software Foundation. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ #ifndef _TIPC_CORE_H #define _TIPC_CORE_H #include <linux/tipc.h> #include <linux/tipc_config.h> #include <linux/tipc_netlink.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/errno.h> #include <linux/mm.h> #include <linux/timer.h> #include <linux/string.h> #include <linux/uaccess.h> #include <linux/interrupt.h> #include <linux/atomic.h> #include <linux/netdevice.h> #include <linux/in.h> #include <linux/list.h> #include <linux/slab.h> #include <linux/vmalloc.h> #include <linux/rtnetlink.h> #include <linux/etherdevice.h> #include <net/netns/generic.h> #include <linux/rhashtable.h> #include <net/genetlink.h> #include <net/netns/hash.h> #ifdef pr_fmt #undef pr_fmt #endif #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt struct tipc_node; struct tipc_bearer; struct tipc_bc_base; struct tipc_link; struct tipc_name_table; struct tipc_topsrv; struct tipc_monitor; #ifdef CONFIG_TIPC_CRYPTO struct tipc_crypto; #endif #define TIPC_MOD_VER "2.0.0" #define NODE_HTABLE_SIZE 512 #define MAX_BEARERS 3 #define TIPC_DEF_MON_THRESHOLD 32 #define NODE_ID_LEN 16 #define NODE_ID_STR_LEN (NODE_ID_LEN * 2 + 1) extern unsigned int tipc_net_id __read_mostly; extern int sysctl_tipc_rmem[3] __read_mostly; extern int sysctl_tipc_named_timeout __read_mostly; struct tipc_net { u8 node_id[NODE_ID_LEN]; u32 node_addr; u32 trial_addr; unsigned long addr_trial_end; char node_id_string[NODE_ID_STR_LEN]; int net_id; int random; bool legacy_addr_format; /* Node table and node list */ spinlock_t node_list_lock; struct hlist_head node_htable[NODE_HTABLE_SIZE]; struct list_head node_list; u32 num_nodes; u32 num_links; /* Neighbor monitoring list */ struct tipc_monitor *monitors[MAX_BEARERS]; int mon_threshold; /* Bearer list */ struct tipc_bearer __rcu *bearer_list[MAX_BEARERS + 1]; /* Broadcast link */ spinlock_t bclock; struct tipc_bc_base *bcbase; struct tipc_link *bcl; /* Socket hash table */ struct rhashtable sk_rht; /* Name table */ spinlock_t nametbl_lock; struct name_table *nametbl; /* Topology subscription server */ struct tipc_topsrv *topsrv; atomic_t subscription_count; /* Cluster capabilities */ u16 capabilities; /* Tracing of node internal messages */ struct packet_type loopback_pt; #ifdef CONFIG_TIPC_CRYPTO /* TX crypto handler */ struct tipc_crypto *crypto_tx; #endif /* Work item for net finalize */ struct work_struct work; /* The numbers of work queues in schedule */ atomic_t wq_count; }; static inline struct tipc_net *tipc_net(struct net *net) { return net_generic(net, tipc_net_id); } static inline int tipc_netid(struct net *net) { return tipc_net(net)->net_id; } static inline struct list_head *tipc_nodes(struct net *net) { return &tipc_net(net)->node_list; } static inline struct name_table *tipc_name_table(struct net *net) { return tipc_net(net)->nametbl; } static inline struct tipc_topsrv *tipc_topsrv(struct net *net) { return tipc_net(net)->topsrv; } static inline unsigned int tipc_hashfn(u32 addr) { return addr & (NODE_HTABLE_SIZE - 1); } static inline u16 mod(u16 x) { return x & 0xffffu; } static inline int less_eq(u16 left, u16 right) { return mod(right - left) < 32768u; } static inline int more(u16 left, u16 right) { return !less_eq(left, right); } static inline int less(u16 left, u16 right) { return less_eq(left, right) && (mod(right) != mod(left)); } static inline int tipc_in_range(u16 val, u16 min, u16 max) { return !less(val, min) && !more(val, max); } static inline u32 tipc_net_hash_mixes(struct net *net, int tn_rand) { return net_hash_mix(&init_net) ^ net_hash_mix(net) ^ tn_rand; } static inline u32 hash128to32(char *bytes) { __be32 *tmp = (__be32 *)bytes; u32 res; res = ntohl(tmp[0] ^ tmp[1] ^ tmp[2] ^ tmp[3]); if (likely(res)) return res; return ntohl(tmp[0] | tmp[1] | tmp[2] | tmp[3]); } #ifdef CONFIG_SYSCTL int tipc_register_sysctl(void); void tipc_unregister_sysctl(void); #else #define tipc_register_sysctl() 0 #define tipc_unregister_sysctl() #endif #endif |
3 3 3 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 | // SPDX-License-Identifier: GPL-2.0-only /* * SCSI RDMA (SRP) transport class * * Copyright (C) 2007 FUJITA Tomonori <tomof@acm.org> */ #include <linux/init.h> #include <linux/module.h> #include <linux/jiffies.h> #include <linux/err.h> #include <linux/slab.h> #include <linux/string.h> #include <scsi/scsi.h> #include <scsi/scsi_cmnd.h> #include <scsi/scsi_device.h> #include <scsi/scsi_host.h> #include <scsi/scsi_transport.h> #include <scsi/scsi_transport_srp.h> #include "scsi_priv.h" struct srp_host_attrs { atomic_t next_port_id; }; #define to_srp_host_attrs(host) ((struct srp_host_attrs *)(host)->shost_data) #define SRP_HOST_ATTRS 0 #define SRP_RPORT_ATTRS 8 struct srp_internal { struct scsi_transport_template t; struct srp_function_template *f; struct device_attribute *host_attrs[SRP_HOST_ATTRS + 1]; struct device_attribute *rport_attrs[SRP_RPORT_ATTRS + 1]; struct transport_container rport_attr_cont; }; static int scsi_is_srp_rport(const struct device *dev); #define to_srp_internal(tmpl) container_of(tmpl, struct srp_internal, t) #define dev_to_rport(d) container_of(d, struct srp_rport, dev) #define transport_class_to_srp_rport(dev) dev_to_rport((dev)->parent) static inline struct Scsi_Host *rport_to_shost(struct srp_rport *r) { return dev_to_shost(r->dev.parent); } static int find_child_rport(struct device *dev, void *data) { struct device **child = data; if (scsi_is_srp_rport(dev)) { WARN_ON_ONCE(*child); *child = dev; } return 0; } static inline struct srp_rport *shost_to_rport(struct Scsi_Host *shost) { struct device *child = NULL; WARN_ON_ONCE(device_for_each_child(&shost->shost_gendev, &child, find_child_rport) < 0); return child ? dev_to_rport(child) : NULL; } /** * srp_tmo_valid() - check timeout combination validity * @reconnect_delay: Reconnect delay in seconds. * @fast_io_fail_tmo: Fast I/O fail timeout in seconds. * @dev_loss_tmo: Device loss timeout in seconds. * * The combination of the timeout parameters must be such that SCSI commands * are finished in a reasonable time. Hence do not allow the fast I/O fail * timeout to exceed SCSI_DEVICE_BLOCK_MAX_TIMEOUT nor allow dev_loss_tmo to * exceed that limit if failing I/O fast has been disabled. Furthermore, these * parameters must be such that multipath can detect failed paths timely. * Hence do not allow all three parameters to be disabled simultaneously. */ int srp_tmo_valid(int reconnect_delay, int fast_io_fail_tmo, long dev_loss_tmo) { if (reconnect_delay < 0 && fast_io_fail_tmo < 0 && dev_loss_tmo < 0) return -EINVAL; if (reconnect_delay == 0) return -EINVAL; if (fast_io_fail_tmo > SCSI_DEVICE_BLOCK_MAX_TIMEOUT) return -EINVAL; if (fast_io_fail_tmo < 0 && dev_loss_tmo > SCSI_DEVICE_BLOCK_MAX_TIMEOUT) return -EINVAL; if (dev_loss_tmo >= LONG_MAX / HZ) return -EINVAL; if (fast_io_fail_tmo >= 0 && dev_loss_tmo >= 0 && fast_io_fail_tmo >= dev_loss_tmo) return -EINVAL; return 0; } EXPORT_SYMBOL_GPL(srp_tmo_valid); static int srp_host_setup(struct transport_container *tc, struct device *dev, struct device *cdev) { struct Scsi_Host *shost = dev_to_shost(dev); struct srp_host_attrs *srp_host = to_srp_host_attrs(shost); atomic_set(&srp_host->next_port_id, 0); return 0; } static DECLARE_TRANSPORT_CLASS(srp_host_class, "srp_host", srp_host_setup, NULL, NULL); static DECLARE_TRANSPORT_CLASS(srp_rport_class, "srp_remote_ports", NULL, NULL, NULL); static ssize_t show_srp_rport_id(struct device *dev, struct device_attribute *attr, char *buf) { struct srp_rport *rport = transport_class_to_srp_rport(dev); return sprintf(buf, "%16phC\n", rport->port_id); } static DEVICE_ATTR(port_id, S_IRUGO, show_srp_rport_id, NULL); static const struct { u32 value; char *name; } srp_rport_role_names[] = { {SRP_RPORT_ROLE_INITIATOR, "SRP Initiator"}, {SRP_RPORT_ROLE_TARGET, "SRP Target"}, }; static ssize_t show_srp_rport_roles(struct device *dev, struct device_attribute *attr, char *buf) { struct srp_rport *rport = transport_class_to_srp_rport(dev); int i; char *name = NULL; for (i = 0; i < ARRAY_SIZE(srp_rport_role_names); i++) if (srp_rport_role_names[i].value == rport->roles) { name = srp_rport_role_names[i].name; break; } return sprintf(buf, "%s\n", name ? : "unknown"); } static DEVICE_ATTR(roles, S_IRUGO, show_srp_rport_roles, NULL); static ssize_t store_srp_rport_delete(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct srp_rport *rport = transport_class_to_srp_rport(dev); struct Scsi_Host *shost = dev_to_shost(dev); struct srp_internal *i = to_srp_internal(shost->transportt); if (i->f->rport_delete) { i->f->rport_delete(rport); return count; } else { return -ENOSYS; } } static DEVICE_ATTR(delete, S_IWUSR, NULL, store_srp_rport_delete); static ssize_t show_srp_rport_state(struct device *dev, struct device_attribute *attr, char *buf) { static const char *const state_name[] = { [SRP_RPORT_RUNNING] = "running", [SRP_RPORT_BLOCKED] = "blocked", [SRP_RPORT_FAIL_FAST] = "fail-fast", [SRP_RPORT_LOST] = "lost", }; struct srp_rport *rport = transport_class_to_srp_rport(dev); enum srp_rport_state state = rport->state; return sprintf(buf, "%s\n", (unsigned)state < ARRAY_SIZE(state_name) ? state_name[state] : "???"); } static DEVICE_ATTR(state, S_IRUGO, show_srp_rport_state, NULL); static ssize_t srp_show_tmo(char *buf, int tmo) { return tmo >= 0 ? sprintf(buf, "%d\n", tmo) : sprintf(buf, "off\n"); } int srp_parse_tmo(int *tmo, const char *buf) { int res = 0; if (strncmp(buf, "off", 3) != 0) res = kstrtoint(buf, 0, tmo); else *tmo = -1; return res; } EXPORT_SYMBOL(srp_parse_tmo); static ssize_t show_reconnect_delay(struct device *dev, struct device_attribute *attr, char *buf) { struct srp_rport *rport = transport_class_to_srp_rport(dev); return srp_show_tmo(buf, rport->reconnect_delay); } static ssize_t store_reconnect_delay(struct device *dev, struct device_attribute *attr, const char *buf, const size_t count) { struct srp_rport *rport = transport_class_to_srp_rport(dev); int res, delay; res = srp_parse_tmo(&delay, buf); if (res) goto out; res = srp_tmo_valid(delay, rport->fast_io_fail_tmo, rport->dev_loss_tmo); if (res) goto out; if (rport->reconnect_delay <= 0 && delay > 0 && rport->state != SRP_RPORT_RUNNING) { queue_delayed_work(system_long_wq, &rport->reconnect_work, delay * HZ); } else if (delay <= 0) { cancel_delayed_work(&rport->reconnect_work); } rport->reconnect_delay = delay; res = count; out: return res; } static DEVICE_ATTR(reconnect_delay, S_IRUGO | S_IWUSR, show_reconnect_delay, store_reconnect_delay); static ssize_t show_failed_reconnects(struct device *dev, struct device_attribute *attr, char *buf) { struct srp_rport *rport = transport_class_to_srp_rport(dev); return sprintf(buf, "%d\n", rport->failed_reconnects); } static DEVICE_ATTR(failed_reconnects, S_IRUGO, show_failed_reconnects, NULL); static ssize_t show_srp_rport_fast_io_fail_tmo(struct device *dev, struct device_attribute *attr, char *buf) { struct srp_rport *rport = transport_class_to_srp_rport(dev); return srp_show_tmo(buf, rport->fast_io_fail_tmo); } static ssize_t store_srp_rport_fast_io_fail_tmo(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct srp_rport *rport = transport_class_to_srp_rport(dev); int res; int fast_io_fail_tmo; res = srp_parse_tmo(&fast_io_fail_tmo, buf); if (res) goto out; res = srp_tmo_valid(rport->reconnect_delay, fast_io_fail_tmo, rport->dev_loss_tmo); if (res) goto out; rport->fast_io_fail_tmo = fast_io_fail_tmo; res = count; out: return res; } static DEVICE_ATTR(fast_io_fail_tmo, S_IRUGO | S_IWUSR, show_srp_rport_fast_io_fail_tmo, store_srp_rport_fast_io_fail_tmo); static ssize_t show_srp_rport_dev_loss_tmo(struct device *dev, struct device_attribute *attr, char *buf) { struct srp_rport *rport = transport_class_to_srp_rport(dev); return srp_show_tmo(buf, rport->dev_loss_tmo); } static ssize_t store_srp_rport_dev_loss_tmo(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct srp_rport *rport = transport_class_to_srp_rport(dev); int res; int dev_loss_tmo; res = srp_parse_tmo(&dev_loss_tmo, buf); if (res) goto out; res = srp_tmo_valid(rport->reconnect_delay, rport->fast_io_fail_tmo, dev_loss_tmo); if (res) goto out; rport->dev_loss_tmo = dev_loss_tmo; res = count; out: return res; } static DEVICE_ATTR(dev_loss_tmo, S_IRUGO | S_IWUSR, show_srp_rport_dev_loss_tmo, store_srp_rport_dev_loss_tmo); static int srp_rport_set_state(struct srp_rport *rport, enum srp_rport_state new_state) { enum srp_rport_state old_state = rport->state; lockdep_assert_held(&rport->mutex); switch (new_state) { case SRP_RPORT_RUNNING: switch (old_state) { case SRP_RPORT_LOST: goto invalid; default: break; } break; case SRP_RPORT_BLOCKED: switch (old_state) { case SRP_RPORT_RUNNING: break; default: goto invalid; } break; case SRP_RPORT_FAIL_FAST: switch (old_state) { case SRP_RPORT_LOST: goto invalid; default: break; } break; case SRP_RPORT_LOST: break; } rport->state = new_state; return 0; invalid: return -EINVAL; } /** * srp_reconnect_work() - reconnect and schedule a new attempt if necessary * @work: Work structure used for scheduling this operation. */ static void srp_reconnect_work(struct work_struct *work) { struct srp_rport *rport = container_of(to_delayed_work(work), struct srp_rport, reconnect_work); struct Scsi_Host *shost = rport_to_shost(rport); int delay, res; res = srp_reconnect_rport(rport); if (res != 0) { shost_printk(KERN_ERR, shost, "reconnect attempt %d failed (%d)\n", ++rport->failed_reconnects, res); delay = rport->reconnect_delay * min(100, max(1, rport->failed_reconnects - 10)); if (delay > 0) queue_delayed_work(system_long_wq, &rport->reconnect_work, delay * HZ); } } /* * scsi_block_targets() must have been called before this function is * called to guarantee that no .queuecommand() calls are in progress. */ static void __rport_fail_io_fast(struct srp_rport *rport) { struct Scsi_Host *shost = rport_to_shost(rport); struct srp_internal *i; lockdep_assert_held(&rport->mutex); if (srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST)) return; scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE); /* Involve the LLD if possible to terminate all I/O on the rport. */ i = to_srp_internal(shost->transportt); if (i->f->terminate_rport_io) i->f->terminate_rport_io(rport); } /** * rport_fast_io_fail_timedout() - fast I/O failure timeout handler * @work: Work structure used for scheduling this operation. */ static void rport_fast_io_fail_timedout(struct work_struct *work) { struct srp_rport *rport = container_of(to_delayed_work(work), struct srp_rport, fast_io_fail_work); struct Scsi_Host *shost = rport_to_shost(rport); pr_info("fast_io_fail_tmo expired for SRP %s / %s.\n", dev_name(&rport->dev), dev_name(&shost->shost_gendev)); mutex_lock(&rport->mutex); if (rport->state == SRP_RPORT_BLOCKED) __rport_fail_io_fast(rport); mutex_unlock(&rport->mutex); } /** * rport_dev_loss_timedout() - device loss timeout handler * @work: Work structure used for scheduling this operation. */ static void rport_dev_loss_timedout(struct work_struct *work) { struct srp_rport *rport = container_of(to_delayed_work(work), struct srp_rport, dev_loss_work); struct Scsi_Host *shost = rport_to_shost(rport); struct srp_internal *i = to_srp_internal(shost->transportt); pr_info("dev_loss_tmo expired for SRP %s / %s.\n", dev_name(&rport->dev), dev_name(&shost->shost_gendev)); mutex_lock(&rport->mutex); WARN_ON(srp_rport_set_state(rport, SRP_RPORT_LOST) != 0); scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE); mutex_unlock(&rport->mutex); i->f->rport_delete(rport); } static void __srp_start_tl_fail_timers(struct srp_rport *rport) { struct Scsi_Host *shost = rport_to_shost(rport); int delay, fast_io_fail_tmo, dev_loss_tmo; lockdep_assert_held(&rport->mutex); delay = rport->reconnect_delay; fast_io_fail_tmo = rport->fast_io_fail_tmo; dev_loss_tmo = rport->dev_loss_tmo; pr_debug("%s current state: %d\n", dev_name(&shost->shost_gendev), rport->state); if (rport->state == SRP_RPORT_LOST) return; if (delay > 0) queue_delayed_work(system_long_wq, &rport->reconnect_work, 1UL * delay * HZ); if ((fast_io_fail_tmo >= 0 || dev_loss_tmo >= 0) && srp_rport_set_state(rport, SRP_RPORT_BLOCKED) == 0) { pr_debug("%s new state: %d\n", dev_name(&shost->shost_gendev), rport->state); scsi_block_targets(shost, &shost->shost_gendev); if (fast_io_fail_tmo >= 0) queue_delayed_work(system_long_wq, &rport->fast_io_fail_work, 1UL * fast_io_fail_tmo * HZ); if (dev_loss_tmo >= 0) queue_delayed_work(system_long_wq, &rport->dev_loss_work, 1UL * dev_loss_tmo * HZ); } } /** * srp_start_tl_fail_timers() - start the transport layer failure timers * @rport: SRP target port. * * Start the transport layer fast I/O failure and device loss timers. Do not * modify a timer that was already started. */ void srp_start_tl_fail_timers(struct srp_rport *rport) { mutex_lock(&rport->mutex); __srp_start_tl_fail_timers(rport); mutex_unlock(&rport->mutex); } EXPORT_SYMBOL(srp_start_tl_fail_timers); /** * srp_reconnect_rport() - reconnect to an SRP target port * @rport: SRP target port. * * Blocks SCSI command queueing before invoking reconnect() such that * queuecommand() won't be invoked concurrently with reconnect() from outside * the SCSI EH. This is important since a reconnect() implementation may * reallocate resources needed by queuecommand(). * * Notes: * - This function neither waits until outstanding requests have finished nor * tries to abort these. It is the responsibility of the reconnect() * function to finish outstanding commands before reconnecting to the target * port. * - It is the responsibility of the caller to ensure that the resources * reallocated by the reconnect() function won't be used while this function * is in progress. One possible strategy is to invoke this function from * the context of the SCSI EH thread only. Another possible strategy is to * lock the rport mutex inside each SCSI LLD callback that can be invoked by * the SCSI EH (the scsi_host_template.eh_*() functions and also the * scsi_host_template.queuecommand() function). */ int srp_reconnect_rport(struct srp_rport *rport) { struct Scsi_Host *shost = rport_to_shost(rport); struct srp_internal *i = to_srp_internal(shost->transportt); struct scsi_device *sdev; int res; pr_debug("SCSI host %s\n", dev_name(&shost->shost_gendev)); res = mutex_lock_interruptible(&rport->mutex); if (res) goto out; if (rport->state != SRP_RPORT_FAIL_FAST && rport->state != SRP_RPORT_LOST) /* * sdev state must be SDEV_TRANSPORT_OFFLINE, transition * to SDEV_BLOCK is illegal. Calling scsi_target_unblock() * later is ok though, scsi_internal_device_unblock_nowait() * treats SDEV_TRANSPORT_OFFLINE like SDEV_BLOCK. */ scsi_block_targets(shost, &shost->shost_gendev); res = rport->state != SRP_RPORT_LOST ? i->f->reconnect(rport) : -ENODEV; pr_debug("%s (state %d): transport.reconnect() returned %d\n", dev_name(&shost->shost_gendev), rport->state, res); if (res == 0) { cancel_delayed_work(&rport->fast_io_fail_work); cancel_delayed_work(&rport->dev_loss_work); rport->failed_reconnects = 0; srp_rport_set_state(rport, SRP_RPORT_RUNNING); scsi_target_unblock(&shost->shost_gendev, SDEV_RUNNING); /* * If the SCSI error handler has offlined one or more devices, * invoking scsi_target_unblock() won't change the state of * these devices into running so do that explicitly. */ shost_for_each_device(sdev, shost) { mutex_lock(&sdev->state_mutex); if (sdev->sdev_state == SDEV_OFFLINE) sdev->sdev_state = SDEV_RUNNING; mutex_unlock(&sdev->state_mutex); } } else if (rport->state == SRP_RPORT_RUNNING) { /* * srp_reconnect_rport() has been invoked with fast_io_fail * and dev_loss off. Mark the port as failed and start the TL * failure timers if these had not yet been started. */ __rport_fail_io_fast(rport); __srp_start_tl_fail_timers(rport); } else if (rport->state != SRP_RPORT_BLOCKED) { scsi_target_unblock(&shost->shost_gendev, SDEV_TRANSPORT_OFFLINE); } mutex_unlock(&rport->mutex); out: return res; } EXPORT_SYMBOL(srp_reconnect_rport); /** * srp_timed_out() - SRP transport intercept of the SCSI timeout EH * @scmd: SCSI command. * * If a timeout occurs while an rport is in the blocked state, ask the SCSI * EH to continue waiting (SCSI_EH_RESET_TIMER). Otherwise let the SCSI core * handle the timeout (SCSI_EH_NOT_HANDLED). * * Note: This function is called from soft-IRQ context and with the request * queue lock held. */ enum scsi_timeout_action srp_timed_out(struct scsi_cmnd *scmd) { struct scsi_device *sdev = scmd->device; struct Scsi_Host *shost = sdev->host; struct srp_internal *i = to_srp_internal(shost->transportt); struct srp_rport *rport = shost_to_rport(shost); pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev)); return rport && rport->fast_io_fail_tmo < 0 && rport->dev_loss_tmo < 0 && i->f->reset_timer_if_blocked && scsi_device_blocked(sdev) ? SCSI_EH_RESET_TIMER : SCSI_EH_NOT_HANDLED; } EXPORT_SYMBOL(srp_timed_out); static void srp_rport_release(struct device *dev) { struct srp_rport *rport = dev_to_rport(dev); put_device(dev->parent); kfree(rport); } static int scsi_is_srp_rport(const struct device *dev) { return dev->release == srp_rport_release; } static int srp_rport_match(struct attribute_container *cont, struct device *dev) { struct Scsi_Host *shost; struct srp_internal *i; if (!scsi_is_srp_rport(dev)) return 0; shost = dev_to_shost(dev->parent); if (!shost->transportt) return 0; if (shost->transportt->host_attrs.ac.class != &srp_host_class.class) return 0; i = to_srp_internal(shost->transportt); return &i->rport_attr_cont.ac == cont; } static int srp_host_match(struct attribute_container *cont, struct device *dev) { struct Scsi_Host *shost; struct srp_internal *i; if (!scsi_is_host_device(dev)) return 0; shost = dev_to_shost(dev); if (!shost->transportt) return 0; if (shost->transportt->host_attrs.ac.class != &srp_host_class.class) return 0; i = to_srp_internal(shost->transportt); return &i->t.host_attrs.ac == cont; } /** * srp_rport_get() - increment rport reference count * @rport: SRP target port. */ void srp_rport_get(struct srp_rport *rport) { get_device(&rport->dev); } EXPORT_SYMBOL(srp_rport_get); /** * srp_rport_put() - decrement rport reference count * @rport: SRP target port. */ void srp_rport_put(struct srp_rport *rport) { put_device(&rport->dev); } EXPORT_SYMBOL(srp_rport_put); /** * srp_rport_add - add a SRP remote port to the device hierarchy * @shost: scsi host the remote port is connected to. * @ids: The port id for the remote port. * * Publishes a port to the rest of the system. */ struct srp_rport *srp_rport_add(struct Scsi_Host *shost, struct srp_rport_identifiers *ids) { struct srp_rport *rport; struct device *parent = &shost->shost_gendev; struct srp_internal *i = to_srp_internal(shost->transportt); int id, ret; rport = kzalloc(sizeof(*rport), GFP_KERNEL); if (!rport) return ERR_PTR(-ENOMEM); mutex_init(&rport->mutex); device_initialize(&rport->dev); rport->dev.parent = get_device(parent); rport->dev.release = srp_rport_release; memcpy(rport->port_id, ids->port_id, sizeof(rport->port_id)); rport->roles = ids->roles; if (i->f->reconnect) rport->reconnect_delay = i->f->reconnect_delay ? *i->f->reconnect_delay : 10; INIT_DELAYED_WORK(&rport->reconnect_work, srp_reconnect_work); rport->fast_io_fail_tmo = i->f->fast_io_fail_tmo ? *i->f->fast_io_fail_tmo : 15; rport->dev_loss_tmo = i->f->dev_loss_tmo ? *i->f->dev_loss_tmo : 60; INIT_DELAYED_WORK(&rport->fast_io_fail_work, rport_fast_io_fail_timedout); INIT_DELAYED_WORK(&rport->dev_loss_work, rport_dev_loss_timedout); id = atomic_inc_return(&to_srp_host_attrs(shost)->next_port_id); dev_set_name(&rport->dev, "port-%d:%d", shost->host_no, id); transport_setup_device(&rport->dev); ret = device_add(&rport->dev); if (ret) { transport_destroy_device(&rport->dev); put_device(&rport->dev); return ERR_PTR(ret); } transport_add_device(&rport->dev); transport_configure_device(&rport->dev); return rport; } EXPORT_SYMBOL_GPL(srp_rport_add); /** * srp_rport_del - remove a SRP remote port * @rport: SRP remote port to remove * * Removes the specified SRP remote port. */ void srp_rport_del(struct srp_rport *rport) { struct device *dev = &rport->dev; transport_remove_device(dev); device_del(dev); transport_destroy_device(dev); put_device(dev); } EXPORT_SYMBOL_GPL(srp_rport_del); static int do_srp_rport_del(struct device *dev, void *data) { if (scsi_is_srp_rport(dev)) srp_rport_del(dev_to_rport(dev)); return 0; } /** * srp_remove_host - tear down a Scsi_Host's SRP data structures * @shost: Scsi Host that is torn down * * Removes all SRP remote ports for a given Scsi_Host. * Must be called just before scsi_remove_host for SRP HBAs. */ void srp_remove_host(struct Scsi_Host *shost) { device_for_each_child(&shost->shost_gendev, NULL, do_srp_rport_del); } EXPORT_SYMBOL_GPL(srp_remove_host); /** * srp_stop_rport_timers - stop the transport layer recovery timers * @rport: SRP remote port for which to stop the timers. * * Must be called after srp_remove_host() and scsi_remove_host(). The caller * must hold a reference on the rport (rport->dev) and on the SCSI host * (rport->dev.parent). */ void srp_stop_rport_timers(struct srp_rport *rport) { mutex_lock(&rport->mutex); if (rport->state == SRP_RPORT_BLOCKED) __rport_fail_io_fast(rport); srp_rport_set_state(rport, SRP_RPORT_LOST); mutex_unlock(&rport->mutex); cancel_delayed_work_sync(&rport->reconnect_work); cancel_delayed_work_sync(&rport->fast_io_fail_work); cancel_delayed_work_sync(&rport->dev_loss_work); } EXPORT_SYMBOL_GPL(srp_stop_rport_timers); /** * srp_attach_transport - instantiate SRP transport template * @ft: SRP transport class function template */ struct scsi_transport_template * srp_attach_transport(struct srp_function_template *ft) { int count; struct srp_internal *i; i = kzalloc(sizeof(*i), GFP_KERNEL); if (!i) return NULL; i->t.host_size = sizeof(struct srp_host_attrs); i->t.host_attrs.ac.attrs = &i->host_attrs[0]; i->t.host_attrs.ac.class = &srp_host_class.class; i->t.host_attrs.ac.match = srp_host_match; i->host_attrs[0] = NULL; transport_container_register(&i->t.host_attrs); i->rport_attr_cont.ac.attrs = &i->rport_attrs[0]; i->rport_attr_cont.ac.class = &srp_rport_class.class; i->rport_attr_cont.ac.match = srp_rport_match; count = 0; i->rport_attrs[count++] = &dev_attr_port_id; i->rport_attrs[count++] = &dev_attr_roles; if (ft->has_rport_state) { i->rport_attrs[count++] = &dev_attr_state; i->rport_attrs[count++] = &dev_attr_fast_io_fail_tmo; i->rport_attrs[count++] = &dev_attr_dev_loss_tmo; } if (ft->reconnect) { i->rport_attrs[count++] = &dev_attr_reconnect_delay; i->rport_attrs[count++] = &dev_attr_failed_reconnects; } if (ft->rport_delete) i->rport_attrs[count++] = &dev_attr_delete; i->rport_attrs[count++] = NULL; BUG_ON(count > ARRAY_SIZE(i->rport_attrs)); transport_container_register(&i->rport_attr_cont); i->f = ft; return &i->t; } EXPORT_SYMBOL_GPL(srp_attach_transport); /** * srp_release_transport - release SRP transport template instance * @t: transport template instance */ void srp_release_transport(struct scsi_transport_template *t) { struct srp_internal *i = to_srp_internal(t); transport_container_unregister(&i->t.host_attrs); transport_container_unregister(&i->rport_attr_cont); kfree(i); } EXPORT_SYMBOL_GPL(srp_release_transport); static __init int srp_transport_init(void) { int ret; ret = transport_class_register(&srp_host_class); if (ret) return ret; ret = transport_class_register(&srp_rport_class); if (ret) goto unregister_host_class; return 0; unregister_host_class: transport_class_unregister(&srp_host_class); return ret; } static void __exit srp_transport_exit(void) { transport_class_unregister(&srp_host_class); transport_class_unregister(&srp_rport_class); } MODULE_AUTHOR("FUJITA Tomonori"); MODULE_DESCRIPTION("SRP Transport Attributes"); MODULE_LICENSE("GPL"); module_init(srp_transport_init); module_exit(srp_transport_exit); |
2 2 2 2 2 2 2 2 2 2 2 2 22 22 22 21 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 | // SPDX-License-Identifier: GPL-2.0-or-later /* */ #include <linux/gfp.h> #include <linux/init.h> #include <linux/ratelimit.h> #include <linux/usb.h> #include <linux/usb/audio.h> #include <linux/slab.h> #include <sound/core.h> #include <sound/pcm.h> #include <sound/pcm_params.h> #include "usbaudio.h" #include "helper.h" #include "card.h" #include "endpoint.h" #include "pcm.h" #include "clock.h" #include "quirks.h" enum { EP_STATE_STOPPED, EP_STATE_RUNNING, EP_STATE_STOPPING, }; /* interface refcounting */ struct snd_usb_iface_ref { unsigned char iface; bool need_setup; int opened; int altset; struct list_head list; }; /* clock refcounting */ struct snd_usb_clock_ref { unsigned char clock; atomic_t locked; int opened; int rate; bool need_setup; struct list_head list; }; /* * snd_usb_endpoint is a model that abstracts everything related to an * USB endpoint and its streaming. * * There are functions to activate and deactivate the streaming URBs and * optional callbacks to let the pcm logic handle the actual content of the * packets for playback and record. Thus, the bus streaming and the audio * handlers are fully decoupled. * * There are two different types of endpoints in audio applications. * * SND_USB_ENDPOINT_TYPE_DATA handles full audio data payload for both * inbound and outbound traffic. * * SND_USB_ENDPOINT_TYPE_SYNC endpoints are for inbound traffic only and * expect the payload to carry Q10.14 / Q16.16 formatted sync information * (3 or 4 bytes). * * Each endpoint has to be configured prior to being used by calling * snd_usb_endpoint_set_params(). * * The model incorporates a reference counting, so that multiple users * can call snd_usb_endpoint_start() and snd_usb_endpoint_stop(), and * only the first user will effectively start the URBs, and only the last * one to stop it will tear the URBs down again. */ /* * convert a sampling rate into our full speed format (fs/1000 in Q16.16) * this will overflow at approx 524 kHz */ static inline unsigned get_usb_full_speed_rate(unsigned int rate) { return ((rate << 13) + 62) / 125; } /* * convert a sampling rate into USB high speed format (fs/8000 in Q16.16) * this will overflow at approx 4 MHz */ static inline unsigned get_usb_high_speed_rate(unsigned int rate) { return ((rate << 10) + 62) / 125; } /* * release a urb data */ static void release_urb_ctx(struct snd_urb_ctx *u) { if (u->urb && u->buffer_size) usb_free_coherent(u->ep->chip->dev, u->buffer_size, u->urb->transfer_buffer, u->urb->transfer_dma); usb_free_urb(u->urb); u->urb = NULL; u->buffer_size = 0; } static const char *usb_error_string(int err) { switch (err) { case -ENODEV: return "no device"; case -ENOENT: return "endpoint not enabled"; case -EPIPE: return "endpoint stalled"; case -ENOSPC: return "not enough bandwidth"; case -ESHUTDOWN: return "device disabled"; case -EHOSTUNREACH: return "device suspended"; case -EINVAL: case -EAGAIN: case -EFBIG: case -EMSGSIZE: return "internal error"; default: return "unknown error"; } } static inline bool ep_state_running(struct snd_usb_endpoint *ep) { return atomic_read(&ep->state) == EP_STATE_RUNNING; } static inline bool ep_state_update(struct snd_usb_endpoint *ep, int old, int new) { return atomic_try_cmpxchg(&ep->state, &old, new); } /** * snd_usb_endpoint_implicit_feedback_sink: Report endpoint usage type * * @ep: The snd_usb_endpoint * * Determine whether an endpoint is driven by an implicit feedback * data endpoint source. */ int snd_usb_endpoint_implicit_feedback_sink(struct snd_usb_endpoint *ep) { return ep->implicit_fb_sync && usb_pipeout(ep->pipe); } /* * Return the number of samples to be sent in the next packet * for streaming based on information derived from sync endpoints * * This won't be used for implicit feedback which takes the packet size * returned from the sync source */ static int slave_next_packet_size(struct snd_usb_endpoint *ep, unsigned int avail) { unsigned long flags; unsigned int phase; int ret; if (ep->fill_max) return ep->maxframesize; spin_lock_irqsave(&ep->lock, flags); phase = (ep->phase & 0xffff) + (ep->freqm << ep->datainterval); ret = min(phase >> 16, ep->maxframesize); if (avail && ret >= avail) ret = -EAGAIN; else ep->phase = phase; spin_unlock_irqrestore(&ep->lock, flags); return ret; } /* * Return the number of samples to be sent in the next packet * for adaptive and synchronous endpoints */ static int next_packet_size(struct snd_usb_endpoint *ep, unsigned int avail) { unsigned int sample_accum; int ret; if (ep->fill_max) return ep->maxframesize; sample_accum = ep->sample_accum + ep->sample_rem; if (sample_accum >= ep->pps) { sample_accum -= ep->pps; ret = ep->packsize[1]; } else { ret = ep->packsize[0]; } if (avail && ret >= avail) ret = -EAGAIN; else ep->sample_accum = sample_accum; return ret; } /* * snd_usb_endpoint_next_packet_size: Return the number of samples to be sent * in the next packet * * If the size is equal or exceeds @avail, don't proceed but return -EAGAIN * Exception: @avail = 0 for skipping the check. */ int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx, int idx, unsigned int avail) { unsigned int packet; packet = ctx->packet_size[idx]; if (packet) { if (avail && packet >= avail) return -EAGAIN; return packet; } if (ep->sync_source) return slave_next_packet_size(ep, avail); else return next_packet_size(ep, avail); } static void call_retire_callback(struct snd_usb_endpoint *ep, struct urb *urb) { struct snd_usb_substream *data_subs; data_subs = READ_ONCE(ep->data_subs); if (data_subs && ep->retire_data_urb) ep->retire_data_urb(data_subs, urb); } static void retire_outbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { call_retire_callback(ep, urb_ctx->urb); } static void snd_usb_handle_sync_urb(struct snd_usb_endpoint *ep, struct snd_usb_endpoint *sender, const struct urb *urb); static void retire_inbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { struct urb *urb = urb_ctx->urb; struct snd_usb_endpoint *sync_sink; if (unlikely(ep->skip_packets > 0)) { ep->skip_packets--; return; } sync_sink = READ_ONCE(ep->sync_sink); if (sync_sink) snd_usb_handle_sync_urb(sync_sink, ep, urb); call_retire_callback(ep, urb); } static inline bool has_tx_length_quirk(struct snd_usb_audio *chip) { return chip->quirk_flags & QUIRK_FLAG_TX_LENGTH; } static void prepare_silent_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx) { struct urb *urb = ctx->urb; unsigned int offs = 0; unsigned int extra = 0; __le32 packet_length; int i; /* For tx_length_quirk, put packet length at start of packet */ if (has_tx_length_quirk(ep->chip)) extra = sizeof(packet_length); for (i = 0; i < ctx->packets; ++i) { unsigned int offset; unsigned int length; int counts; counts = snd_usb_endpoint_next_packet_size(ep, ctx, i, 0); length = counts * ep->stride; /* number of silent bytes */ offset = offs * ep->stride + extra * i; urb->iso_frame_desc[i].offset = offset; urb->iso_frame_desc[i].length = length + extra; if (extra) { packet_length = cpu_to_le32(length); memcpy(urb->transfer_buffer + offset, &packet_length, sizeof(packet_length)); } memset(urb->transfer_buffer + offset + extra, ep->silence_value, length); offs += counts; } urb->number_of_packets = ctx->packets; urb->transfer_buffer_length = offs * ep->stride + ctx->packets * extra; ctx->queued = 0; } /* * Prepare a PLAYBACK urb for submission to the bus. */ static int prepare_outbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx, bool in_stream_lock) { struct urb *urb = ctx->urb; unsigned char *cp = urb->transfer_buffer; struct snd_usb_substream *data_subs; urb->dev = ep->chip->dev; /* we need to set this at each time */ switch (ep->type) { case SND_USB_ENDPOINT_TYPE_DATA: data_subs = READ_ONCE(ep->data_subs); if (data_subs && ep->prepare_data_urb) return ep->prepare_data_urb(data_subs, urb, in_stream_lock); /* no data provider, so send silence */ prepare_silent_urb(ep, ctx); break; case SND_USB_ENDPOINT_TYPE_SYNC: if (snd_usb_get_speed(ep->chip->dev) >= USB_SPEED_HIGH) { /* * fill the length and offset of each urb descriptor. * the fixed 12.13 frequency is passed as 16.16 through the pipe. */ urb->iso_frame_desc[0].length = 4; urb->iso_frame_desc[0].offset = 0; cp[0] = ep->freqn; cp[1] = ep->freqn >> 8; cp[2] = ep->freqn >> 16; cp[3] = ep->freqn >> 24; } else { /* * fill the length and offset of each urb descriptor. * the fixed 10.14 frequency is passed through the pipe. */ urb->iso_frame_desc[0].length = 3; urb->iso_frame_desc[0].offset = 0; cp[0] = ep->freqn >> 2; cp[1] = ep->freqn >> 10; cp[2] = ep->freqn >> 18; } break; } return 0; } /* * Prepare a CAPTURE or SYNC urb for submission to the bus. */ static int prepare_inbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { int i, offs; struct urb *urb = urb_ctx->urb; urb->dev = ep->chip->dev; /* we need to set this at each time */ switch (ep->type) { case SND_USB_ENDPOINT_TYPE_DATA: offs = 0; for (i = 0; i < urb_ctx->packets; i++) { urb->iso_frame_desc[i].offset = offs; urb->iso_frame_desc[i].length = ep->curpacksize; offs += ep->curpacksize; } urb->transfer_buffer_length = offs; urb->number_of_packets = urb_ctx->packets; break; case SND_USB_ENDPOINT_TYPE_SYNC: urb->iso_frame_desc[0].length = min(4u, ep->syncmaxsize); urb->iso_frame_desc[0].offset = 0; break; } return 0; } /* notify an error as XRUN to the assigned PCM data substream */ static void notify_xrun(struct snd_usb_endpoint *ep) { struct snd_usb_substream *data_subs; data_subs = READ_ONCE(ep->data_subs); if (data_subs && data_subs->pcm_substream) snd_pcm_stop_xrun(data_subs->pcm_substream); } static struct snd_usb_packet_info * next_packet_fifo_enqueue(struct snd_usb_endpoint *ep) { struct snd_usb_packet_info *p; p = ep->next_packet + (ep->next_packet_head + ep->next_packet_queued) % ARRAY_SIZE(ep->next_packet); ep->next_packet_queued++; return p; } static struct snd_usb_packet_info * next_packet_fifo_dequeue(struct snd_usb_endpoint *ep) { struct snd_usb_packet_info *p; p = ep->next_packet + ep->next_packet_head; ep->next_packet_head++; ep->next_packet_head %= ARRAY_SIZE(ep->next_packet); ep->next_packet_queued--; return p; } static void push_back_to_ready_list(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx) { unsigned long flags; spin_lock_irqsave(&ep->lock, flags); list_add_tail(&ctx->ready_list, &ep->ready_playback_urbs); spin_unlock_irqrestore(&ep->lock, flags); } /* * Send output urbs that have been prepared previously. URBs are dequeued * from ep->ready_playback_urbs and in case there aren't any available * or there are no packets that have been prepared, this function does * nothing. * * The reason why the functionality of sending and preparing URBs is separated * is that host controllers don't guarantee the order in which they return * inbound and outbound packets to their submitters. * * This function is used both for implicit feedback endpoints and in low- * latency playback mode. */ int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, bool in_stream_lock) { bool implicit_fb = snd_usb_endpoint_implicit_feedback_sink(ep); while (ep_state_running(ep)) { unsigned long flags; struct snd_usb_packet_info *packet; struct snd_urb_ctx *ctx = NULL; int err, i; spin_lock_irqsave(&ep->lock, flags); if ((!implicit_fb || ep->next_packet_queued > 0) && !list_empty(&ep->ready_playback_urbs)) { /* take URB out of FIFO */ ctx = list_first_entry(&ep->ready_playback_urbs, struct snd_urb_ctx, ready_list); list_del_init(&ctx->ready_list); if (implicit_fb) packet = next_packet_fifo_dequeue(ep); } spin_unlock_irqrestore(&ep->lock, flags); if (ctx == NULL) break; /* copy over the length information */ if (implicit_fb) { for (i = 0; i < packet->packets; i++) ctx->packet_size[i] = packet->packet_size[i]; } /* call the data handler to fill in playback data */ err = prepare_outbound_urb(ep, ctx, in_stream_lock); /* can be stopped during prepare callback */ if (unlikely(!ep_state_running(ep))) break; if (err < 0) { /* push back to ready list again for -EAGAIN */ if (err == -EAGAIN) { push_back_to_ready_list(ep, ctx); break; } if (!in_stream_lock) notify_xrun(ep); return -EPIPE; } if (!atomic_read(&ep->chip->shutdown)) err = usb_submit_urb(ctx->urb, GFP_ATOMIC); else err = -ENODEV; if (err < 0) { if (!atomic_read(&ep->chip->shutdown)) { usb_audio_err(ep->chip, "Unable to submit urb #%d: %d at %s\n", ctx->index, err, __func__); if (!in_stream_lock) notify_xrun(ep); } return -EPIPE; } set_bit(ctx->index, &ep->active_mask); atomic_inc(&ep->submitted_urbs); } return 0; } /* * complete callback for urbs */ static void snd_complete_urb(struct urb *urb) { struct snd_urb_ctx *ctx = urb->context; struct snd_usb_endpoint *ep = ctx->ep; int err; if (unlikely(urb->status == -ENOENT || /* unlinked */ urb->status == -ENODEV || /* device removed */ urb->status == -ECONNRESET || /* unlinked */ urb->status == -ESHUTDOWN)) /* device disabled */ goto exit_clear; /* device disconnected */ if (unlikely(atomic_read(&ep->chip->shutdown))) goto exit_clear; if (unlikely(!ep_state_running(ep))) goto exit_clear; if (usb_pipeout(ep->pipe)) { retire_outbound_urb(ep, ctx); /* can be stopped during retire callback */ if (unlikely(!ep_state_running(ep))) goto exit_clear; /* in low-latency and implicit-feedback modes, push back the * URB to ready list at first, then process as much as possible */ if (ep->lowlatency_playback || snd_usb_endpoint_implicit_feedback_sink(ep)) { push_back_to_ready_list(ep, ctx); clear_bit(ctx->index, &ep->active_mask); snd_usb_queue_pending_output_urbs(ep, false); atomic_dec(&ep->submitted_urbs); /* decrement at last */ return; } /* in non-lowlatency mode, no error handling for prepare */ prepare_outbound_urb(ep, ctx, false); /* can be stopped during prepare callback */ if (unlikely(!ep_state_running(ep))) goto exit_clear; } else { retire_inbound_urb(ep, ctx); /* can be stopped during retire callback */ if (unlikely(!ep_state_running(ep))) goto exit_clear; prepare_inbound_urb(ep, ctx); } if (!atomic_read(&ep->chip->shutdown)) err = usb_submit_urb(urb, GFP_ATOMIC); else err = -ENODEV; if (err == 0) return; if (!atomic_read(&ep->chip->shutdown)) { usb_audio_err(ep->chip, "cannot submit urb (err = %d)\n", err); notify_xrun(ep); } exit_clear: clear_bit(ctx->index, &ep->active_mask); atomic_dec(&ep->submitted_urbs); } /* * Find or create a refcount object for the given interface * * The objects are released altogether in snd_usb_endpoint_free_all() */ static struct snd_usb_iface_ref * iface_ref_find(struct snd_usb_audio *chip, int iface) { struct snd_usb_iface_ref *ip; list_for_each_entry(ip, &chip->iface_ref_list, list) if (ip->iface == iface) return ip; ip = kzalloc(sizeof(*ip), GFP_KERNEL); if (!ip) return NULL; ip->iface = iface; list_add_tail(&ip->list, &chip->iface_ref_list); return ip; } /* Similarly, a refcount object for clock */ static struct snd_usb_clock_ref * clock_ref_find(struct snd_usb_audio *chip, int clock) { struct snd_usb_clock_ref *ref; list_for_each_entry(ref, &chip->clock_ref_list, list) if (ref->clock == clock) return ref; ref = kzalloc(sizeof(*ref), GFP_KERNEL); if (!ref) return NULL; ref->clock = clock; atomic_set(&ref->locked, 0); list_add_tail(&ref->list, &chip->clock_ref_list); return ref; } /* * Get the existing endpoint object corresponding EP * Returns NULL if not present. */ struct snd_usb_endpoint * snd_usb_get_endpoint(struct snd_usb_audio *chip, int ep_num) { struct snd_usb_endpoint *ep; list_for_each_entry(ep, &chip->ep_list, list) { if (ep->ep_num == ep_num) return ep; } return NULL; } #define ep_type_name(type) \ (type == SND_USB_ENDPOINT_TYPE_DATA ? "data" : "sync") /** * snd_usb_add_endpoint: Add an endpoint to an USB audio chip * * @chip: The chip * @ep_num: The number of the endpoint to use * @type: SND_USB_ENDPOINT_TYPE_DATA or SND_USB_ENDPOINT_TYPE_SYNC * * If the requested endpoint has not been added to the given chip before, * a new instance is created. * * Returns zero on success or a negative error code. * * New endpoints will be added to chip->ep_list and freed by * calling snd_usb_endpoint_free_all(). * * For SND_USB_ENDPOINT_TYPE_SYNC, the caller needs to guarantee that * bNumEndpoints > 1 beforehand. */ int snd_usb_add_endpoint(struct snd_usb_audio *chip, int ep_num, int type) { struct snd_usb_endpoint *ep; bool is_playback; ep = snd_usb_get_endpoint(chip, ep_num); if (ep) return 0; usb_audio_dbg(chip, "Creating new %s endpoint #%x\n", ep_type_name(type), ep_num); ep = kzalloc(sizeof(*ep), GFP_KERNEL); if (!ep) return -ENOMEM; ep->chip = chip; spin_lock_init(&ep->lock); ep->type = type; ep->ep_num = ep_num; INIT_LIST_HEAD(&ep->ready_playback_urbs); atomic_set(&ep->submitted_urbs, 0); is_playback = ((ep_num & USB_ENDPOINT_DIR_MASK) == USB_DIR_OUT); ep_num &= USB_ENDPOINT_NUMBER_MASK; if (is_playback) ep->pipe = usb_sndisocpipe(chip->dev, ep_num); else ep->pipe = usb_rcvisocpipe(chip->dev, ep_num); list_add_tail(&ep->list, &chip->ep_list); return 0; } /* Set up syncinterval and maxsyncsize for a sync EP */ static void endpoint_set_syncinterval(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { struct usb_host_interface *alts; struct usb_endpoint_descriptor *desc; alts = snd_usb_get_host_interface(chip, ep->iface, ep->altsetting); if (!alts) return; desc = get_endpoint(alts, ep->ep_idx); if (desc->bLength >= USB_DT_ENDPOINT_AUDIO_SIZE && desc->bRefresh >= 1 && desc->bRefresh <= 9) ep->syncinterval = desc->bRefresh; else if (snd_usb_get_speed(chip->dev) == USB_SPEED_FULL) ep->syncinterval = 1; else if (desc->bInterval >= 1 && desc->bInterval <= 16) ep->syncinterval = desc->bInterval - 1; else ep->syncinterval = 3; ep->syncmaxsize = le16_to_cpu(desc->wMaxPacketSize); } static bool endpoint_compatible(struct snd_usb_endpoint *ep, const struct audioformat *fp, const struct snd_pcm_hw_params *params) { if (!ep->opened) return false; if (ep->cur_audiofmt != fp) return false; if (ep->cur_rate != params_rate(params) || ep->cur_format != params_format(params) || ep->cur_period_frames != params_period_size(params) || ep->cur_buffer_periods != params_periods(params)) return false; return true; } /* * Check whether the given fp and hw params are compatible with the current * setup of the target EP for implicit feedback sync */ bool snd_usb_endpoint_compatible(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep, const struct audioformat *fp, const struct snd_pcm_hw_params *params) { bool ret; mutex_lock(&chip->mutex); ret = endpoint_compatible(ep, fp, params); mutex_unlock(&chip->mutex); return ret; } /* * snd_usb_endpoint_open: Open the endpoint * * Called from hw_params to assign the endpoint to the substream. * It's reference-counted, and only the first opener is allowed to set up * arbitrary parameters. The later opener must be compatible with the * former opened parameters. * The endpoint needs to be closed via snd_usb_endpoint_close() later. * * Note that this function doesn't configure the endpoint. The substream * needs to set it up later via snd_usb_endpoint_set_params() and * snd_usb_endpoint_prepare(). */ struct snd_usb_endpoint * snd_usb_endpoint_open(struct snd_usb_audio *chip, const struct audioformat *fp, const struct snd_pcm_hw_params *params, bool is_sync_ep, bool fixed_rate) { struct snd_usb_endpoint *ep; int ep_num = is_sync_ep ? fp->sync_ep : fp->endpoint; mutex_lock(&chip->mutex); ep = snd_usb_get_endpoint(chip, ep_num); if (!ep) { usb_audio_err(chip, "Cannot find EP 0x%x to open\n", ep_num); goto unlock; } if (!ep->opened) { if (is_sync_ep) { ep->iface = fp->sync_iface; ep->altsetting = fp->sync_altsetting; ep->ep_idx = fp->sync_ep_idx; } else { ep->iface = fp->iface; ep->altsetting = fp->altsetting; ep->ep_idx = fp->ep_idx; } usb_audio_dbg(chip, "Open EP 0x%x, iface=%d:%d, idx=%d\n", ep_num, ep->iface, ep->altsetting, ep->ep_idx); ep->iface_ref = iface_ref_find(chip, ep->iface); if (!ep->iface_ref) { ep = NULL; goto unlock; } if (fp->protocol != UAC_VERSION_1) { ep->clock_ref = clock_ref_find(chip, fp->clock); if (!ep->clock_ref) { ep = NULL; goto unlock; } ep->clock_ref->opened++; } ep->cur_audiofmt = fp; ep->cur_channels = fp->channels; ep->cur_rate = params_rate(params); ep->cur_format = params_format(params); ep->cur_frame_bytes = snd_pcm_format_physical_width(ep->cur_format) * ep->cur_channels / 8; ep->cur_period_frames = params_period_size(params); ep->cur_period_bytes = ep->cur_period_frames * ep->cur_frame_bytes; ep->cur_buffer_periods = params_periods(params); if (ep->type == SND_USB_ENDPOINT_TYPE_SYNC) endpoint_set_syncinterval(chip, ep); ep->implicit_fb_sync = fp->implicit_fb; ep->need_setup = true; ep->need_prepare = true; ep->fixed_rate = fixed_rate; usb_audio_dbg(chip, " channels=%d, rate=%d, format=%s, period_bytes=%d, periods=%d, implicit_fb=%d\n", ep->cur_channels, ep->cur_rate, snd_pcm_format_name(ep->cur_format), ep->cur_period_bytes, ep->cur_buffer_periods, ep->implicit_fb_sync); } else { if (WARN_ON(!ep->iface_ref)) { ep = NULL; goto unlock; } if (!endpoint_compatible(ep, fp, params)) { usb_audio_err(chip, "Incompatible EP setup for 0x%x\n", ep_num); ep = NULL; goto unlock; } usb_audio_dbg(chip, "Reopened EP 0x%x (count %d)\n", ep_num, ep->opened); } if (!ep->iface_ref->opened++) ep->iface_ref->need_setup = true; ep->opened++; unlock: mutex_unlock(&chip->mutex); return ep; } /* * snd_usb_endpoint_set_sync: Link data and sync endpoints * * Pass NULL to sync_ep to unlink again */ void snd_usb_endpoint_set_sync(struct snd_usb_audio *chip, struct snd_usb_endpoint *data_ep, struct snd_usb_endpoint *sync_ep) { data_ep->sync_source = sync_ep; } /* * Set data endpoint callbacks and the assigned data stream * * Called at PCM trigger and cleanups. * Pass NULL to deactivate each callback. */ void snd_usb_endpoint_set_callback(struct snd_usb_endpoint *ep, int (*prepare)(struct snd_usb_substream *subs, struct urb *urb, bool in_stream_lock), void (*retire)(struct snd_usb_substream *subs, struct urb *urb), struct snd_usb_substream *data_subs) { ep->prepare_data_urb = prepare; ep->retire_data_urb = retire; if (data_subs) ep->lowlatency_playback = data_subs->lowlatency_playback; else ep->lowlatency_playback = false; WRITE_ONCE(ep->data_subs, data_subs); } static int endpoint_set_interface(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep, bool set) { int altset = set ? ep->altsetting : 0; int err; if (ep->iface_ref->altset == altset) return 0; usb_audio_dbg(chip, "Setting usb interface %d:%d for EP 0x%x\n", ep->iface, altset, ep->ep_num); err = usb_set_interface(chip->dev, ep->iface, altset); if (err < 0) { usb_audio_err_ratelimited( chip, "%d:%d: usb_set_interface failed (%d)\n", ep->iface, altset, err); return err; } if (chip->quirk_flags & QUIRK_FLAG_IFACE_DELAY) msleep(50); ep->iface_ref->altset = altset; return 0; } /* * snd_usb_endpoint_close: Close the endpoint * * Unreference the already opened endpoint via snd_usb_endpoint_open(). */ void snd_usb_endpoint_close(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { mutex_lock(&chip->mutex); usb_audio_dbg(chip, "Closing EP 0x%x (count %d)\n", ep->ep_num, ep->opened); if (!--ep->iface_ref->opened && !(chip->quirk_flags & QUIRK_FLAG_IFACE_SKIP_CLOSE)) endpoint_set_interface(chip, ep, false); if (!--ep->opened) { if (ep->clock_ref) { if (!--ep->clock_ref->opened) ep->clock_ref->rate = 0; } ep->iface = 0; ep->altsetting = 0; ep->cur_audiofmt = NULL; ep->cur_rate = 0; ep->iface_ref = NULL; ep->clock_ref = NULL; usb_audio_dbg(chip, "EP 0x%x closed\n", ep->ep_num); } mutex_unlock(&chip->mutex); } /* Prepare for suspening EP, called from the main suspend handler */ void snd_usb_endpoint_suspend(struct snd_usb_endpoint *ep) { ep->need_prepare = true; if (ep->iface_ref) ep->iface_ref->need_setup = true; if (ep->clock_ref) ep->clock_ref->rate = 0; } /* * wait until all urbs are processed. */ static int wait_clear_urbs(struct snd_usb_endpoint *ep) { unsigned long end_time = jiffies + msecs_to_jiffies(1000); int alive; if (atomic_read(&ep->state) != EP_STATE_STOPPING) return 0; do { alive = atomic_read(&ep->submitted_urbs); if (!alive) break; schedule_timeout_uninterruptible(1); } while (time_before(jiffies, end_time)); if (alive) usb_audio_err(ep->chip, "timeout: still %d active urbs on EP #%x\n", alive, ep->ep_num); if (ep_state_update(ep, EP_STATE_STOPPING, EP_STATE_STOPPED)) { ep->sync_sink = NULL; snd_usb_endpoint_set_callback(ep, NULL, NULL, NULL); } return 0; } /* sync the pending stop operation; * this function itself doesn't trigger the stop operation */ void snd_usb_endpoint_sync_pending_stop(struct snd_usb_endpoint *ep) { if (ep) wait_clear_urbs(ep); } /* * Stop active urbs * * This function moves the EP to STOPPING state if it's being RUNNING. */ static int stop_urbs(struct snd_usb_endpoint *ep, bool force, bool keep_pending) { unsigned int i; unsigned long flags; if (!force && atomic_read(&ep->running)) return -EBUSY; if (!ep_state_update(ep, EP_STATE_RUNNING, EP_STATE_STOPPING)) return 0; spin_lock_irqsave(&ep->lock, flags); INIT_LIST_HEAD(&ep->ready_playback_urbs); ep->next_packet_head = 0; ep->next_packet_queued = 0; spin_unlock_irqrestore(&ep->lock, flags); if (keep_pending) return 0; for (i = 0; i < ep->nurbs; i++) { if (test_bit(i, &ep->active_mask)) { if (!test_and_set_bit(i, &ep->unlink_mask)) { struct urb *u = ep->urb[i].urb; usb_unlink_urb(u); } } } return 0; } /* * release an endpoint's urbs */ static int release_urbs(struct snd_usb_endpoint *ep, bool force) { int i, err; /* route incoming urbs to nirvana */ snd_usb_endpoint_set_callback(ep, NULL, NULL, NULL); /* stop and unlink urbs */ err = stop_urbs(ep, force, false); if (err) return err; wait_clear_urbs(ep); for (i = 0; i < ep->nurbs; i++) release_urb_ctx(&ep->urb[i]); usb_free_coherent(ep->chip->dev, SYNC_URBS * 4, ep->syncbuf, ep->sync_dma); ep->syncbuf = NULL; ep->nurbs = 0; return 0; } /* * configure a data endpoint */ static int data_ep_set_params(struct snd_usb_endpoint *ep) { struct snd_usb_audio *chip = ep->chip; unsigned int maxsize, minsize, packs_per_ms, max_packs_per_urb; unsigned int max_packs_per_period, urbs_per_period, urb_packs; unsigned int max_urbs, i; const struct audioformat *fmt = ep->cur_audiofmt; int frame_bits = ep->cur_frame_bytes * 8; int tx_length_quirk = (has_tx_length_quirk(chip) && usb_pipeout(ep->pipe)); usb_audio_dbg(chip, "Setting params for data EP 0x%x, pipe 0x%x\n", ep->ep_num, ep->pipe); if (ep->cur_format == SNDRV_PCM_FORMAT_DSD_U16_LE && fmt->dsd_dop) { /* * When operating in DSD DOP mode, the size of a sample frame * in hardware differs from the actual physical format width * because we need to make room for the DOP markers. */ frame_bits += ep->cur_channels << 3; } ep->datainterval = fmt->datainterval; ep->stride = frame_bits >> 3; switch (ep->cur_format) { case SNDRV_PCM_FORMAT_U8: ep->silence_value = 0x80; break; case SNDRV_PCM_FORMAT_DSD_U8: case SNDRV_PCM_FORMAT_DSD_U16_LE: case SNDRV_PCM_FORMAT_DSD_U32_LE: case SNDRV_PCM_FORMAT_DSD_U16_BE: case SNDRV_PCM_FORMAT_DSD_U32_BE: ep->silence_value = 0x69; break; default: ep->silence_value = 0; } /* assume max. frequency is 50% higher than nominal */ ep->freqmax = ep->freqn + (ep->freqn >> 1); /* Round up freqmax to nearest integer in order to calculate maximum * packet size, which must represent a whole number of frames. * This is accomplished by adding 0x0.ffff before converting the * Q16.16 format into integer. * In order to accurately calculate the maximum packet size when * the data interval is more than 1 (i.e. ep->datainterval > 0), * multiply by the data interval prior to rounding. For instance, * a freqmax of 41 kHz will result in a max packet size of 6 (5.125) * frames with a data interval of 1, but 11 (10.25) frames with a * data interval of 2. * (ep->freqmax << ep->datainterval overflows at 8.192 MHz for the * maximum datainterval value of 3, at USB full speed, higher for * USB high speed, noting that ep->freqmax is in units of * frames per packet in Q16.16 format.) */ maxsize = (((ep->freqmax << ep->datainterval) + 0xffff) >> 16) * (frame_bits >> 3); if (tx_length_quirk) maxsize += sizeof(__le32); /* Space for length descriptor */ /* but wMaxPacketSize might reduce this */ if (ep->maxpacksize && ep->maxpacksize < maxsize) { /* whatever fits into a max. size packet */ unsigned int data_maxsize = maxsize = ep->maxpacksize; if (tx_length_quirk) /* Need to remove the length descriptor to calc freq */ data_maxsize -= sizeof(__le32); ep->freqmax = (data_maxsize / (frame_bits >> 3)) << (16 - ep->datainterval); } if (ep->fill_max) ep->curpacksize = ep->maxpacksize; else ep->curpacksize = maxsize; if (snd_usb_get_speed(chip->dev) != USB_SPEED_FULL) { packs_per_ms = 8 >> ep->datainterval; max_packs_per_urb = MAX_PACKS_HS; } else { packs_per_ms = 1; max_packs_per_urb = MAX_PACKS; } if (ep->sync_source && !ep->implicit_fb_sync) max_packs_per_urb = min(max_packs_per_urb, 1U << ep->sync_source->syncinterval); max_packs_per_urb = max(1u, max_packs_per_urb >> ep->datainterval); /* * Capture endpoints need to use small URBs because there's no way * to tell in advance where the next period will end, and we don't * want the next URB to complete much after the period ends. * * Playback endpoints with implicit sync much use the same parameters * as their corresponding capture endpoint. */ if (usb_pipein(ep->pipe) || ep->implicit_fb_sync) { /* make capture URBs <= 1 ms and smaller than a period */ urb_packs = min(max_packs_per_urb, packs_per_ms); while (urb_packs > 1 && urb_packs * maxsize >= ep->cur_period_bytes) urb_packs >>= 1; ep->nurbs = MAX_URBS; /* * Playback endpoints without implicit sync are adjusted so that * a period fits as evenly as possible in the smallest number of * URBs. The total number of URBs is adjusted to the size of the * ALSA buffer, subject to the MAX_URBS and MAX_QUEUE limits. */ } else { /* determine how small a packet can be */ minsize = (ep->freqn >> (16 - ep->datainterval)) * (frame_bits >> 3); /* with sync from device, assume it can be 12% lower */ if (ep->sync_source) minsize -= minsize >> 3; minsize = max(minsize, 1u); /* how many packets will contain an entire ALSA period? */ max_packs_per_period = DIV_ROUND_UP(ep->cur_period_bytes, minsize); /* how many URBs will contain a period? */ urbs_per_period = DIV_ROUND_UP(max_packs_per_period, max_packs_per_urb); /* how many packets are needed in each URB? */ urb_packs = DIV_ROUND_UP(max_packs_per_period, urbs_per_period); /* limit the number of frames in a single URB */ ep->max_urb_frames = DIV_ROUND_UP(ep->cur_period_frames, urbs_per_period); /* try to use enough URBs to contain an entire ALSA buffer */ max_urbs = min((unsigned) MAX_URBS, MAX_QUEUE * packs_per_ms / urb_packs); ep->nurbs = min(max_urbs, urbs_per_period * ep->cur_buffer_periods); } /* allocate and initialize data urbs */ for (i = 0; i < ep->nurbs; i++) { struct snd_urb_ctx *u = &ep->urb[i]; u->index = i; u->ep = ep; u->packets = urb_packs; u->buffer_size = maxsize * u->packets; if (fmt->fmt_type == UAC_FORMAT_TYPE_II) u->packets++; /* for transfer delimiter */ u->urb = usb_alloc_urb(u->packets, GFP_KERNEL); if (!u->urb) goto out_of_memory; u->urb->transfer_buffer = usb_alloc_coherent(chip->dev, u->buffer_size, GFP_KERNEL, &u->urb->transfer_dma); if (!u->urb->transfer_buffer) goto out_of_memory; u->urb->pipe = ep->pipe; u->urb->transfer_flags = URB_NO_TRANSFER_DMA_MAP; u->urb->interval = 1 << ep->datainterval; u->urb->context = u; u->urb->complete = snd_complete_urb; INIT_LIST_HEAD(&u->ready_list); } return 0; out_of_memory: release_urbs(ep, false); return -ENOMEM; } /* * configure a sync endpoint */ static int sync_ep_set_params(struct snd_usb_endpoint *ep) { struct snd_usb_audio *chip = ep->chip; int i; usb_audio_dbg(chip, "Setting params for sync EP 0x%x, pipe 0x%x\n", ep->ep_num, ep->pipe); ep->syncbuf = usb_alloc_coherent(chip->dev, SYNC_URBS * 4, GFP_KERNEL, &ep->sync_dma); if (!ep->syncbuf) return -ENOMEM; ep->nurbs = SYNC_URBS; for (i = 0; i < SYNC_URBS; i++) { struct snd_urb_ctx *u = &ep->urb[i]; u->index = i; u->ep = ep; u->packets = 1; u->urb = usb_alloc_urb(1, GFP_KERNEL); if (!u->urb) goto out_of_memory; u->urb->transfer_buffer = ep->syncbuf + i * 4; u->urb->transfer_dma = ep->sync_dma + i * 4; u->urb->transfer_buffer_length = 4; u->urb->pipe = ep->pipe; u->urb->transfer_flags = URB_NO_TRANSFER_DMA_MAP; u->urb->number_of_packets = 1; u->urb->interval = 1 << ep->syncinterval; u->urb->context = u; u->urb->complete = snd_complete_urb; } return 0; out_of_memory: release_urbs(ep, false); return -ENOMEM; } /* update the rate of the referred clock; return the actual rate */ static int update_clock_ref_rate(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { struct snd_usb_clock_ref *clock = ep->clock_ref; int rate = ep->cur_rate; if (!clock || clock->rate == rate) return rate; if (clock->rate) { if (atomic_read(&clock->locked)) return clock->rate; if (clock->rate != rate) { usb_audio_err(chip, "Mismatched sample rate %d vs %d for EP 0x%x\n", clock->rate, rate, ep->ep_num); return clock->rate; } } clock->rate = rate; clock->need_setup = true; return rate; } /* * snd_usb_endpoint_set_params: configure an snd_usb_endpoint * * It's called either from hw_params callback. * Determine the number of URBs to be used on this endpoint. * An endpoint must be configured before it can be started. * An endpoint that is already running can not be reconfigured. */ int snd_usb_endpoint_set_params(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { const struct audioformat *fmt = ep->cur_audiofmt; int err = 0; mutex_lock(&chip->mutex); if (!ep->need_setup) goto unlock; /* release old buffers, if any */ err = release_urbs(ep, false); if (err < 0) goto unlock; ep->datainterval = fmt->datainterval; ep->maxpacksize = fmt->maxpacksize; ep->fill_max = !!(fmt->attributes & UAC_EP_CS_ATTR_FILL_MAX); if (snd_usb_get_speed(chip->dev) == USB_SPEED_FULL) { ep->freqn = get_usb_full_speed_rate(ep->cur_rate); ep->pps = 1000 >> ep->datainterval; } else { ep->freqn = get_usb_high_speed_rate(ep->cur_rate); ep->pps = 8000 >> ep->datainterval; } ep->sample_rem = ep->cur_rate % ep->pps; ep->packsize[0] = ep->cur_rate / ep->pps; ep->packsize[1] = (ep->cur_rate + (ep->pps - 1)) / ep->pps; /* calculate the frequency in 16.16 format */ ep->freqm = ep->freqn; ep->freqshift = INT_MIN; ep->phase = 0; switch (ep->type) { case SND_USB_ENDPOINT_TYPE_DATA: err = data_ep_set_params(ep); break; case SND_USB_ENDPOINT_TYPE_SYNC: err = sync_ep_set_params(ep); break; default: err = -EINVAL; } usb_audio_dbg(chip, "Set up %d URBS, ret=%d\n", ep->nurbs, err); if (err < 0) goto unlock; /* some unit conversions in runtime */ ep->maxframesize = ep->maxpacksize / ep->cur_frame_bytes; ep->curframesize = ep->curpacksize / ep->cur_frame_bytes; err = update_clock_ref_rate(chip, ep); if (err >= 0) { ep->need_setup = false; err = 0; } unlock: mutex_unlock(&chip->mutex); return err; } static int init_sample_rate(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { struct snd_usb_clock_ref *clock = ep->clock_ref; int rate, err; rate = update_clock_ref_rate(chip, ep); if (rate < 0) return rate; if (clock && !clock->need_setup) return 0; if (!ep->fixed_rate) { err = snd_usb_init_sample_rate(chip, ep->cur_audiofmt, rate); if (err < 0) { if (clock) clock->rate = 0; /* reset rate */ return err; } } if (clock) clock->need_setup = false; return 0; } /* * snd_usb_endpoint_prepare: Prepare the endpoint * * This function sets up the EP to be fully usable state. * It's called either from prepare callback. * The function checks need_setup flag, and performs nothing unless needed, * so it's safe to call this multiple times. * * This returns zero if unchanged, 1 if the configuration has changed, * or a negative error code. */ int snd_usb_endpoint_prepare(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { bool iface_first; int err = 0; mutex_lock(&chip->mutex); if (WARN_ON(!ep->iface_ref)) goto unlock; if (!ep->need_prepare) goto unlock; /* If the interface has been already set up, just set EP parameters */ if (!ep->iface_ref->need_setup) { /* sample rate setup of UAC1 is per endpoint, and we need * to update at each EP configuration */ if (ep->cur_audiofmt->protocol == UAC_VERSION_1) { err = init_sample_rate(chip, ep); if (err < 0) goto unlock; } goto done; } /* Need to deselect altsetting at first */ endpoint_set_interface(chip, ep, false); /* Some UAC1 devices (e.g. Yamaha THR10) need the host interface * to be set up before parameter setups */ iface_first = ep->cur_audiofmt->protocol == UAC_VERSION_1; /* Workaround for devices that require the interface setup at first like UAC1 */ if (chip->quirk_flags & QUIRK_FLAG_SET_IFACE_FIRST) iface_first = true; if (iface_first) { err = endpoint_set_interface(chip, ep, true); if (err < 0) goto unlock; } err = snd_usb_init_pitch(chip, ep->cur_audiofmt); if (err < 0) goto unlock; err = init_sample_rate(chip, ep); if (err < 0) goto unlock; err = snd_usb_select_mode_quirk(chip, ep->cur_audiofmt); if (err < 0) goto unlock; /* for UAC2/3, enable the interface altset here at last */ if (!iface_first) { err = endpoint_set_interface(chip, ep, true); if (err < 0) goto unlock; } ep->iface_ref->need_setup = false; done: ep->need_prepare = false; err = 1; unlock: mutex_unlock(&chip->mutex); return err; } /* get the current rate set to the given clock by any endpoint */ int snd_usb_endpoint_get_clock_rate(struct snd_usb_audio *chip, int clock) { struct snd_usb_clock_ref *ref; int rate = 0; if (!clock) return 0; mutex_lock(&chip->mutex); list_for_each_entry(ref, &chip->clock_ref_list, list) { if (ref->clock == clock) { rate = ref->rate; break; } } mutex_unlock(&chip->mutex); return rate; } /** * snd_usb_endpoint_start: start an snd_usb_endpoint * * @ep: the endpoint to start * * A call to this function will increment the running count of the endpoint. * In case it is not already running, the URBs for this endpoint will be * submitted. Otherwise, this function does nothing. * * Must be balanced to calls of snd_usb_endpoint_stop(). * * Returns an error if the URB submission failed, 0 in all other cases. */ int snd_usb_endpoint_start(struct snd_usb_endpoint *ep) { bool is_playback = usb_pipeout(ep->pipe); int err; unsigned int i; if (atomic_read(&ep->chip->shutdown)) return -EBADFD; if (ep->sync_source) WRITE_ONCE(ep->sync_source->sync_sink, ep); usb_audio_dbg(ep->chip, "Starting %s EP 0x%x (running %d)\n", ep_type_name(ep->type), ep->ep_num, atomic_read(&ep->running)); /* already running? */ if (atomic_inc_return(&ep->running) != 1) return 0; if (ep->clock_ref) atomic_inc(&ep->clock_ref->locked); ep->active_mask = 0; ep->unlink_mask = 0; ep->phase = 0; ep->sample_accum = 0; snd_usb_endpoint_start_quirk(ep); /* * If this endpoint has a data endpoint as implicit feedback source, * don't start the urbs here. Instead, mark them all as available, * wait for the record urbs to return and queue the playback urbs * from that context. */ if (!ep_state_update(ep, EP_STATE_STOPPED, EP_STATE_RUNNING)) goto __error; if (snd_usb_endpoint_implicit_feedback_sink(ep) && !(ep->chip->quirk_flags & QUIRK_FLAG_PLAYBACK_FIRST)) { usb_audio_dbg(ep->chip, "No URB submission due to implicit fb sync\n"); i = 0; goto fill_rest; } for (i = 0; i < ep->nurbs; i++) { struct urb *urb = ep->urb[i].urb; if (snd_BUG_ON(!urb)) goto __error; if (is_playback) err = prepare_outbound_urb(ep, urb->context, true); else err = prepare_inbound_urb(ep, urb->context); if (err < 0) { /* stop filling at applptr */ if (err == -EAGAIN) break; usb_audio_dbg(ep->chip, "EP 0x%x: failed to prepare urb: %d\n", ep->ep_num, err); goto __error; } if (!atomic_read(&ep->chip->shutdown)) err = usb_submit_urb(urb, GFP_ATOMIC); else err = -ENODEV; if (err < 0) { if (!atomic_read(&ep->chip->shutdown)) usb_audio_err(ep->chip, "cannot submit urb %d, error %d: %s\n", i, err, usb_error_string(err)); goto __error; } set_bit(i, &ep->active_mask); atomic_inc(&ep->submitted_urbs); } if (!i) { usb_audio_dbg(ep->chip, "XRUN at starting EP 0x%x\n", ep->ep_num); goto __error; } usb_audio_dbg(ep->chip, "%d URBs submitted for EP 0x%x\n", i, ep->ep_num); fill_rest: /* put the remaining URBs to ready list */ if (is_playback) { for (; i < ep->nurbs; i++) push_back_to_ready_list(ep, ep->urb + i); } return 0; __error: snd_usb_endpoint_stop(ep, false); return -EPIPE; } /** * snd_usb_endpoint_stop: stop an snd_usb_endpoint * * @ep: the endpoint to stop (may be NULL) * @keep_pending: keep in-flight URBs * * A call to this function will decrement the running count of the endpoint. * In case the last user has requested the endpoint stop, the URBs will * actually be deactivated. * * Must be balanced to calls of snd_usb_endpoint_start(). * * The caller needs to synchronize the pending stop operation via * snd_usb_endpoint_sync_pending_stop(). */ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep, bool keep_pending) { if (!ep) return; usb_audio_dbg(ep->chip, "Stopping %s EP 0x%x (running %d)\n", ep_type_name(ep->type), ep->ep_num, atomic_read(&ep->running)); if (snd_BUG_ON(!atomic_read(&ep->running))) return; if (!atomic_dec_return(&ep->running)) { if (ep->sync_source) WRITE_ONCE(ep->sync_source->sync_sink, NULL); stop_urbs(ep, false, keep_pending); if (ep->clock_ref) atomic_dec(&ep->clock_ref->locked); if (ep->chip->quirk_flags & QUIRK_FLAG_FORCE_IFACE_RESET && usb_pipeout(ep->pipe)) { ep->need_prepare = true; if (ep->iface_ref) ep->iface_ref->need_setup = true; } } } /** * snd_usb_endpoint_release: Tear down an snd_usb_endpoint * * @ep: the endpoint to release * * This function does not care for the endpoint's running count but will tear * down all the streaming URBs immediately. */ void snd_usb_endpoint_release(struct snd_usb_endpoint *ep) { release_urbs(ep, true); } /** * snd_usb_endpoint_free_all: Free the resources of an snd_usb_endpoint * @chip: The chip * * This free all endpoints and those resources */ void snd_usb_endpoint_free_all(struct snd_usb_audio *chip) { struct snd_usb_endpoint *ep, *en; struct snd_usb_iface_ref *ip, *in; struct snd_usb_clock_ref *cp, *cn; list_for_each_entry_safe(ep, en, &chip->ep_list, list) kfree(ep); list_for_each_entry_safe(ip, in, &chip->iface_ref_list, list) kfree(ip); list_for_each_entry_safe(cp, cn, &chip->clock_ref_list, list) kfree(cp); } /* * snd_usb_handle_sync_urb: parse an USB sync packet * * @ep: the endpoint to handle the packet * @sender: the sending endpoint * @urb: the received packet * * This function is called from the context of an endpoint that received * the packet and is used to let another endpoint object handle the payload. */ static void snd_usb_handle_sync_urb(struct snd_usb_endpoint *ep, struct snd_usb_endpoint *sender, const struct urb *urb) { int shift; unsigned int f; unsigned long flags; snd_BUG_ON(ep == sender); /* * In case the endpoint is operating in implicit feedback mode, prepare * a new outbound URB that has the same layout as the received packet * and add it to the list of pending urbs. queue_pending_output_urbs() * will take care of them later. */ if (snd_usb_endpoint_implicit_feedback_sink(ep) && atomic_read(&ep->running)) { /* implicit feedback case */ int i, bytes = 0; struct snd_urb_ctx *in_ctx; struct snd_usb_packet_info *out_packet; in_ctx = urb->context; /* Count overall packet size */ for (i = 0; i < in_ctx->packets; i++) if (urb->iso_frame_desc[i].status == 0) bytes += urb->iso_frame_desc[i].actual_length; /* * skip empty packets. At least M-Audio's Fast Track Ultra stops * streaming once it received a 0-byte OUT URB */ if (bytes == 0) return; spin_lock_irqsave(&ep->lock, flags); if (ep->next_packet_queued >= ARRAY_SIZE(ep->next_packet)) { spin_unlock_irqrestore(&ep->lock, flags); usb_audio_err(ep->chip, "next package FIFO overflow EP 0x%x\n", ep->ep_num); notify_xrun(ep); return; } out_packet = next_packet_fifo_enqueue(ep); /* * Iterate through the inbound packet and prepare the lengths * for the output packet. The OUT packet we are about to send * will have the same amount of payload bytes per stride as the * IN packet we just received. Since the actual size is scaled * by the stride, use the sender stride to calculate the length * in case the number of channels differ between the implicitly * fed-back endpoint and the synchronizing endpoint. */ out_packet->packets = in_ctx->packets; for (i = 0; i < in_ctx->packets; i++) { if (urb->iso_frame_desc[i].status == 0) out_packet->packet_size[i] = urb->iso_frame_desc[i].actual_length / sender->stride; else out_packet->packet_size[i] = 0; } spin_unlock_irqrestore(&ep->lock, flags); snd_usb_queue_pending_output_urbs(ep, false); return; } /* * process after playback sync complete * * Full speed devices report feedback values in 10.14 format as samples * per frame, high speed devices in 16.16 format as samples per * microframe. * * Because the Audio Class 1 spec was written before USB 2.0, many high * speed devices use a wrong interpretation, some others use an * entirely different format. * * Therefore, we cannot predict what format any particular device uses * and must detect it automatically. */ if (urb->iso_frame_desc[0].status != 0 || urb->iso_frame_desc[0].actual_length < 3) return; f = le32_to_cpup(urb->transfer_buffer); if (urb->iso_frame_desc[0].actual_length == 3) f &= 0x00ffffff; else f &= 0x0fffffff; if (f == 0) return; if (unlikely(sender->tenor_fb_quirk)) { /* * Devices based on Tenor 8802 chipsets (TEAC UD-H01 * and others) sometimes change the feedback value * by +/- 0x1.0000. */ if (f < ep->freqn - 0x8000) f += 0xf000; else if (f > ep->freqn + 0x8000) f -= 0xf000; } else if (unlikely(ep->freqshift == INT_MIN)) { /* * The first time we see a feedback value, determine its format * by shifting it left or right until it matches the nominal * frequency value. This assumes that the feedback does not * differ from the nominal value more than +50% or -25%. */ shift = 0; while (f < ep->freqn - ep->freqn / 4) { f <<= 1; shift++; } while (f > ep->freqn + ep->freqn / 2) { f >>= 1; shift--; } ep->freqshift = shift; } else if (ep->freqshift >= 0) f <<= ep->freqshift; else f >>= -ep->freqshift; if (likely(f >= ep->freqn - ep->freqn / 8 && f <= ep->freqmax)) { /* * If the frequency looks valid, set it. * This value is referred to in prepare_playback_urb(). */ spin_lock_irqsave(&ep->lock, flags); ep->freqm = f; spin_unlock_irqrestore(&ep->lock, flags); } else { /* * Out of range; maybe the shift value is wrong. * Reset it so that we autodetect again the next time. */ ep->freqshift = INT_MIN; } } |
8 4 7 8 206 213 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 | // SPDX-License-Identifier: GPL-2.0-only /* * async.c: Asynchronous function calls for boot performance * * (C) Copyright 2009 Intel Corporation * Author: Arjan van de Ven <arjan@linux.intel.com> */ /* Goals and Theory of Operation The primary goal of this feature is to reduce the kernel boot time, by doing various independent hardware delays and discovery operations decoupled and not strictly serialized. More specifically, the asynchronous function call concept allows certain operations (primarily during system boot) to happen asynchronously, out of order, while these operations still have their externally visible parts happen sequentially and in-order. (not unlike how out-of-order CPUs retire their instructions in order) Key to the asynchronous function call implementation is the concept of a "sequence cookie" (which, although it has an abstracted type, can be thought of as a monotonically incrementing number). The async core will assign each scheduled event such a sequence cookie and pass this to the called functions. The asynchronously called function should before doing a globally visible operation, such as registering device numbers, call the async_synchronize_cookie() function and pass in its own cookie. The async_synchronize_cookie() function will make sure that all asynchronous operations that were scheduled prior to the operation corresponding with the cookie have completed. Subsystem/driver initialization code that scheduled asynchronous probe functions, but which shares global resources with other drivers/subsystems that do not use the asynchronous call feature, need to do a full synchronization with the async_synchronize_full() function, before returning from their init function. This is to maintain strict ordering between the asynchronous and synchronous parts of the kernel. */ #include <linux/async.h> #include <linux/atomic.h> #include <linux/export.h> #include <linux/ktime.h> #include <linux/pid.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/wait.h> #include <linux/workqueue.h> #include "workqueue_internal.h" static async_cookie_t next_cookie = 1; #define MAX_WORK 32768 #define ASYNC_COOKIE_MAX ULLONG_MAX /* infinity cookie */ static LIST_HEAD(async_global_pending); /* pending from all registered doms */ static ASYNC_DOMAIN(async_dfl_domain); static DEFINE_SPINLOCK(async_lock); static struct workqueue_struct *async_wq; struct async_entry { struct list_head domain_list; struct list_head global_list; struct work_struct work; async_cookie_t cookie; async_func_t func; void *data; struct async_domain *domain; }; static DECLARE_WAIT_QUEUE_HEAD(async_done); static atomic_t entry_count; static long long microseconds_since(ktime_t start) { ktime_t now = ktime_get(); return ktime_to_ns(ktime_sub(now, start)) >> 10; } static async_cookie_t lowest_in_progress(struct async_domain *domain) { struct async_entry *first = NULL; async_cookie_t ret = ASYNC_COOKIE_MAX; unsigned long flags; spin_lock_irqsave(&async_lock, flags); if (domain) { if (!list_empty(&domain->pending)) first = list_first_entry(&domain->pending, struct async_entry, domain_list); } else { if (!list_empty(&async_global_pending)) first = list_first_entry(&async_global_pending, struct async_entry, global_list); } if (first) ret = first->cookie; spin_unlock_irqrestore(&async_lock, flags); return ret; } /* * pick the first pending entry and run it */ static void async_run_entry_fn(struct work_struct *work) { struct async_entry *entry = container_of(work, struct async_entry, work); unsigned long flags; ktime_t calltime; /* 1) run (and print duration) */ pr_debug("calling %lli_%pS @ %i\n", (long long)entry->cookie, entry->func, task_pid_nr(current)); calltime = ktime_get(); entry->func(entry->data, entry->cookie); pr_debug("initcall %lli_%pS returned after %lld usecs\n", (long long)entry->cookie, entry->func, microseconds_since(calltime)); /* 2) remove self from the pending queues */ spin_lock_irqsave(&async_lock, flags); list_del_init(&entry->domain_list); list_del_init(&entry->global_list); /* 3) free the entry */ kfree(entry); atomic_dec(&entry_count); spin_unlock_irqrestore(&async_lock, flags); /* 4) wake up any waiters */ wake_up(&async_done); } static async_cookie_t __async_schedule_node_domain(async_func_t func, void *data, int node, struct async_domain *domain, struct async_entry *entry) { async_cookie_t newcookie; unsigned long flags; INIT_LIST_HEAD(&entry->domain_list); INIT_LIST_HEAD(&entry->global_list); INIT_WORK(&entry->work, async_run_entry_fn); entry->func = func; entry->data = data; entry->domain = domain; spin_lock_irqsave(&async_lock, flags); /* allocate cookie and queue */ newcookie = entry->cookie = next_cookie++; list_add_tail(&entry->domain_list, &domain->pending); if (domain->registered) list_add_tail(&entry->global_list, &async_global_pending); atomic_inc(&entry_count); spin_unlock_irqrestore(&async_lock, flags); /* schedule for execution */ queue_work_node(node, async_wq, &entry->work); return newcookie; } /** * async_schedule_node_domain - NUMA specific version of async_schedule_domain * @func: function to execute asynchronously * @data: data pointer to pass to the function * @node: NUMA node that we want to schedule this on or close to * @domain: the domain * * Returns an async_cookie_t that may be used for checkpointing later. * @domain may be used in the async_synchronize_*_domain() functions to * wait within a certain synchronization domain rather than globally. * * Note: This function may be called from atomic or non-atomic contexts. * * The node requested will be honored on a best effort basis. If the node * has no CPUs associated with it then the work is distributed among all * available CPUs. */ async_cookie_t async_schedule_node_domain(async_func_t func, void *data, int node, struct async_domain *domain) { struct async_entry *entry; unsigned long flags; async_cookie_t newcookie; /* allow irq-off callers */ entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC); /* * If we're out of memory or if there's too much work * pending already, we execute synchronously. */ if (!entry || atomic_read(&entry_count) > MAX_WORK) { kfree(entry); spin_lock_irqsave(&async_lock, flags); newcookie = next_cookie++; spin_unlock_irqrestore(&async_lock, flags); /* low on memory.. run synchronously */ func(data, newcookie); return newcookie; } return __async_schedule_node_domain(func, data, node, domain, entry); } EXPORT_SYMBOL_GPL(async_schedule_node_domain); /** * async_schedule_node - NUMA specific version of async_schedule * @func: function to execute asynchronously * @data: data pointer to pass to the function * @node: NUMA node that we want to schedule this on or close to * * Returns an async_cookie_t that may be used for checkpointing later. * Note: This function may be called from atomic or non-atomic contexts. * * The node requested will be honored on a best effort basis. If the node * has no CPUs associated with it then the work is distributed among all * available CPUs. */ async_cookie_t async_schedule_node(async_func_t func, void *data, int node) { return async_schedule_node_domain(func, data, node, &async_dfl_domain); } EXPORT_SYMBOL_GPL(async_schedule_node); /** * async_schedule_dev_nocall - A simplified variant of async_schedule_dev() * @func: function to execute asynchronously * @dev: device argument to be passed to function * * @dev is used as both the argument for the function and to provide NUMA * context for where to run the function. * * If the asynchronous execution of @func is scheduled successfully, return * true. Otherwise, do nothing and return false, unlike async_schedule_dev() * that will run the function synchronously then. */ bool async_schedule_dev_nocall(async_func_t func, struct device *dev) { struct async_entry *entry; entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL); /* Give up if there is no memory or too much work. */ if (!entry || atomic_read(&entry_count) > MAX_WORK) { kfree(entry); return false; } __async_schedule_node_domain(func, dev, dev_to_node(dev), &async_dfl_domain, entry); return true; } /** * async_synchronize_full - synchronize all asynchronous function calls * * This function waits until all asynchronous function calls have been done. */ void async_synchronize_full(void) { async_synchronize_full_domain(NULL); } EXPORT_SYMBOL_GPL(async_synchronize_full); /** * async_synchronize_full_domain - synchronize all asynchronous function within a certain domain * @domain: the domain to synchronize * * This function waits until all asynchronous function calls for the * synchronization domain specified by @domain have been done. */ void async_synchronize_full_domain(struct async_domain *domain) { async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, domain); } EXPORT_SYMBOL_GPL(async_synchronize_full_domain); /** * async_synchronize_cookie_domain - synchronize asynchronous function calls within a certain domain with cookie checkpointing * @cookie: async_cookie_t to use as checkpoint * @domain: the domain to synchronize (%NULL for all registered domains) * * This function waits until all asynchronous function calls for the * synchronization domain specified by @domain submitted prior to @cookie * have been done. */ void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain *domain) { ktime_t starttime; pr_debug("async_waiting @ %i\n", task_pid_nr(current)); starttime = ktime_get(); wait_event(async_done, lowest_in_progress(domain) >= cookie); pr_debug("async_continuing @ %i after %lli usec\n", task_pid_nr(current), microseconds_since(starttime)); } EXPORT_SYMBOL_GPL(async_synchronize_cookie_domain); /** * async_synchronize_cookie - synchronize asynchronous function calls with cookie checkpointing * @cookie: async_cookie_t to use as checkpoint * * This function waits until all asynchronous function calls prior to @cookie * have been done. */ void async_synchronize_cookie(async_cookie_t cookie) { async_synchronize_cookie_domain(cookie, &async_dfl_domain); } EXPORT_SYMBOL_GPL(async_synchronize_cookie); /** * current_is_async - is %current an async worker task? * * Returns %true if %current is an async worker task. */ bool current_is_async(void) { struct worker *worker = current_wq_worker(); return worker && worker->current_func == async_run_entry_fn; } EXPORT_SYMBOL_GPL(current_is_async); void __init async_init(void) { /* * Async can schedule a number of interdependent work items. However, * unbound workqueues can handle only upto min_active interdependent * work items. The default min_active of 8 isn't sufficient for async * and can lead to stalls. Let's use a dedicated workqueue with raised * min_active. */ async_wq = alloc_workqueue("async", WQ_UNBOUND, 0); BUG_ON(!async_wq); workqueue_set_min_active(async_wq, WQ_DFL_ACTIVE); } |
7 2 6 1 1 1 9 9 4 3 3 5 1 5 6 6 1 3 5 5 1 1 3 3 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Linear symmetric key cipher operations. * * Generic encrypt/decrypt wrapper for ciphers. * * Copyright (c) 2023 Herbert Xu <herbert@gondor.apana.org.au> */ #include <linux/cryptouser.h> #include <linux/err.h> #include <linux/export.h> #include <linux/kernel.h> #include <linux/seq_file.h> #include <linux/slab.h> #include <linux/string.h> #include <net/netlink.h> #include "skcipher.h" static inline struct crypto_lskcipher *__crypto_lskcipher_cast( struct crypto_tfm *tfm) { return container_of(tfm, struct crypto_lskcipher, base); } static inline struct lskcipher_alg *__crypto_lskcipher_alg( struct crypto_alg *alg) { return container_of(alg, struct lskcipher_alg, co.base); } static inline struct crypto_istat_cipher *lskcipher_get_stat( struct lskcipher_alg *alg) { return skcipher_get_stat_common(&alg->co); } static inline int crypto_lskcipher_errstat(struct lskcipher_alg *alg, int err) { struct crypto_istat_cipher *istat = lskcipher_get_stat(alg); if (!IS_ENABLED(CONFIG_CRYPTO_STATS)) return err; if (err) atomic64_inc(&istat->err_cnt); return err; } static int lskcipher_setkey_unaligned(struct crypto_lskcipher *tfm, const u8 *key, unsigned int keylen) { unsigned long alignmask = crypto_lskcipher_alignmask(tfm); struct lskcipher_alg *cipher = crypto_lskcipher_alg(tfm); u8 *buffer, *alignbuffer; unsigned long absize; int ret; absize = keylen + alignmask; buffer = kmalloc(absize, GFP_ATOMIC); if (!buffer) return -ENOMEM; alignbuffer = (u8 *)ALIGN((unsigned long)buffer, alignmask + 1); memcpy(alignbuffer, key, keylen); ret = cipher->setkey(tfm, alignbuffer, keylen); kfree_sensitive(buffer); return ret; } int crypto_lskcipher_setkey(struct crypto_lskcipher *tfm, const u8 *key, unsigned int keylen) { unsigned long alignmask = crypto_lskcipher_alignmask(tfm); struct lskcipher_alg *cipher = crypto_lskcipher_alg(tfm); if (keylen < cipher->co.min_keysize || keylen > cipher->co.max_keysize) return -EINVAL; if ((unsigned long)key & alignmask) return lskcipher_setkey_unaligned(tfm, key, keylen); else return cipher->setkey(tfm, key, keylen); } EXPORT_SYMBOL_GPL(crypto_lskcipher_setkey); static int crypto_lskcipher_crypt_unaligned( struct crypto_lskcipher *tfm, const u8 *src, u8 *dst, unsigned len, u8 *iv, int (*crypt)(struct crypto_lskcipher *tfm, const u8 *src, u8 *dst, unsigned len, u8 *iv, u32 flags)) { unsigned statesize = crypto_lskcipher_statesize(tfm); unsigned ivsize = crypto_lskcipher_ivsize(tfm); unsigned bs = crypto_lskcipher_blocksize(tfm); unsigned cs = crypto_lskcipher_chunksize(tfm); int err; u8 *tiv; u8 *p; BUILD_BUG_ON(MAX_CIPHER_BLOCKSIZE > PAGE_SIZE || MAX_CIPHER_ALIGNMASK >= PAGE_SIZE); tiv = kmalloc(PAGE_SIZE, GFP_ATOMIC); if (!tiv) return -ENOMEM; memcpy(tiv, iv, ivsize + statesize); p = kmalloc(PAGE_SIZE, GFP_ATOMIC); err = -ENOMEM; if (!p) goto out; while (len >= bs) { unsigned chunk = min((unsigned)PAGE_SIZE, len); int err; if (chunk > cs) chunk &= ~(cs - 1); memcpy(p, src, chunk); err = crypt(tfm, p, p, chunk, tiv, CRYPTO_LSKCIPHER_FLAG_FINAL); if (err) goto out; memcpy(dst, p, chunk); src += chunk; dst += chunk; len -= chunk; } err = len ? -EINVAL : 0; out: memcpy(iv, tiv, ivsize + statesize); kfree_sensitive(p); kfree_sensitive(tiv); return err; } static int crypto_lskcipher_crypt(struct crypto_lskcipher *tfm, const u8 *src, u8 *dst, unsigned len, u8 *iv, int (*crypt)(struct crypto_lskcipher *tfm, const u8 *src, u8 *dst, unsigned len, u8 *iv, u32 flags)) { unsigned long alignmask = crypto_lskcipher_alignmask(tfm); struct lskcipher_alg *alg = crypto_lskcipher_alg(tfm); int ret; if (((unsigned long)src | (unsigned long)dst | (unsigned long)iv) & alignmask) { ret = crypto_lskcipher_crypt_unaligned(tfm, src, dst, len, iv, crypt); goto out; } ret = crypt(tfm, src, dst, len, iv, CRYPTO_LSKCIPHER_FLAG_FINAL); out: return crypto_lskcipher_errstat(alg, ret); } int crypto_lskcipher_encrypt(struct crypto_lskcipher *tfm, const u8 *src, u8 *dst, unsigned len, u8 *iv) { struct lskcipher_alg *alg = crypto_lskcipher_alg(tfm); if (IS_ENABLED(CONFIG_CRYPTO_STATS)) { struct crypto_istat_cipher *istat = lskcipher_get_stat(alg); atomic64_inc(&istat->encrypt_cnt); atomic64_add(len, &istat->encrypt_tlen); } return crypto_lskcipher_crypt(tfm, src, dst, len, iv, alg->encrypt); } EXPORT_SYMBOL_GPL(crypto_lskcipher_encrypt); int crypto_lskcipher_decrypt(struct crypto_lskcipher *tfm, const u8 *src, u8 *dst, unsigned len, u8 *iv) { struct lskcipher_alg *alg = crypto_lskcipher_alg(tfm); if (IS_ENABLED(CONFIG_CRYPTO_STATS)) { struct crypto_istat_cipher *istat = lskcipher_get_stat(alg); atomic64_inc(&istat->decrypt_cnt); atomic64_add(len, &istat->decrypt_tlen); } return crypto_lskcipher_crypt(tfm, src, dst, len, iv, alg->decrypt); } EXPORT_SYMBOL_GPL(crypto_lskcipher_decrypt); static int crypto_lskcipher_crypt_sg(struct skcipher_request *req, int (*crypt)(struct crypto_lskcipher *tfm, const u8 *src, u8 *dst, unsigned len, u8 *ivs, u32 flags)) { struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req); struct crypto_lskcipher **ctx = crypto_skcipher_ctx(skcipher); u8 *ivs = skcipher_request_ctx(req); struct crypto_lskcipher *tfm = *ctx; struct skcipher_walk walk; unsigned ivsize; u32 flags; int err; ivsize = crypto_lskcipher_ivsize(tfm); ivs = PTR_ALIGN(ivs, crypto_skcipher_alignmask(skcipher) + 1); memcpy(ivs, req->iv, ivsize); flags = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP; if (req->base.flags & CRYPTO_SKCIPHER_REQ_CONT) flags |= CRYPTO_LSKCIPHER_FLAG_CONT; if (!(req->base.flags & CRYPTO_SKCIPHER_REQ_NOTFINAL)) flags |= CRYPTO_LSKCIPHER_FLAG_FINAL; err = skcipher_walk_virt(&walk, req, false); while (walk.nbytes) { err = crypt(tfm, walk.src.virt.addr, walk.dst.virt.addr, walk.nbytes, ivs, flags & ~(walk.nbytes == walk.total ? 0 : CRYPTO_LSKCIPHER_FLAG_FINAL)); err = skcipher_walk_done(&walk, err); flags |= CRYPTO_LSKCIPHER_FLAG_CONT; } memcpy(req->iv, ivs, ivsize); return err; } int crypto_lskcipher_encrypt_sg(struct skcipher_request *req) { struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req); struct crypto_lskcipher **ctx = crypto_skcipher_ctx(skcipher); struct lskcipher_alg *alg = crypto_lskcipher_alg(*ctx); return crypto_lskcipher_crypt_sg(req, alg->encrypt); } int crypto_lskcipher_decrypt_sg(struct skcipher_request *req) { struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req); struct crypto_lskcipher **ctx = crypto_skcipher_ctx(skcipher); struct lskcipher_alg *alg = crypto_lskcipher_alg(*ctx); return crypto_lskcipher_crypt_sg(req, alg->decrypt); } static void crypto_lskcipher_exit_tfm(struct crypto_tfm *tfm) { struct crypto_lskcipher *skcipher = __crypto_lskcipher_cast(tfm); struct lskcipher_alg *alg = crypto_lskcipher_alg(skcipher); alg->exit(skcipher); } static int crypto_lskcipher_init_tfm(struct crypto_tfm *tfm) { struct crypto_lskcipher *skcipher = __crypto_lskcipher_cast(tfm); struct lskcipher_alg *alg = crypto_lskcipher_alg(skcipher); if (alg->exit) skcipher->base.exit = crypto_lskcipher_exit_tfm; if (alg->init) return alg->init(skcipher); return 0; } static void crypto_lskcipher_free_instance(struct crypto_instance *inst) { struct lskcipher_instance *skcipher = container_of(inst, struct lskcipher_instance, s.base); skcipher->free(skcipher); } static void __maybe_unused crypto_lskcipher_show( struct seq_file *m, struct crypto_alg *alg) { struct lskcipher_alg *skcipher = __crypto_lskcipher_alg(alg); seq_printf(m, "type : lskcipher\n"); seq_printf(m, "blocksize : %u\n", alg->cra_blocksize); seq_printf(m, "min keysize : %u\n", skcipher->co.min_keysize); seq_printf(m, "max keysize : %u\n", skcipher->co.max_keysize); seq_printf(m, "ivsize : %u\n", skcipher->co.ivsize); seq_printf(m, "chunksize : %u\n", skcipher->co.chunksize); seq_printf(m, "statesize : %u\n", skcipher->co.statesize); } static int __maybe_unused crypto_lskcipher_report( struct sk_buff *skb, struct crypto_alg *alg) { struct lskcipher_alg *skcipher = __crypto_lskcipher_alg(alg); struct crypto_report_blkcipher rblkcipher; memset(&rblkcipher, 0, sizeof(rblkcipher)); strscpy(rblkcipher.type, "lskcipher", sizeof(rblkcipher.type)); strscpy(rblkcipher.geniv, "<none>", sizeof(rblkcipher.geniv)); rblkcipher.blocksize = alg->cra_blocksize; rblkcipher.min_keysize = skcipher->co.min_keysize; rblkcipher.max_keysize = skcipher->co.max_keysize; rblkcipher.ivsize = skcipher->co.ivsize; return nla_put(skb, CRYPTOCFGA_REPORT_BLKCIPHER, sizeof(rblkcipher), &rblkcipher); } static int __maybe_unused crypto_lskcipher_report_stat( struct sk_buff *skb, struct crypto_alg *alg) { struct lskcipher_alg *skcipher = __crypto_lskcipher_alg(alg); struct crypto_istat_cipher *istat; struct crypto_stat_cipher rcipher; istat = lskcipher_get_stat(skcipher); memset(&rcipher, 0, sizeof(rcipher)); strscpy(rcipher.type, "cipher", sizeof(rcipher.type)); rcipher.stat_encrypt_cnt = atomic64_read(&istat->encrypt_cnt); rcipher.stat_encrypt_tlen = atomic64_read(&istat->encrypt_tlen); rcipher.stat_decrypt_cnt = atomic64_read(&istat->decrypt_cnt); rcipher.stat_decrypt_tlen = atomic64_read(&istat->decrypt_tlen); rcipher.stat_err_cnt = atomic64_read(&istat->err_cnt); return nla_put(skb, CRYPTOCFGA_STAT_CIPHER, sizeof(rcipher), &rcipher); } static const struct crypto_type crypto_lskcipher_type = { .extsize = crypto_alg_extsize, .init_tfm = crypto_lskcipher_init_tfm, .free = crypto_lskcipher_free_instance, #ifdef CONFIG_PROC_FS .show = crypto_lskcipher_show, #endif #if IS_ENABLED(CONFIG_CRYPTO_USER) .report = crypto_lskcipher_report, #endif #ifdef CONFIG_CRYPTO_STATS .report_stat = crypto_lskcipher_report_stat, #endif .maskclear = ~CRYPTO_ALG_TYPE_MASK, .maskset = CRYPTO_ALG_TYPE_MASK, .type = CRYPTO_ALG_TYPE_LSKCIPHER, .tfmsize = offsetof(struct crypto_lskcipher, base), }; static void crypto_lskcipher_exit_tfm_sg(struct crypto_tfm *tfm) { struct crypto_lskcipher **ctx = crypto_tfm_ctx(tfm); crypto_free_lskcipher(*ctx); } int crypto_init_lskcipher_ops_sg(struct crypto_tfm *tfm) { struct crypto_lskcipher **ctx = crypto_tfm_ctx(tfm); struct crypto_alg *calg = tfm->__crt_alg; struct crypto_lskcipher *skcipher; if (!crypto_mod_get(calg)) return -EAGAIN; skcipher = crypto_create_tfm(calg, &crypto_lskcipher_type); if (IS_ERR(skcipher)) { crypto_mod_put(calg); return PTR_ERR(skcipher); } *ctx = skcipher; tfm->exit = crypto_lskcipher_exit_tfm_sg; return 0; } int crypto_grab_lskcipher(struct crypto_lskcipher_spawn *spawn, struct crypto_instance *inst, const char *name, u32 type, u32 mask) { spawn->base.frontend = &crypto_lskcipher_type; return crypto_grab_spawn(&spawn->base, inst, name, type, mask); } EXPORT_SYMBOL_GPL(crypto_grab_lskcipher); struct crypto_lskcipher *crypto_alloc_lskcipher(const char *alg_name, u32 type, u32 mask) { return crypto_alloc_tfm(alg_name, &crypto_lskcipher_type, type, mask); } EXPORT_SYMBOL_GPL(crypto_alloc_lskcipher); static int lskcipher_prepare_alg(struct lskcipher_alg *alg) { struct crypto_alg *base = &alg->co.base; int err; err = skcipher_prepare_alg_common(&alg->co); if (err) return err; if (alg->co.chunksize & (alg->co.chunksize - 1)) return -EINVAL; base->cra_type = &crypto_lskcipher_type; base->cra_flags |= CRYPTO_ALG_TYPE_LSKCIPHER; return 0; } int crypto_register_lskcipher(struct lskcipher_alg *alg) { struct crypto_alg *base = &alg->co.base; int err; err = lskcipher_prepare_alg(alg); if (err) return err; return crypto_register_alg(base); } EXPORT_SYMBOL_GPL(crypto_register_lskcipher); void crypto_unregister_lskcipher(struct lskcipher_alg *alg) { crypto_unregister_alg(&alg->co.base); } EXPORT_SYMBOL_GPL(crypto_unregister_lskcipher); int crypto_register_lskciphers(struct lskcipher_alg *algs, int count) { int i, ret; for (i = 0; i < count; i++) { ret = crypto_register_lskcipher(&algs[i]); if (ret) goto err; } return 0; err: for (--i; i >= 0; --i) crypto_unregister_lskcipher(&algs[i]); return ret; } EXPORT_SYMBOL_GPL(crypto_register_lskciphers); void crypto_unregister_lskciphers(struct lskcipher_alg *algs, int count) { int i; for (i = count - 1; i >= 0; --i) crypto_unregister_lskcipher(&algs[i]); } EXPORT_SYMBOL_GPL(crypto_unregister_lskciphers); int lskcipher_register_instance(struct crypto_template *tmpl, struct lskcipher_instance *inst) { int err; if (WARN_ON(!inst->free)) return -EINVAL; err = lskcipher_prepare_alg(&inst->alg); if (err) return err; return crypto_register_instance(tmpl, lskcipher_crypto_instance(inst)); } EXPORT_SYMBOL_GPL(lskcipher_register_instance); static int lskcipher_setkey_simple(struct crypto_lskcipher *tfm, const u8 *key, unsigned int keylen) { struct crypto_lskcipher *cipher = lskcipher_cipher_simple(tfm); crypto_lskcipher_clear_flags(cipher, CRYPTO_TFM_REQ_MASK); crypto_lskcipher_set_flags(cipher, crypto_lskcipher_get_flags(tfm) & CRYPTO_TFM_REQ_MASK); return crypto_lskcipher_setkey(cipher, key, keylen); } static int lskcipher_init_tfm_simple(struct crypto_lskcipher *tfm) { struct lskcipher_instance *inst = lskcipher_alg_instance(tfm); struct crypto_lskcipher **ctx = crypto_lskcipher_ctx(tfm); struct crypto_lskcipher_spawn *spawn; struct crypto_lskcipher *cipher; spawn = lskcipher_instance_ctx(inst); cipher = crypto_spawn_lskcipher(spawn); if (IS_ERR(cipher)) return PTR_ERR(cipher); *ctx = cipher; return 0; } static void lskcipher_exit_tfm_simple(struct crypto_lskcipher *tfm) { struct crypto_lskcipher **ctx = crypto_lskcipher_ctx(tfm); crypto_free_lskcipher(*ctx); } static void lskcipher_free_instance_simple(struct lskcipher_instance *inst) { crypto_drop_lskcipher(lskcipher_instance_ctx(inst)); kfree(inst); } /** * lskcipher_alloc_instance_simple - allocate instance of simple block cipher * * Allocate an lskcipher_instance for a simple block cipher mode of operation, * e.g. cbc or ecb. The instance context will have just a single crypto_spawn, * that for the underlying cipher. The {min,max}_keysize, ivsize, blocksize, * alignmask, and priority are set from the underlying cipher but can be * overridden if needed. The tfm context defaults to * struct crypto_lskcipher *, and default ->setkey(), ->init(), and * ->exit() methods are installed. * * @tmpl: the template being instantiated * @tb: the template parameters * * Return: a pointer to the new instance, or an ERR_PTR(). The caller still * needs to register the instance. */ struct lskcipher_instance *lskcipher_alloc_instance_simple( struct crypto_template *tmpl, struct rtattr **tb) { u32 mask; struct lskcipher_instance *inst; struct crypto_lskcipher_spawn *spawn; char ecb_name[CRYPTO_MAX_ALG_NAME]; struct lskcipher_alg *cipher_alg; const char *cipher_name; int err; err = crypto_check_attr_type(tb, CRYPTO_ALG_TYPE_LSKCIPHER, &mask); if (err) return ERR_PTR(err); cipher_name = crypto_attr_alg_name(tb[1]); if (IS_ERR(cipher_name)) return ERR_CAST(cipher_name); inst = kzalloc(sizeof(*inst) + sizeof(*spawn), GFP_KERNEL); if (!inst) return ERR_PTR(-ENOMEM); spawn = lskcipher_instance_ctx(inst); err = crypto_grab_lskcipher(spawn, lskcipher_crypto_instance(inst), cipher_name, 0, mask); ecb_name[0] = 0; if (err == -ENOENT && !!memcmp(tmpl->name, "ecb", 4)) { err = -ENAMETOOLONG; if (snprintf(ecb_name, CRYPTO_MAX_ALG_NAME, "ecb(%s)", cipher_name) >= CRYPTO_MAX_ALG_NAME) goto err_free_inst; err = crypto_grab_lskcipher(spawn, lskcipher_crypto_instance(inst), ecb_name, 0, mask); } if (err) goto err_free_inst; cipher_alg = crypto_lskcipher_spawn_alg(spawn); err = crypto_inst_setname(lskcipher_crypto_instance(inst), tmpl->name, &cipher_alg->co.base); if (err) goto err_free_inst; if (ecb_name[0]) { int len; err = -EINVAL; len = strscpy(ecb_name, &cipher_alg->co.base.cra_name[4], sizeof(ecb_name)); if (len < 2) goto err_free_inst; if (ecb_name[len - 1] != ')') goto err_free_inst; ecb_name[len - 1] = 0; err = -ENAMETOOLONG; if (snprintf(inst->alg.co.base.cra_name, CRYPTO_MAX_ALG_NAME, "%s(%s)", tmpl->name, ecb_name) >= CRYPTO_MAX_ALG_NAME) goto err_free_inst; if (strcmp(ecb_name, cipher_name) && snprintf(inst->alg.co.base.cra_driver_name, CRYPTO_MAX_ALG_NAME, "%s(%s)", tmpl->name, cipher_name) >= CRYPTO_MAX_ALG_NAME) goto err_free_inst; } else { /* Don't allow nesting. */ err = -ELOOP; if ((cipher_alg->co.base.cra_flags & CRYPTO_ALG_INSTANCE)) goto err_free_inst; } err = -EINVAL; if (cipher_alg->co.ivsize) goto err_free_inst; inst->free = lskcipher_free_instance_simple; /* Default algorithm properties, can be overridden */ inst->alg.co.base.cra_blocksize = cipher_alg->co.base.cra_blocksize; inst->alg.co.base.cra_alignmask = cipher_alg->co.base.cra_alignmask; inst->alg.co.base.cra_priority = cipher_alg->co.base.cra_priority; inst->alg.co.min_keysize = cipher_alg->co.min_keysize; inst->alg.co.max_keysize = cipher_alg->co.max_keysize; inst->alg.co.ivsize = cipher_alg->co.base.cra_blocksize; inst->alg.co.statesize = cipher_alg->co.statesize; /* Use struct crypto_lskcipher * by default, can be overridden */ inst->alg.co.base.cra_ctxsize = sizeof(struct crypto_lskcipher *); inst->alg.setkey = lskcipher_setkey_simple; inst->alg.init = lskcipher_init_tfm_simple; inst->alg.exit = lskcipher_exit_tfm_simple; return inst; err_free_inst: lskcipher_free_instance_simple(inst); return ERR_PTR(err); } EXPORT_SYMBOL_GPL(lskcipher_alloc_instance_simple); |
20 1 9 1 23 2 1 20 2 1 2 16 15 9 9 9 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 | // SPDX-License-Identifier: GPL-2.0 /* * rtc and date/time utility functions * * Copyright (C) 2005-06 Tower Technologies * Author: Alessandro Zummo <a.zummo@towertech.it> * * based on arch/arm/common/rtctime.c and other bits * * Author: Cassio Neri <cassio.neri@gmail.com> (rtc_time64_to_tm) */ #include <linux/export.h> #include <linux/rtc.h> static const unsigned char rtc_days_in_month[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 }; static const unsigned short rtc_ydays[2][13] = { /* Normal years */ { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365 }, /* Leap years */ { 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366 } }; /* * The number of days in the month. */ int rtc_month_days(unsigned int month, unsigned int year) { return rtc_days_in_month[month] + (is_leap_year(year) && month == 1); } EXPORT_SYMBOL(rtc_month_days); /* * The number of days since January 1. (0 to 365) */ int rtc_year_days(unsigned int day, unsigned int month, unsigned int year) { return rtc_ydays[is_leap_year(year)][month] + day - 1; } EXPORT_SYMBOL(rtc_year_days); /** * rtc_time64_to_tm - converts time64_t to rtc_time. * * @time: The number of seconds since 01-01-1970 00:00:00. * (Must be positive.) * @tm: Pointer to the struct rtc_time. */ void rtc_time64_to_tm(time64_t time, struct rtc_time *tm) { unsigned int secs; int days; u64 u64tmp; u32 u32tmp, udays, century, day_of_century, year_of_century, year, day_of_year, month, day; bool is_Jan_or_Feb, is_leap_year; /* time must be positive */ days = div_s64_rem(time, 86400, &secs); /* day of the week, 1970-01-01 was a Thursday */ tm->tm_wday = (days + 4) % 7; /* * The following algorithm is, basically, Proposition 6.3 of Neri * and Schneider [1]. In a few words: it works on the computational * (fictitious) calendar where the year starts in March, month = 2 * (*), and finishes in February, month = 13. This calendar is * mathematically convenient because the day of the year does not * depend on whether the year is leap or not. For instance: * * March 1st 0-th day of the year; * ... * April 1st 31-st day of the year; * ... * January 1st 306-th day of the year; (Important!) * ... * February 28th 364-th day of the year; * February 29th 365-th day of the year (if it exists). * * After having worked out the date in the computational calendar * (using just arithmetics) it's easy to convert it to the * corresponding date in the Gregorian calendar. * * [1] "Euclidean Affine Functions and Applications to Calendar * Algorithms". https://arxiv.org/abs/2102.06959 * * (*) The numbering of months follows rtc_time more closely and * thus, is slightly different from [1]. */ udays = ((u32) days) + 719468; u32tmp = 4 * udays + 3; century = u32tmp / 146097; day_of_century = u32tmp % 146097 / 4; u32tmp = 4 * day_of_century + 3; u64tmp = 2939745ULL * u32tmp; year_of_century = upper_32_bits(u64tmp); day_of_year = lower_32_bits(u64tmp) / 2939745 / 4; year = 100 * century + year_of_century; is_leap_year = year_of_century != 0 ? year_of_century % 4 == 0 : century % 4 == 0; u32tmp = 2141 * day_of_year + 132377; month = u32tmp >> 16; day = ((u16) u32tmp) / 2141; /* * Recall that January 01 is the 306-th day of the year in the * computational (not Gregorian) calendar. */ is_Jan_or_Feb = day_of_year >= 306; /* Converts to the Gregorian calendar. */ year = year + is_Jan_or_Feb; month = is_Jan_or_Feb ? month - 12 : month; day = day + 1; day_of_year = is_Jan_or_Feb ? day_of_year - 306 : day_of_year + 31 + 28 + is_leap_year; /* Converts to rtc_time's format. */ tm->tm_year = (int) (year - 1900); tm->tm_mon = (int) month; tm->tm_mday = (int) day; tm->tm_yday = (int) day_of_year + 1; tm->tm_hour = secs / 3600; secs -= tm->tm_hour * 3600; tm->tm_min = secs / 60; tm->tm_sec = secs - tm->tm_min * 60; tm->tm_isdst = 0; } EXPORT_SYMBOL(rtc_time64_to_tm); /* * Does the rtc_time represent a valid date/time? */ int rtc_valid_tm(struct rtc_time *tm) { if (tm->tm_year < 70 || tm->tm_year > (INT_MAX - 1900) || ((unsigned int)tm->tm_mon) >= 12 || tm->tm_mday < 1 || tm->tm_mday > rtc_month_days(tm->tm_mon, ((unsigned int)tm->tm_year + 1900)) || ((unsigned int)tm->tm_hour) >= 24 || ((unsigned int)tm->tm_min) >= 60 || ((unsigned int)tm->tm_sec) >= 60) return -EINVAL; return 0; } EXPORT_SYMBOL(rtc_valid_tm); /* * rtc_tm_to_time64 - Converts rtc_time to time64_t. * Convert Gregorian date to seconds since 01-01-1970 00:00:00. */ time64_t rtc_tm_to_time64(struct rtc_time *tm) { return mktime64(((unsigned int)tm->tm_year + 1900), tm->tm_mon + 1, tm->tm_mday, tm->tm_hour, tm->tm_min, tm->tm_sec); } EXPORT_SYMBOL(rtc_tm_to_time64); /* * Convert rtc_time to ktime */ ktime_t rtc_tm_to_ktime(struct rtc_time tm) { return ktime_set(rtc_tm_to_time64(&tm), 0); } EXPORT_SYMBOL_GPL(rtc_tm_to_ktime); /* * Convert ktime to rtc_time */ struct rtc_time rtc_ktime_to_tm(ktime_t kt) { struct timespec64 ts; struct rtc_time ret; ts = ktime_to_timespec64(kt); /* Round up any ns */ if (ts.tv_nsec) ts.tv_sec++; rtc_time64_to_tm(ts.tv_sec, &ret); return ret; } EXPORT_SYMBOL_GPL(rtc_ktime_to_tm); |
75 58 60 60 56 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 | // SPDX-License-Identifier: GPL-2.0-only /* * partition.c * * PURPOSE * Partition handling routines for the OSTA-UDF(tm) filesystem. * * COPYRIGHT * (C) 1998-2001 Ben Fennema * * HISTORY * * 12/06/98 blf Created file. * */ #include "udfdecl.h" #include "udf_sb.h" #include "udf_i.h" #include <linux/fs.h> #include <linux/string.h> #include <linux/mutex.h> uint32_t udf_get_pblock(struct super_block *sb, uint32_t block, uint16_t partition, uint32_t offset) { struct udf_sb_info *sbi = UDF_SB(sb); struct udf_part_map *map; if (partition >= sbi->s_partitions) { udf_debug("block=%u, partition=%u, offset=%u: invalid partition\n", block, partition, offset); return 0xFFFFFFFF; } map = &sbi->s_partmaps[partition]; if (map->s_partition_func) return map->s_partition_func(sb, block, partition, offset); else return map->s_partition_root + block + offset; } uint32_t udf_get_pblock_virt15(struct super_block *sb, uint32_t block, uint16_t partition, uint32_t offset) { struct buffer_head *bh = NULL; uint32_t newblock; uint32_t index; uint32_t loc; struct udf_sb_info *sbi = UDF_SB(sb); struct udf_part_map *map; struct udf_virtual_data *vdata; struct udf_inode_info *iinfo = UDF_I(sbi->s_vat_inode); int err; map = &sbi->s_partmaps[partition]; vdata = &map->s_type_specific.s_virtual; if (block > vdata->s_num_entries) { udf_debug("Trying to access block beyond end of VAT (%u max %u)\n", block, vdata->s_num_entries); return 0xFFFFFFFF; } if (iinfo->i_alloc_type == ICBTAG_FLAG_AD_IN_ICB) { loc = le32_to_cpu(((__le32 *)(iinfo->i_data + vdata->s_start_offset))[block]); goto translate; } index = (sb->s_blocksize - vdata->s_start_offset) / sizeof(uint32_t); if (block >= index) { block -= index; newblock = 1 + (block / (sb->s_blocksize / sizeof(uint32_t))); index = block % (sb->s_blocksize / sizeof(uint32_t)); } else { newblock = 0; index = vdata->s_start_offset / sizeof(uint32_t) + block; } bh = udf_bread(sbi->s_vat_inode, newblock, 0, &err); if (!bh) { udf_debug("get_pblock(UDF_VIRTUAL_MAP:%p,%u,%u)\n", sb, block, partition); return 0xFFFFFFFF; } loc = le32_to_cpu(((__le32 *)bh->b_data)[index]); brelse(bh); translate: if (iinfo->i_location.partitionReferenceNum == partition) { udf_debug("recursive call to udf_get_pblock!\n"); return 0xFFFFFFFF; } return udf_get_pblock(sb, loc, iinfo->i_location.partitionReferenceNum, offset); } inline uint32_t udf_get_pblock_virt20(struct super_block *sb, uint32_t block, uint16_t partition, uint32_t offset) { return udf_get_pblock_virt15(sb, block, partition, offset); } uint32_t udf_get_pblock_spar15(struct super_block *sb, uint32_t block, uint16_t partition, uint32_t offset) { int i; struct sparingTable *st = NULL; struct udf_sb_info *sbi = UDF_SB(sb); struct udf_part_map *map; uint32_t packet; struct udf_sparing_data *sdata; map = &sbi->s_partmaps[partition]; sdata = &map->s_type_specific.s_sparing; packet = (block + offset) & ~(sdata->s_packet_len - 1); for (i = 0; i < 4; i++) { if (sdata->s_spar_map[i] != NULL) { st = (struct sparingTable *) sdata->s_spar_map[i]->b_data; break; } } if (st) { for (i = 0; i < le16_to_cpu(st->reallocationTableLen); i++) { struct sparingEntry *entry = &st->mapEntry[i]; u32 origLoc = le32_to_cpu(entry->origLocation); if (origLoc >= 0xFFFFFFF0) break; else if (origLoc == packet) return le32_to_cpu(entry->mappedLocation) + ((block + offset) & (sdata->s_packet_len - 1)); else if (origLoc > packet) break; } } return map->s_partition_root + block + offset; } int udf_relocate_blocks(struct super_block *sb, long old_block, long *new_block) { struct udf_sparing_data *sdata; struct sparingTable *st = NULL; struct sparingEntry mapEntry; uint32_t packet; int i, j, k, l; struct udf_sb_info *sbi = UDF_SB(sb); u16 reallocationTableLen; struct buffer_head *bh; int ret = 0; mutex_lock(&sbi->s_alloc_mutex); for (i = 0; i < sbi->s_partitions; i++) { struct udf_part_map *map = &sbi->s_partmaps[i]; if (old_block > map->s_partition_root && old_block < map->s_partition_root + map->s_partition_len) { sdata = &map->s_type_specific.s_sparing; packet = (old_block - map->s_partition_root) & ~(sdata->s_packet_len - 1); for (j = 0; j < 4; j++) if (sdata->s_spar_map[j] != NULL) { st = (struct sparingTable *) sdata->s_spar_map[j]->b_data; break; } if (!st) { ret = 1; goto out; } reallocationTableLen = le16_to_cpu(st->reallocationTableLen); for (k = 0; k < reallocationTableLen; k++) { struct sparingEntry *entry = &st->mapEntry[k]; u32 origLoc = le32_to_cpu(entry->origLocation); if (origLoc == 0xFFFFFFFF) { for (; j < 4; j++) { int len; bh = sdata->s_spar_map[j]; if (!bh) continue; st = (struct sparingTable *) bh->b_data; entry->origLocation = cpu_to_le32(packet); len = sizeof(struct sparingTable) + reallocationTableLen * sizeof(struct sparingEntry); udf_update_tag((char *)st, len); mark_buffer_dirty(bh); } *new_block = le32_to_cpu( entry->mappedLocation) + ((old_block - map->s_partition_root) & (sdata->s_packet_len - 1)); ret = 0; goto out; } else if (origLoc == packet) { *new_block = le32_to_cpu( entry->mappedLocation) + ((old_block - map->s_partition_root) & (sdata->s_packet_len - 1)); ret = 0; goto out; } else if (origLoc > packet) break; } for (l = k; l < reallocationTableLen; l++) { struct sparingEntry *entry = &st->mapEntry[l]; u32 origLoc = le32_to_cpu(entry->origLocation); if (origLoc != 0xFFFFFFFF) continue; for (; j < 4; j++) { bh = sdata->s_spar_map[j]; if (!bh) continue; st = (struct sparingTable *)bh->b_data; mapEntry = st->mapEntry[l]; mapEntry.origLocation = cpu_to_le32(packet); memmove(&st->mapEntry[k + 1], &st->mapEntry[k], (l - k) * sizeof(struct sparingEntry)); st->mapEntry[k] = mapEntry; udf_update_tag((char *)st, sizeof(struct sparingTable) + reallocationTableLen * sizeof(struct sparingEntry)); mark_buffer_dirty(bh); } *new_block = le32_to_cpu( st->mapEntry[k].mappedLocation) + ((old_block - map->s_partition_root) & (sdata->s_packet_len - 1)); ret = 0; goto out; } ret = 1; goto out; } /* if old_block */ } if (i == sbi->s_partitions) { /* outside of partitions */ /* for now, fail =) */ ret = 1; } out: mutex_unlock(&sbi->s_alloc_mutex); return ret; } static uint32_t udf_try_read_meta(struct inode *inode, uint32_t block, uint16_t partition, uint32_t offset) { struct super_block *sb = inode->i_sb; struct udf_part_map *map; struct kernel_lb_addr eloc; uint32_t elen; sector_t ext_offset; struct extent_position epos = {}; uint32_t phyblock; if (inode_bmap(inode, block, &epos, &eloc, &elen, &ext_offset) != (EXT_RECORDED_ALLOCATED >> 30)) phyblock = 0xFFFFFFFF; else { map = &UDF_SB(sb)->s_partmaps[partition]; /* map to sparable/physical partition desc */ phyblock = udf_get_pblock(sb, eloc.logicalBlockNum, map->s_type_specific.s_metadata.s_phys_partition_ref, ext_offset + offset); } brelse(epos.bh); return phyblock; } uint32_t udf_get_pblock_meta25(struct super_block *sb, uint32_t block, uint16_t partition, uint32_t offset) { struct udf_sb_info *sbi = UDF_SB(sb); struct udf_part_map *map; struct udf_meta_data *mdata; uint32_t retblk; struct inode *inode; udf_debug("READING from METADATA\n"); map = &sbi->s_partmaps[partition]; mdata = &map->s_type_specific.s_metadata; inode = mdata->s_metadata_fe ? : mdata->s_mirror_fe; if (!inode) return 0xFFFFFFFF; retblk = udf_try_read_meta(inode, block, partition, offset); if (retblk == 0xFFFFFFFF && mdata->s_metadata_fe) { udf_warn(sb, "error reading from METADATA, trying to read from MIRROR\n"); if (!(mdata->s_flags & MF_MIRROR_FE_LOADED)) { mdata->s_mirror_fe = udf_find_metadata_inode_efe(sb, mdata->s_mirror_file_loc, mdata->s_phys_partition_ref); if (IS_ERR(mdata->s_mirror_fe)) mdata->s_mirror_fe = NULL; mdata->s_flags |= MF_MIRROR_FE_LOADED; } inode = mdata->s_mirror_fe; if (!inode) return 0xFFFFFFFF; retblk = udf_try_read_meta(inode, block, partition, offset); } return retblk; } |
1 2 2 2 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 | // SPDX-License-Identifier: GPL-2.0 #include <linux/kernel.h> #include <linux/errno.h> #include <linux/fs.h> #include <linux/file.h> #include <linux/mm.h> #include <linux/slab.h> #include <linux/namei.h> #include <linux/io_uring.h> #include <linux/splice.h> #include <uapi/linux/io_uring.h> #include "io_uring.h" #include "splice.h" struct io_splice { struct file *file_out; loff_t off_out; loff_t off_in; u64 len; int splice_fd_in; unsigned int flags; }; static int __io_splice_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_splice *sp = io_kiocb_to_cmd(req, struct io_splice); unsigned int valid_flags = SPLICE_F_FD_IN_FIXED | SPLICE_F_ALL; sp->len = READ_ONCE(sqe->len); sp->flags = READ_ONCE(sqe->splice_flags); if (unlikely(sp->flags & ~valid_flags)) return -EINVAL; sp->splice_fd_in = READ_ONCE(sqe->splice_fd_in); req->flags |= REQ_F_FORCE_ASYNC; return 0; } int io_tee_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { if (READ_ONCE(sqe->splice_off_in) || READ_ONCE(sqe->off)) return -EINVAL; return __io_splice_prep(req, sqe); } int io_tee(struct io_kiocb *req, unsigned int issue_flags) { struct io_splice *sp = io_kiocb_to_cmd(req, struct io_splice); struct file *out = sp->file_out; unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED; struct file *in; ssize_t ret = 0; WARN_ON_ONCE(issue_flags & IO_URING_F_NONBLOCK); if (sp->flags & SPLICE_F_FD_IN_FIXED) in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags); else in = io_file_get_normal(req, sp->splice_fd_in); if (!in) { ret = -EBADF; goto done; } if (sp->len) ret = do_tee(in, out, sp->len, flags); if (!(sp->flags & SPLICE_F_FD_IN_FIXED)) fput(in); done: if (ret != sp->len) req_set_fail(req); io_req_set_res(req, ret, 0); return IOU_OK; } int io_splice_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_splice *sp = io_kiocb_to_cmd(req, struct io_splice); sp->off_in = READ_ONCE(sqe->splice_off_in); sp->off_out = READ_ONCE(sqe->off); return __io_splice_prep(req, sqe); } int io_splice(struct io_kiocb *req, unsigned int issue_flags) { struct io_splice *sp = io_kiocb_to_cmd(req, struct io_splice); struct file *out = sp->file_out; unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED; loff_t *poff_in, *poff_out; struct file *in; ssize_t ret = 0; WARN_ON_ONCE(issue_flags & IO_URING_F_NONBLOCK); if (sp->flags & SPLICE_F_FD_IN_FIXED) in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags); else in = io_file_get_normal(req, sp->splice_fd_in); if (!in) { ret = -EBADF; goto done; } poff_in = (sp->off_in == -1) ? NULL : &sp->off_in; poff_out = (sp->off_out == -1) ? NULL : &sp->off_out; if (sp->len) ret = do_splice(in, poff_in, out, poff_out, sp->len, flags); if (!(sp->flags & SPLICE_F_FD_IN_FIXED)) fput(in); done: if (ret != sp->len) req_set_fail(req); io_req_set_res(req, ret, 0); return IOU_OK; } |
18 21 2 4 14 3 3 28 4 21 9 3 4 14 7 20 5 15 7 8 11 3 2 15 3 3 2 1 3 13 14 18 2 3 14 16 6 2 7 15 8 1 1 6 5 3 13 1 12 9 2 7 8 1 26 1 1 3 21 1 21 5 16 16 2 3 20 21 21 13 5 6 1 3 10 10 28 1 7 20 15 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 | // SPDX-License-Identifier: GPL-2.0 /* * fs/timerfd.c * * Copyright (C) 2007 Davide Libenzi <davidel@xmailserver.org> * * * Thanks to Thomas Gleixner for code reviews and useful comments. * */ #include <linux/alarmtimer.h> #include <linux/file.h> #include <linux/poll.h> #include <linux/init.h> #include <linux/fs.h> #include <linux/sched.h> #include <linux/kernel.h> #include <linux/slab.h> #include <linux/list.h> #include <linux/spinlock.h> #include <linux/time.h> #include <linux/hrtimer.h> #include <linux/anon_inodes.h> #include <linux/timerfd.h> #include <linux/syscalls.h> #include <linux/compat.h> #include <linux/rcupdate.h> #include <linux/time_namespace.h> struct timerfd_ctx { union { struct hrtimer tmr; struct alarm alarm; } t; ktime_t tintv; ktime_t moffs; wait_queue_head_t wqh; u64 ticks; int clockid; short unsigned expired; short unsigned settime_flags; /* to show in fdinfo */ struct rcu_head rcu; struct list_head clist; spinlock_t cancel_lock; bool might_cancel; }; static LIST_HEAD(cancel_list); static DEFINE_SPINLOCK(cancel_lock); static inline bool isalarm(struct timerfd_ctx *ctx) { return ctx->clockid == CLOCK_REALTIME_ALARM || ctx->clockid == CLOCK_BOOTTIME_ALARM; } /* * This gets called when the timer event triggers. We set the "expired" * flag, but we do not re-arm the timer (in case it's necessary, * tintv != 0) until the timer is accessed. */ static void timerfd_triggered(struct timerfd_ctx *ctx) { unsigned long flags; spin_lock_irqsave(&ctx->wqh.lock, flags); ctx->expired = 1; ctx->ticks++; wake_up_locked_poll(&ctx->wqh, EPOLLIN); spin_unlock_irqrestore(&ctx->wqh.lock, flags); } static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr) { struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx, t.tmr); timerfd_triggered(ctx); return HRTIMER_NORESTART; } static enum alarmtimer_restart timerfd_alarmproc(struct alarm *alarm, ktime_t now) { struct timerfd_ctx *ctx = container_of(alarm, struct timerfd_ctx, t.alarm); timerfd_triggered(ctx); return ALARMTIMER_NORESTART; } /* * Called when the clock was set to cancel the timers in the cancel * list. This will wake up processes waiting on these timers. The * wake-up requires ctx->ticks to be non zero, therefore we increment * it before calling wake_up_locked(). */ void timerfd_clock_was_set(void) { ktime_t moffs = ktime_mono_to_real(0); struct timerfd_ctx *ctx; unsigned long flags; rcu_read_lock(); list_for_each_entry_rcu(ctx, &cancel_list, clist) { if (!ctx->might_cancel) continue; spin_lock_irqsave(&ctx->wqh.lock, flags); if (ctx->moffs != moffs) { ctx->moffs = KTIME_MAX; ctx->ticks++; wake_up_locked_poll(&ctx->wqh, EPOLLIN); } spin_unlock_irqrestore(&ctx->wqh.lock, flags); } rcu_read_unlock(); } static void timerfd_resume_work(struct work_struct *work) { timerfd_clock_was_set(); } static DECLARE_WORK(timerfd_work, timerfd_resume_work); /* * Invoked from timekeeping_resume(). Defer the actual update to work so * timerfd_clock_was_set() runs in task context. */ void timerfd_resume(void) { schedule_work(&timerfd_work); } static void __timerfd_remove_cancel(struct timerfd_ctx *ctx) { if (ctx->might_cancel) { ctx->might_cancel = false; spin_lock(&cancel_lock); list_del_rcu(&ctx->clist); spin_unlock(&cancel_lock); } } static void timerfd_remove_cancel(struct timerfd_ctx *ctx) { spin_lock(&ctx->cancel_lock); __timerfd_remove_cancel(ctx); spin_unlock(&ctx->cancel_lock); } static bool timerfd_canceled(struct timerfd_ctx *ctx) { if (!ctx->might_cancel || ctx->moffs != KTIME_MAX) return false; ctx->moffs = ktime_mono_to_real(0); return true; } static void timerfd_setup_cancel(struct timerfd_ctx *ctx, int flags) { spin_lock(&ctx->cancel_lock); if ((ctx->clockid == CLOCK_REALTIME || ctx->clockid == CLOCK_REALTIME_ALARM) && (flags & TFD_TIMER_ABSTIME) && (flags & TFD_TIMER_CANCEL_ON_SET)) { if (!ctx->might_cancel) { ctx->might_cancel = true; spin_lock(&cancel_lock); list_add_rcu(&ctx->clist, &cancel_list); spin_unlock(&cancel_lock); } } else { __timerfd_remove_cancel(ctx); } spin_unlock(&ctx->cancel_lock); } static ktime_t timerfd_get_remaining(struct timerfd_ctx *ctx) { ktime_t remaining; if (isalarm(ctx)) remaining = alarm_expires_remaining(&ctx->t.alarm); else remaining = hrtimer_expires_remaining_adjusted(&ctx->t.tmr); return remaining < 0 ? 0: remaining; } static int timerfd_setup(struct timerfd_ctx *ctx, int flags, const struct itimerspec64 *ktmr) { enum hrtimer_mode htmode; ktime_t texp; int clockid = ctx->clockid; htmode = (flags & TFD_TIMER_ABSTIME) ? HRTIMER_MODE_ABS: HRTIMER_MODE_REL; texp = timespec64_to_ktime(ktmr->it_value); ctx->expired = 0; ctx->ticks = 0; ctx->tintv = timespec64_to_ktime(ktmr->it_interval); if (isalarm(ctx)) { alarm_init(&ctx->t.alarm, ctx->clockid == CLOCK_REALTIME_ALARM ? ALARM_REALTIME : ALARM_BOOTTIME, timerfd_alarmproc); } else { hrtimer_init(&ctx->t.tmr, clockid, htmode); hrtimer_set_expires(&ctx->t.tmr, texp); ctx->t.tmr.function = timerfd_tmrproc; } if (texp != 0) { if (flags & TFD_TIMER_ABSTIME) texp = timens_ktime_to_host(clockid, texp); if (isalarm(ctx)) { if (flags & TFD_TIMER_ABSTIME) alarm_start(&ctx->t.alarm, texp); else alarm_start_relative(&ctx->t.alarm, texp); } else { hrtimer_start(&ctx->t.tmr, texp, htmode); } if (timerfd_canceled(ctx)) return -ECANCELED; } ctx->settime_flags = flags & TFD_SETTIME_FLAGS; return 0; } static int timerfd_release(struct inode *inode, struct file *file) { struct timerfd_ctx *ctx = file->private_data; timerfd_remove_cancel(ctx); if (isalarm(ctx)) alarm_cancel(&ctx->t.alarm); else hrtimer_cancel(&ctx->t.tmr); kfree_rcu(ctx, rcu); return 0; } static __poll_t timerfd_poll(struct file *file, poll_table *wait) { struct timerfd_ctx *ctx = file->private_data; __poll_t events = 0; unsigned long flags; poll_wait(file, &ctx->wqh, wait); spin_lock_irqsave(&ctx->wqh.lock, flags); if (ctx->ticks) events |= EPOLLIN; spin_unlock_irqrestore(&ctx->wqh.lock, flags); return events; } static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { struct timerfd_ctx *ctx = file->private_data; ssize_t res; u64 ticks = 0; if (count < sizeof(ticks)) return -EINVAL; spin_lock_irq(&ctx->wqh.lock); if (file->f_flags & O_NONBLOCK) res = -EAGAIN; else res = wait_event_interruptible_locked_irq(ctx->wqh, ctx->ticks); /* * If clock has changed, we do not care about the * ticks and we do not rearm the timer. Userspace must * reevaluate anyway. */ if (timerfd_canceled(ctx)) { ctx->ticks = 0; ctx->expired = 0; res = -ECANCELED; } if (ctx->ticks) { ticks = ctx->ticks; if (ctx->expired && ctx->tintv) { /* * If tintv != 0, this is a periodic timer that * needs to be re-armed. We avoid doing it in the timer * callback to avoid DoS attacks specifying a very * short timer period. */ if (isalarm(ctx)) { ticks += alarm_forward_now( &ctx->t.alarm, ctx->tintv) - 1; alarm_restart(&ctx->t.alarm); } else { ticks += hrtimer_forward_now(&ctx->t.tmr, ctx->tintv) - 1; hrtimer_restart(&ctx->t.tmr); } } ctx->expired = 0; ctx->ticks = 0; } spin_unlock_irq(&ctx->wqh.lock); if (ticks) res = put_user(ticks, (u64 __user *) buf) ? -EFAULT: sizeof(ticks); return res; } #ifdef CONFIG_PROC_FS static void timerfd_show(struct seq_file *m, struct file *file) { struct timerfd_ctx *ctx = file->private_data; struct timespec64 value, interval; spin_lock_irq(&ctx->wqh.lock); value = ktime_to_timespec64(timerfd_get_remaining(ctx)); interval = ktime_to_timespec64(ctx->tintv); spin_unlock_irq(&ctx->wqh.lock); seq_printf(m, "clockid: %d\n" "ticks: %llu\n" "settime flags: 0%o\n" "it_value: (%llu, %llu)\n" "it_interval: (%llu, %llu)\n", ctx->clockid, (unsigned long long)ctx->ticks, ctx->settime_flags, (unsigned long long)value.tv_sec, (unsigned long long)value.tv_nsec, (unsigned long long)interval.tv_sec, (unsigned long long)interval.tv_nsec); } #else #define timerfd_show NULL #endif #ifdef CONFIG_CHECKPOINT_RESTORE static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct timerfd_ctx *ctx = file->private_data; int ret = 0; switch (cmd) { case TFD_IOC_SET_TICKS: { u64 ticks; if (copy_from_user(&ticks, (u64 __user *)arg, sizeof(ticks))) return -EFAULT; if (!ticks) return -EINVAL; spin_lock_irq(&ctx->wqh.lock); if (!timerfd_canceled(ctx)) { ctx->ticks = ticks; wake_up_locked_poll(&ctx->wqh, EPOLLIN); } else ret = -ECANCELED; spin_unlock_irq(&ctx->wqh.lock); break; } default: ret = -ENOTTY; break; } return ret; } #else #define timerfd_ioctl NULL #endif static const struct file_operations timerfd_fops = { .release = timerfd_release, .poll = timerfd_poll, .read = timerfd_read, .llseek = noop_llseek, .show_fdinfo = timerfd_show, .unlocked_ioctl = timerfd_ioctl, }; static int timerfd_fget(int fd, struct fd *p) { struct fd f = fdget(fd); if (!f.file) return -EBADF; if (f.file->f_op != &timerfd_fops) { fdput(f); return -EINVAL; } *p = f; return 0; } SYSCALL_DEFINE2(timerfd_create, int, clockid, int, flags) { int ufd; struct timerfd_ctx *ctx; /* Check the TFD_* constants for consistency. */ BUILD_BUG_ON(TFD_CLOEXEC != O_CLOEXEC); BUILD_BUG_ON(TFD_NONBLOCK != O_NONBLOCK); if ((flags & ~TFD_CREATE_FLAGS) || (clockid != CLOCK_MONOTONIC && clockid != CLOCK_REALTIME && clockid != CLOCK_REALTIME_ALARM && clockid != CLOCK_BOOTTIME && clockid != CLOCK_BOOTTIME_ALARM)) return -EINVAL; if ((clockid == CLOCK_REALTIME_ALARM || clockid == CLOCK_BOOTTIME_ALARM) && !capable(CAP_WAKE_ALARM)) return -EPERM; ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); if (!ctx) return -ENOMEM; init_waitqueue_head(&ctx->wqh); spin_lock_init(&ctx->cancel_lock); ctx->clockid = clockid; if (isalarm(ctx)) alarm_init(&ctx->t.alarm, ctx->clockid == CLOCK_REALTIME_ALARM ? ALARM_REALTIME : ALARM_BOOTTIME, timerfd_alarmproc); else hrtimer_init(&ctx->t.tmr, clockid, HRTIMER_MODE_ABS); ctx->moffs = ktime_mono_to_real(0); ufd = anon_inode_getfd("[timerfd]", &timerfd_fops, ctx, O_RDWR | (flags & TFD_SHARED_FCNTL_FLAGS)); if (ufd < 0) kfree(ctx); return ufd; } static int do_timerfd_settime(int ufd, int flags, const struct itimerspec64 *new, struct itimerspec64 *old) { struct fd f; struct timerfd_ctx *ctx; int ret; if ((flags & ~TFD_SETTIME_FLAGS) || !itimerspec64_valid(new)) return -EINVAL; ret = timerfd_fget(ufd, &f); if (ret) return ret; ctx = f.file->private_data; if (isalarm(ctx) && !capable(CAP_WAKE_ALARM)) { fdput(f); return -EPERM; } timerfd_setup_cancel(ctx, flags); /* * We need to stop the existing timer before reprogramming * it to the new values. */ for (;;) { spin_lock_irq(&ctx->wqh.lock); if (isalarm(ctx)) { if (alarm_try_to_cancel(&ctx->t.alarm) >= 0) break; } else { if (hrtimer_try_to_cancel(&ctx->t.tmr) >= 0) break; } spin_unlock_irq(&ctx->wqh.lock); if (isalarm(ctx)) hrtimer_cancel_wait_running(&ctx->t.alarm.timer); else hrtimer_cancel_wait_running(&ctx->t.tmr); } /* * If the timer is expired and it's periodic, we need to advance it * because the caller may want to know the previous expiration time. * We do not update "ticks" and "expired" since the timer will be * re-programmed again in the following timerfd_setup() call. */ if (ctx->expired && ctx->tintv) { if (isalarm(ctx)) alarm_forward_now(&ctx->t.alarm, ctx->tintv); else hrtimer_forward_now(&ctx->t.tmr, ctx->tintv); } old->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx)); old->it_interval = ktime_to_timespec64(ctx->tintv); /* * Re-program the timer to the new value ... */ ret = timerfd_setup(ctx, flags, new); spin_unlock_irq(&ctx->wqh.lock); fdput(f); return ret; } static int do_timerfd_gettime(int ufd, struct itimerspec64 *t) { struct fd f; struct timerfd_ctx *ctx; int ret = timerfd_fget(ufd, &f); if (ret) return ret; ctx = f.file->private_data; spin_lock_irq(&ctx->wqh.lock); if (ctx->expired && ctx->tintv) { ctx->expired = 0; if (isalarm(ctx)) { ctx->ticks += alarm_forward_now( &ctx->t.alarm, ctx->tintv) - 1; alarm_restart(&ctx->t.alarm); } else { ctx->ticks += hrtimer_forward_now(&ctx->t.tmr, ctx->tintv) - 1; hrtimer_restart(&ctx->t.tmr); } } t->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx)); t->it_interval = ktime_to_timespec64(ctx->tintv); spin_unlock_irq(&ctx->wqh.lock); fdput(f); return 0; } SYSCALL_DEFINE4(timerfd_settime, int, ufd, int, flags, const struct __kernel_itimerspec __user *, utmr, struct __kernel_itimerspec __user *, otmr) { struct itimerspec64 new, old; int ret; if (get_itimerspec64(&new, utmr)) return -EFAULT; ret = do_timerfd_settime(ufd, flags, &new, &old); if (ret) return ret; if (otmr && put_itimerspec64(&old, otmr)) return -EFAULT; return ret; } SYSCALL_DEFINE2(timerfd_gettime, int, ufd, struct __kernel_itimerspec __user *, otmr) { struct itimerspec64 kotmr; int ret = do_timerfd_gettime(ufd, &kotmr); if (ret) return ret; return put_itimerspec64(&kotmr, otmr) ? -EFAULT : 0; } #ifdef CONFIG_COMPAT_32BIT_TIME SYSCALL_DEFINE4(timerfd_settime32, int, ufd, int, flags, const struct old_itimerspec32 __user *, utmr, struct old_itimerspec32 __user *, otmr) { struct itimerspec64 new, old; int ret; if (get_old_itimerspec32(&new, utmr)) return -EFAULT; ret = do_timerfd_settime(ufd, flags, &new, &old); if (ret) return ret; if (otmr && put_old_itimerspec32(&old, otmr)) return -EFAULT; return ret; } SYSCALL_DEFINE2(timerfd_gettime32, int, ufd, struct old_itimerspec32 __user *, otmr) { struct itimerspec64 kotmr; int ret = do_timerfd_gettime(ufd, &kotmr); if (ret) return ret; return put_old_itimerspec32(&kotmr, otmr) ? -EFAULT : 0; } #endif |
1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 | // SPDX-License-Identifier: GPL-2.0-or-later /* * IgorPlug-USB IR Receiver * * Copyright (C) 2014 Sean Young <sean@mess.org> * * Supports the standard homebrew IgorPlugUSB receiver with Igor's firmware. * See http://www.cesko.host.sk/IgorPlugUSB/IgorPlug-USB%20(AVR)_eng.htm * * Based on the lirc_igorplugusb.c driver: * Copyright (C) 2004 Jan M. Hochstein * <hochstein@algo.informatik.tu-darmstadt.de> */ #include <linux/device.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/usb.h> #include <linux/usb/input.h> #include <media/rc-core.h> #define DRIVER_DESC "IgorPlug-USB IR Receiver" #define DRIVER_NAME "igorplugusb" #define HEADERLEN 3 #define BUFLEN 36 #define MAX_PACKET (HEADERLEN + BUFLEN) #define SET_INFRABUFFER_EMPTY 1 #define GET_INFRACODE 2 struct igorplugusb { struct rc_dev *rc; struct device *dev; struct urb *urb; struct usb_ctrlrequest request; struct timer_list timer; u8 *buf_in; char phys[64]; }; static void igorplugusb_cmd(struct igorplugusb *ir, int cmd); static void igorplugusb_irdata(struct igorplugusb *ir, unsigned len) { struct ir_raw_event rawir = {}; unsigned i, start, overflow; dev_dbg(ir->dev, "irdata: %*ph (len=%u)", len, ir->buf_in, len); /* * If more than 36 pulses and spaces follow each other, the igorplugusb * overwrites its buffer from the beginning. The overflow value is the * last offset which was not overwritten. Everything from this offset * onwards occurred before everything until this offset. */ overflow = ir->buf_in[2]; i = start = overflow + HEADERLEN; if (start >= len) { dev_err(ir->dev, "receive overflow invalid: %u", overflow); } else { if (overflow > 0) { dev_warn(ir->dev, "receive overflow, at least %u lost", overflow); ir_raw_event_overflow(ir->rc); } do { rawir.duration = ir->buf_in[i] * 85; rawir.pulse = i & 1; ir_raw_event_store_with_filter(ir->rc, &rawir); if (++i == len) i = HEADERLEN; } while (i != start); /* add a trailing space */ rawir.duration = ir->rc->timeout; rawir.pulse = false; ir_raw_event_store_with_filter(ir->rc, &rawir); ir_raw_event_handle(ir->rc); } igorplugusb_cmd(ir, SET_INFRABUFFER_EMPTY); } static void igorplugusb_callback(struct urb *urb) { struct usb_ctrlrequest *req; struct igorplugusb *ir = urb->context; req = (struct usb_ctrlrequest *)urb->setup_packet; switch (urb->status) { case 0: if (req->bRequest == GET_INFRACODE && urb->actual_length > HEADERLEN) igorplugusb_irdata(ir, urb->actual_length); else /* request IR */ mod_timer(&ir->timer, jiffies + msecs_to_jiffies(50)); break; case -EPROTO: case -ECONNRESET: case -ENOENT: case -ESHUTDOWN: return; default: dev_warn(ir->dev, "Error: urb status = %d\n", urb->status); igorplugusb_cmd(ir, SET_INFRABUFFER_EMPTY); break; } } static void igorplugusb_cmd(struct igorplugusb *ir, int cmd) { int ret; ir->request.bRequest = cmd; ir->urb->transfer_flags = 0; ret = usb_submit_urb(ir->urb, GFP_ATOMIC); if (ret && ret != -EPERM) dev_err(ir->dev, "submit urb failed: %d", ret); } static void igorplugusb_timer(struct timer_list *t) { struct igorplugusb *ir = from_timer(ir, t, timer); igorplugusb_cmd(ir, GET_INFRACODE); } static int igorplugusb_probe(struct usb_interface *intf, const struct usb_device_id *id) { struct usb_device *udev; struct usb_host_interface *idesc; struct usb_endpoint_descriptor *ep; struct igorplugusb *ir; struct rc_dev *rc; int ret = -ENOMEM; udev = interface_to_usbdev(intf); idesc = intf->cur_altsetting; if (idesc->desc.bNumEndpoints != 1) { dev_err(&intf->dev, "incorrect number of endpoints"); return -ENODEV; } ep = &idesc->endpoint[0].desc; if (!usb_endpoint_dir_in(ep) || !usb_endpoint_xfer_control(ep)) { dev_err(&intf->dev, "endpoint incorrect"); return -ENODEV; } ir = devm_kzalloc(&intf->dev, sizeof(*ir), GFP_KERNEL); if (!ir) return -ENOMEM; ir->dev = &intf->dev; timer_setup(&ir->timer, igorplugusb_timer, 0); ir->request.bRequest = GET_INFRACODE; ir->request.bRequestType = USB_TYPE_VENDOR | USB_DIR_IN; ir->request.wLength = cpu_to_le16(MAX_PACKET); ir->urb = usb_alloc_urb(0, GFP_KERNEL); if (!ir->urb) goto fail; ir->buf_in = kmalloc(MAX_PACKET, GFP_KERNEL); if (!ir->buf_in) goto fail; usb_fill_control_urb(ir->urb, udev, usb_rcvctrlpipe(udev, 0), (uint8_t *)&ir->request, ir->buf_in, MAX_PACKET, igorplugusb_callback, ir); usb_make_path(udev, ir->phys, sizeof(ir->phys)); rc = rc_allocate_device(RC_DRIVER_IR_RAW); if (!rc) goto fail; rc->device_name = DRIVER_DESC; rc->input_phys = ir->phys; usb_to_input_id(udev, &rc->input_id); rc->dev.parent = &intf->dev; /* * This device can only store 36 pulses + spaces, which is not enough * for the NEC protocol and many others. */ rc->allowed_protocols = RC_PROTO_BIT_ALL_IR_DECODER & ~(RC_PROTO_BIT_NEC | RC_PROTO_BIT_NECX | RC_PROTO_BIT_NEC32 | RC_PROTO_BIT_RC6_6A_20 | RC_PROTO_BIT_RC6_6A_24 | RC_PROTO_BIT_RC6_6A_32 | RC_PROTO_BIT_RC6_MCE | RC_PROTO_BIT_SONY20 | RC_PROTO_BIT_SANYO); rc->priv = ir; rc->driver_name = DRIVER_NAME; rc->map_name = RC_MAP_HAUPPAUGE; rc->timeout = MS_TO_US(100); rc->rx_resolution = 85; ir->rc = rc; ret = rc_register_device(rc); if (ret) { dev_err(&intf->dev, "failed to register rc device: %d", ret); goto fail; } usb_set_intfdata(intf, ir); igorplugusb_cmd(ir, SET_INFRABUFFER_EMPTY); return 0; fail: usb_poison_urb(ir->urb); del_timer(&ir->timer); usb_unpoison_urb(ir->urb); usb_free_urb(ir->urb); rc_free_device(ir->rc); kfree(ir->buf_in); return ret; } static void igorplugusb_disconnect(struct usb_interface *intf) { struct igorplugusb *ir = usb_get_intfdata(intf); rc_unregister_device(ir->rc); usb_poison_urb(ir->urb); del_timer_sync(&ir->timer); usb_set_intfdata(intf, NULL); usb_unpoison_urb(ir->urb); usb_free_urb(ir->urb); kfree(ir->buf_in); } static const struct usb_device_id igorplugusb_table[] = { /* Igor Plug USB (Atmel's Manufact. ID) */ { USB_DEVICE(0x03eb, 0x0002) }, /* Fit PC2 Infrared Adapter */ { USB_DEVICE(0x03eb, 0x21fe) }, /* Terminating entry */ { } }; static struct usb_driver igorplugusb_driver = { .name = DRIVER_NAME, .probe = igorplugusb_probe, .disconnect = igorplugusb_disconnect, .id_table = igorplugusb_table }; module_usb_driver(igorplugusb_driver); MODULE_DESCRIPTION(DRIVER_DESC); MODULE_AUTHOR("Sean Young <sean@mess.org>"); MODULE_LICENSE("GPL"); MODULE_DEVICE_TABLE(usb, igorplugusb_table); |
2 2 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 | // SPDX-License-Identifier: GPL-2.0-only /* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES */ #include <linux/iommufd.h> #include <linux/slab.h> #include <linux/iommu.h> #include <uapi/linux/iommufd.h> #include "../iommu-priv.h" #include "io_pagetable.h" #include "iommufd_private.h" static bool allow_unsafe_interrupts; module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR); MODULE_PARM_DESC( allow_unsafe_interrupts, "Allow IOMMUFD to bind to devices even if the platform cannot isolate " "the MSI interrupt window. Enabling this is a security weakness."); static void iommufd_group_release(struct kref *kref) { struct iommufd_group *igroup = container_of(kref, struct iommufd_group, ref); WARN_ON(igroup->hwpt || !list_empty(&igroup->device_list)); xa_cmpxchg(&igroup->ictx->groups, iommu_group_id(igroup->group), igroup, NULL, GFP_KERNEL); iommu_group_put(igroup->group); mutex_destroy(&igroup->lock); kfree(igroup); } static void iommufd_put_group(struct iommufd_group *group) { kref_put(&group->ref, iommufd_group_release); } static bool iommufd_group_try_get(struct iommufd_group *igroup, struct iommu_group *group) { if (!igroup) return false; /* * group ID's cannot be re-used until the group is put back which does * not happen if we could get an igroup pointer under the xa_lock. */ if (WARN_ON(igroup->group != group)) return false; return kref_get_unless_zero(&igroup->ref); } /* * iommufd needs to store some more data for each iommu_group, we keep a * parallel xarray indexed by iommu_group id to hold this instead of putting it * in the core structure. To keep things simple the iommufd_group memory is * unique within the iommufd_ctx. This makes it easy to check there are no * memory leaks. */ static struct iommufd_group *iommufd_get_group(struct iommufd_ctx *ictx, struct device *dev) { struct iommufd_group *new_igroup; struct iommufd_group *cur_igroup; struct iommufd_group *igroup; struct iommu_group *group; unsigned int id; group = iommu_group_get(dev); if (!group) return ERR_PTR(-ENODEV); id = iommu_group_id(group); xa_lock(&ictx->groups); igroup = xa_load(&ictx->groups, id); if (iommufd_group_try_get(igroup, group)) { xa_unlock(&ictx->groups); iommu_group_put(group); return igroup; } xa_unlock(&ictx->groups); new_igroup = kzalloc(sizeof(*new_igroup), GFP_KERNEL); if (!new_igroup) { iommu_group_put(group); return ERR_PTR(-ENOMEM); } kref_init(&new_igroup->ref); mutex_init(&new_igroup->lock); INIT_LIST_HEAD(&new_igroup->device_list); new_igroup->sw_msi_start = PHYS_ADDR_MAX; /* group reference moves into new_igroup */ new_igroup->group = group; /* * The ictx is not additionally refcounted here becase all objects using * an igroup must put it before their destroy completes. */ new_igroup->ictx = ictx; /* * We dropped the lock so igroup is invalid. NULL is a safe and likely * value to assume for the xa_cmpxchg algorithm. */ cur_igroup = NULL; xa_lock(&ictx->groups); while (true) { igroup = __xa_cmpxchg(&ictx->groups, id, cur_igroup, new_igroup, GFP_KERNEL); if (xa_is_err(igroup)) { xa_unlock(&ictx->groups); iommufd_put_group(new_igroup); return ERR_PTR(xa_err(igroup)); } /* new_group was successfully installed */ if (cur_igroup == igroup) { xa_unlock(&ictx->groups); return new_igroup; } /* Check again if the current group is any good */ if (iommufd_group_try_get(igroup, group)) { xa_unlock(&ictx->groups); iommufd_put_group(new_igroup); return igroup; } cur_igroup = igroup; } } void iommufd_device_destroy(struct iommufd_object *obj) { struct iommufd_device *idev = container_of(obj, struct iommufd_device, obj); iommu_device_release_dma_owner(idev->dev); iommufd_put_group(idev->igroup); if (!iommufd_selftest_is_mock_dev(idev->dev)) iommufd_ctx_put(idev->ictx); } /** * iommufd_device_bind - Bind a physical device to an iommu fd * @ictx: iommufd file descriptor * @dev: Pointer to a physical device struct * @id: Output ID number to return to userspace for this device * * A successful bind establishes an ownership over the device and returns * struct iommufd_device pointer, otherwise returns error pointer. * * A driver using this API must set driver_managed_dma and must not touch * the device until this routine succeeds and establishes ownership. * * Binding a PCI device places the entire RID under iommufd control. * * The caller must undo this with iommufd_device_unbind() */ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx, struct device *dev, u32 *id) { struct iommufd_device *idev; struct iommufd_group *igroup; int rc; /* * iommufd always sets IOMMU_CACHE because we offer no way for userspace * to restore cache coherency. */ if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY)) return ERR_PTR(-EINVAL); igroup = iommufd_get_group(ictx, dev); if (IS_ERR(igroup)) return ERR_CAST(igroup); /* * For historical compat with VFIO the insecure interrupt path is * allowed if the module parameter is set. Secure/Isolated means that a * MemWr operation from the device (eg a simple DMA) cannot trigger an * interrupt outside this iommufd context. */ if (!iommufd_selftest_is_mock_dev(dev) && !iommu_group_has_isolated_msi(igroup->group)) { if (!allow_unsafe_interrupts) { rc = -EPERM; goto out_group_put; } dev_warn( dev, "MSI interrupts are not secure, they cannot be isolated by the platform. " "Check that platform features like interrupt remapping are enabled. " "Use the \"allow_unsafe_interrupts\" module parameter to override\n"); } rc = iommu_device_claim_dma_owner(dev, ictx); if (rc) goto out_group_put; idev = iommufd_object_alloc(ictx, idev, IOMMUFD_OBJ_DEVICE); if (IS_ERR(idev)) { rc = PTR_ERR(idev); goto out_release_owner; } idev->ictx = ictx; if (!iommufd_selftest_is_mock_dev(dev)) iommufd_ctx_get(ictx); idev->dev = dev; idev->enforce_cache_coherency = device_iommu_capable(dev, IOMMU_CAP_ENFORCE_CACHE_COHERENCY); /* The calling driver is a user until iommufd_device_unbind() */ refcount_inc(&idev->obj.users); /* igroup refcount moves into iommufd_device */ idev->igroup = igroup; /* * If the caller fails after this success it must call * iommufd_unbind_device() which is safe since we hold this refcount. * This also means the device is a leaf in the graph and no other object * can take a reference on it. */ iommufd_object_finalize(ictx, &idev->obj); *id = idev->obj.id; return idev; out_release_owner: iommu_device_release_dma_owner(dev); out_group_put: iommufd_put_group(igroup); return ERR_PTR(rc); } EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, IOMMUFD); /** * iommufd_ctx_has_group - True if any device within the group is bound * to the ictx * @ictx: iommufd file descriptor * @group: Pointer to a physical iommu_group struct * * True if any device within the group has been bound to this ictx, ex. via * iommufd_device_bind(), therefore implying ictx ownership of the group. */ bool iommufd_ctx_has_group(struct iommufd_ctx *ictx, struct iommu_group *group) { struct iommufd_object *obj; unsigned long index; if (!ictx || !group) return false; xa_lock(&ictx->objects); xa_for_each(&ictx->objects, index, obj) { if (obj->type == IOMMUFD_OBJ_DEVICE && container_of(obj, struct iommufd_device, obj) ->igroup->group == group) { xa_unlock(&ictx->objects); return true; } } xa_unlock(&ictx->objects); return false; } EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD); /** * iommufd_device_unbind - Undo iommufd_device_bind() * @idev: Device returned by iommufd_device_bind() * * Release the device from iommufd control. The DMA ownership will return back * to unowned with DMA controlled by the DMA API. This invalidates the * iommufd_device pointer, other APIs that consume it must not be called * concurrently. */ void iommufd_device_unbind(struct iommufd_device *idev) { iommufd_object_destroy_user(idev->ictx, &idev->obj); } EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD); struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev) { return idev->ictx; } EXPORT_SYMBOL_NS_GPL(iommufd_device_to_ictx, IOMMUFD); u32 iommufd_device_to_id(struct iommufd_device *idev) { return idev->obj.id; } EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD); static int iommufd_group_setup_msi(struct iommufd_group *igroup, struct iommufd_hwpt_paging *hwpt_paging) { phys_addr_t sw_msi_start = igroup->sw_msi_start; int rc; /* * If the IOMMU driver gives a IOMMU_RESV_SW_MSI then it is asking us to * call iommu_get_msi_cookie() on its behalf. This is necessary to setup * the MSI window so iommu_dma_prepare_msi() can install pages into our * domain after request_irq(). If it is not done interrupts will not * work on this domain. * * FIXME: This is conceptually broken for iommufd since we want to allow * userspace to change the domains, eg switch from an identity IOAS to a * DMA IOAS. There is currently no way to create a MSI window that * matches what the IRQ layer actually expects in a newly created * domain. */ if (sw_msi_start != PHYS_ADDR_MAX && !hwpt_paging->msi_cookie) { rc = iommu_get_msi_cookie(hwpt_paging->common.domain, sw_msi_start); if (rc) return rc; /* * iommu_get_msi_cookie() can only be called once per domain, * it returns -EBUSY on later calls. */ hwpt_paging->msi_cookie = true; } return 0; } static int iommufd_hwpt_paging_attach(struct iommufd_hwpt_paging *hwpt_paging, struct iommufd_device *idev) { int rc; lockdep_assert_held(&idev->igroup->lock); rc = iopt_table_enforce_dev_resv_regions(&hwpt_paging->ioas->iopt, idev->dev, &idev->igroup->sw_msi_start); if (rc) return rc; if (list_empty(&idev->igroup->device_list)) { rc = iommufd_group_setup_msi(idev->igroup, hwpt_paging); if (rc) { iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev); return rc; } } return 0; } int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev) { int rc; mutex_lock(&idev->igroup->lock); if (idev->igroup->hwpt != NULL && idev->igroup->hwpt != hwpt) { rc = -EINVAL; goto err_unlock; } if (hwpt_is_paging(hwpt)) { rc = iommufd_hwpt_paging_attach(to_hwpt_paging(hwpt), idev); if (rc) goto err_unlock; } /* * Only attach to the group once for the first device that is in the * group. All the other devices will follow this attachment. The user * should attach every device individually to the hwpt as the per-device * reserved regions are only updated during individual device * attachment. */ if (list_empty(&idev->igroup->device_list)) { rc = iommu_attach_group(hwpt->domain, idev->igroup->group); if (rc) goto err_unresv; idev->igroup->hwpt = hwpt; } refcount_inc(&hwpt->obj.users); list_add_tail(&idev->group_item, &idev->igroup->device_list); mutex_unlock(&idev->igroup->lock); return 0; err_unresv: if (hwpt_is_paging(hwpt)) iopt_remove_reserved_iova(&to_hwpt_paging(hwpt)->ioas->iopt, idev->dev); err_unlock: mutex_unlock(&idev->igroup->lock); return rc; } struct iommufd_hw_pagetable * iommufd_hw_pagetable_detach(struct iommufd_device *idev) { struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt; mutex_lock(&idev->igroup->lock); list_del(&idev->group_item); if (list_empty(&idev->igroup->device_list)) { iommu_detach_group(hwpt->domain, idev->igroup->group); idev->igroup->hwpt = NULL; } if (hwpt_is_paging(hwpt)) iopt_remove_reserved_iova(&to_hwpt_paging(hwpt)->ioas->iopt, idev->dev); mutex_unlock(&idev->igroup->lock); /* Caller must destroy hwpt */ return hwpt; } static struct iommufd_hw_pagetable * iommufd_device_do_attach(struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt) { int rc; rc = iommufd_hw_pagetable_attach(hwpt, idev); if (rc) return ERR_PTR(rc); return NULL; } static void iommufd_group_remove_reserved_iova(struct iommufd_group *igroup, struct iommufd_hwpt_paging *hwpt_paging) { struct iommufd_device *cur; lockdep_assert_held(&igroup->lock); list_for_each_entry(cur, &igroup->device_list, group_item) iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, cur->dev); } static int iommufd_group_do_replace_paging(struct iommufd_group *igroup, struct iommufd_hwpt_paging *hwpt_paging) { struct iommufd_hw_pagetable *old_hwpt = igroup->hwpt; struct iommufd_device *cur; int rc; lockdep_assert_held(&igroup->lock); if (!hwpt_is_paging(old_hwpt) || hwpt_paging->ioas != to_hwpt_paging(old_hwpt)->ioas) { list_for_each_entry(cur, &igroup->device_list, group_item) { rc = iopt_table_enforce_dev_resv_regions( &hwpt_paging->ioas->iopt, cur->dev, NULL); if (rc) goto err_unresv; } } rc = iommufd_group_setup_msi(igroup, hwpt_paging); if (rc) goto err_unresv; return 0; err_unresv: iommufd_group_remove_reserved_iova(igroup, hwpt_paging); return rc; } static struct iommufd_hw_pagetable * iommufd_device_do_replace(struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt) { struct iommufd_group *igroup = idev->igroup; struct iommufd_hw_pagetable *old_hwpt; unsigned int num_devices; int rc; mutex_lock(&idev->igroup->lock); if (igroup->hwpt == NULL) { rc = -EINVAL; goto err_unlock; } if (hwpt == igroup->hwpt) { mutex_unlock(&idev->igroup->lock); return NULL; } old_hwpt = igroup->hwpt; if (hwpt_is_paging(hwpt)) { rc = iommufd_group_do_replace_paging(igroup, to_hwpt_paging(hwpt)); if (rc) goto err_unlock; } rc = iommu_group_replace_domain(igroup->group, hwpt->domain); if (rc) goto err_unresv; if (hwpt_is_paging(old_hwpt) && (!hwpt_is_paging(hwpt) || to_hwpt_paging(hwpt)->ioas != to_hwpt_paging(old_hwpt)->ioas)) iommufd_group_remove_reserved_iova(igroup, to_hwpt_paging(old_hwpt)); igroup->hwpt = hwpt; num_devices = list_count_nodes(&igroup->device_list); /* * Move the refcounts held by the device_list to the new hwpt. Retain a * refcount for this thread as the caller will free it. */ refcount_add(num_devices, &hwpt->obj.users); if (num_devices > 1) WARN_ON(refcount_sub_and_test(num_devices - 1, &old_hwpt->obj.users)); mutex_unlock(&idev->igroup->lock); /* Caller must destroy old_hwpt */ return old_hwpt; err_unresv: if (hwpt_is_paging(hwpt)) iommufd_group_remove_reserved_iova(igroup, to_hwpt_paging(old_hwpt)); err_unlock: mutex_unlock(&idev->igroup->lock); return ERR_PTR(rc); } typedef struct iommufd_hw_pagetable *(*attach_fn)( struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt); /* * When automatically managing the domains we search for a compatible domain in * the iopt and if one is found use it, otherwise create a new domain. * Automatic domain selection will never pick a manually created domain. */ static struct iommufd_hw_pagetable * iommufd_device_auto_get_domain(struct iommufd_device *idev, struct iommufd_ioas *ioas, u32 *pt_id, attach_fn do_attach) { /* * iommufd_hw_pagetable_attach() is called by * iommufd_hw_pagetable_alloc() in immediate attachment mode, same as * iommufd_device_do_attach(). So if we are in this mode then we prefer * to use the immediate_attach path as it supports drivers that can't * directly allocate a domain. */ bool immediate_attach = do_attach == iommufd_device_do_attach; struct iommufd_hw_pagetable *destroy_hwpt; struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_hw_pagetable *hwpt; /* * There is no differentiation when domains are allocated, so any domain * that is willing to attach to the device is interchangeable with any * other. */ mutex_lock(&ioas->mutex); list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { if (!hwpt_paging->auto_domain) continue; hwpt = &hwpt_paging->common; if (!iommufd_lock_obj(&hwpt->obj)) continue; destroy_hwpt = (*do_attach)(idev, hwpt); if (IS_ERR(destroy_hwpt)) { iommufd_put_object(idev->ictx, &hwpt->obj); /* * -EINVAL means the domain is incompatible with the * device. Other error codes should propagate to * userspace as failure. Success means the domain is * attached. */ if (PTR_ERR(destroy_hwpt) == -EINVAL) continue; goto out_unlock; } *pt_id = hwpt->obj.id; iommufd_put_object(idev->ictx, &hwpt->obj); goto out_unlock; } hwpt_paging = iommufd_hwpt_paging_alloc(idev->ictx, ioas, idev, 0, immediate_attach, NULL); if (IS_ERR(hwpt_paging)) { destroy_hwpt = ERR_CAST(hwpt_paging); goto out_unlock; } hwpt = &hwpt_paging->common; if (!immediate_attach) { destroy_hwpt = (*do_attach)(idev, hwpt); if (IS_ERR(destroy_hwpt)) goto out_abort; } else { destroy_hwpt = NULL; } hwpt_paging->auto_domain = true; *pt_id = hwpt->obj.id; iommufd_object_finalize(idev->ictx, &hwpt->obj); mutex_unlock(&ioas->mutex); return destroy_hwpt; out_abort: iommufd_object_abort_and_destroy(idev->ictx, &hwpt->obj); out_unlock: mutex_unlock(&ioas->mutex); return destroy_hwpt; } static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id, attach_fn do_attach) { struct iommufd_hw_pagetable *destroy_hwpt; struct iommufd_object *pt_obj; pt_obj = iommufd_get_object(idev->ictx, *pt_id, IOMMUFD_OBJ_ANY); if (IS_ERR(pt_obj)) return PTR_ERR(pt_obj); switch (pt_obj->type) { case IOMMUFD_OBJ_HWPT_NESTED: case IOMMUFD_OBJ_HWPT_PAGING: { struct iommufd_hw_pagetable *hwpt = container_of(pt_obj, struct iommufd_hw_pagetable, obj); destroy_hwpt = (*do_attach)(idev, hwpt); if (IS_ERR(destroy_hwpt)) goto out_put_pt_obj; break; } case IOMMUFD_OBJ_IOAS: { struct iommufd_ioas *ioas = container_of(pt_obj, struct iommufd_ioas, obj); destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id, do_attach); if (IS_ERR(destroy_hwpt)) goto out_put_pt_obj; break; } default: destroy_hwpt = ERR_PTR(-EINVAL); goto out_put_pt_obj; } iommufd_put_object(idev->ictx, pt_obj); /* This destruction has to be after we unlock everything */ if (destroy_hwpt) iommufd_hw_pagetable_put(idev->ictx, destroy_hwpt); return 0; out_put_pt_obj: iommufd_put_object(idev->ictx, pt_obj); return PTR_ERR(destroy_hwpt); } /** * iommufd_device_attach - Connect a device to an iommu_domain * @idev: device to attach * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING * Output the IOMMUFD_OBJ_HWPT_PAGING ID * * This connects the device to an iommu_domain, either automatically or manually * selected. Once this completes the device could do DMA. * * The caller should return the resulting pt_id back to userspace. * This function is undone by calling iommufd_device_detach(). */ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id) { int rc; rc = iommufd_device_change_pt(idev, pt_id, &iommufd_device_do_attach); if (rc) return rc; /* * Pairs with iommufd_device_detach() - catches caller bugs attempting * to destroy a device with an attachment. */ refcount_inc(&idev->obj.users); return 0; } EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD); /** * iommufd_device_replace - Change the device's iommu_domain * @idev: device to change * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING * Output the IOMMUFD_OBJ_HWPT_PAGING ID * * This is the same as:: * * iommufd_device_detach(); * iommufd_device_attach(); * * If it fails then no change is made to the attachment. The iommu driver may * implement this so there is no disruption in translation. This can only be * called if iommufd_device_attach() has already succeeded. */ int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id) { return iommufd_device_change_pt(idev, pt_id, &iommufd_device_do_replace); } EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, IOMMUFD); /** * iommufd_device_detach - Disconnect a device to an iommu_domain * @idev: device to detach * * Undo iommufd_device_attach(). This disconnects the idev from the previously * attached pt_id. The device returns back to a blocked DMA translation. */ void iommufd_device_detach(struct iommufd_device *idev) { struct iommufd_hw_pagetable *hwpt; hwpt = iommufd_hw_pagetable_detach(idev); iommufd_hw_pagetable_put(idev->ictx, hwpt); refcount_dec(&idev->obj.users); } EXPORT_SYMBOL_NS_GPL(iommufd_device_detach, IOMMUFD); /* * On success, it will refcount_inc() at a valid new_ioas and refcount_dec() at * a valid cur_ioas (access->ioas). A caller passing in a valid new_ioas should * call iommufd_put_object() if it does an iommufd_get_object() for a new_ioas. */ static int iommufd_access_change_ioas(struct iommufd_access *access, struct iommufd_ioas *new_ioas) { u32 iopt_access_list_id = access->iopt_access_list_id; struct iommufd_ioas *cur_ioas = access->ioas; int rc; lockdep_assert_held(&access->ioas_lock); /* We are racing with a concurrent detach, bail */ if (cur_ioas != access->ioas_unpin) return -EBUSY; if (cur_ioas == new_ioas) return 0; /* * Set ioas to NULL to block any further iommufd_access_pin_pages(). * iommufd_access_unpin_pages() can continue using access->ioas_unpin. */ access->ioas = NULL; if (new_ioas) { rc = iopt_add_access(&new_ioas->iopt, access); if (rc) { access->ioas = cur_ioas; return rc; } refcount_inc(&new_ioas->obj.users); } if (cur_ioas) { if (access->ops->unmap) { mutex_unlock(&access->ioas_lock); access->ops->unmap(access->data, 0, ULONG_MAX); mutex_lock(&access->ioas_lock); } iopt_remove_access(&cur_ioas->iopt, access, iopt_access_list_id); refcount_dec(&cur_ioas->obj.users); } access->ioas = new_ioas; access->ioas_unpin = new_ioas; return 0; } static int iommufd_access_change_ioas_id(struct iommufd_access *access, u32 id) { struct iommufd_ioas *ioas = iommufd_get_ioas(access->ictx, id); int rc; if (IS_ERR(ioas)) return PTR_ERR(ioas); rc = iommufd_access_change_ioas(access, ioas); iommufd_put_object(access->ictx, &ioas->obj); return rc; } void iommufd_access_destroy_object(struct iommufd_object *obj) { struct iommufd_access *access = container_of(obj, struct iommufd_access, obj); mutex_lock(&access->ioas_lock); if (access->ioas) WARN_ON(iommufd_access_change_ioas(access, NULL)); mutex_unlock(&access->ioas_lock); iommufd_ctx_put(access->ictx); } /** * iommufd_access_create - Create an iommufd_access * @ictx: iommufd file descriptor * @ops: Driver's ops to associate with the access * @data: Opaque data to pass into ops functions * @id: Output ID number to return to userspace for this access * * An iommufd_access allows a driver to read/write to the IOAS without using * DMA. The underlying CPU memory can be accessed using the * iommufd_access_pin_pages() or iommufd_access_rw() functions. * * The provided ops are required to use iommufd_access_pin_pages(). */ struct iommufd_access * iommufd_access_create(struct iommufd_ctx *ictx, const struct iommufd_access_ops *ops, void *data, u32 *id) { struct iommufd_access *access; /* * There is no uAPI for the access object, but to keep things symmetric * use the object infrastructure anyhow. */ access = iommufd_object_alloc(ictx, access, IOMMUFD_OBJ_ACCESS); if (IS_ERR(access)) return access; access->data = data; access->ops = ops; if (ops->needs_pin_pages) access->iova_alignment = PAGE_SIZE; else access->iova_alignment = 1; /* The calling driver is a user until iommufd_access_destroy() */ refcount_inc(&access->obj.users); access->ictx = ictx; iommufd_ctx_get(ictx); iommufd_object_finalize(ictx, &access->obj); *id = access->obj.id; mutex_init(&access->ioas_lock); return access; } EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD); /** * iommufd_access_destroy - Destroy an iommufd_access * @access: The access to destroy * * The caller must stop using the access before destroying it. */ void iommufd_access_destroy(struct iommufd_access *access) { iommufd_object_destroy_user(access->ictx, &access->obj); } EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD); void iommufd_access_detach(struct iommufd_access *access) { mutex_lock(&access->ioas_lock); if (WARN_ON(!access->ioas)) { mutex_unlock(&access->ioas_lock); return; } WARN_ON(iommufd_access_change_ioas(access, NULL)); mutex_unlock(&access->ioas_lock); } EXPORT_SYMBOL_NS_GPL(iommufd_access_detach, IOMMUFD); int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id) { int rc; mutex_lock(&access->ioas_lock); if (WARN_ON(access->ioas)) { mutex_unlock(&access->ioas_lock); return -EINVAL; } rc = iommufd_access_change_ioas_id(access, ioas_id); mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_attach, IOMMUFD); int iommufd_access_replace(struct iommufd_access *access, u32 ioas_id) { int rc; mutex_lock(&access->ioas_lock); if (!access->ioas) { mutex_unlock(&access->ioas_lock); return -ENOENT; } rc = iommufd_access_change_ioas_id(access, ioas_id); mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_replace, IOMMUFD); /** * iommufd_access_notify_unmap - Notify users of an iopt to stop using it * @iopt: iopt to work on * @iova: Starting iova in the iopt * @length: Number of bytes * * After this function returns there should be no users attached to the pages * linked to this iopt that intersect with iova,length. Anyone that has attached * a user through iopt_access_pages() needs to detach it through * iommufd_access_unpin_pages() before this function returns. * * iommufd_access_destroy() will wait for any outstanding unmap callback to * complete. Once iommufd_access_destroy() no unmap ops are running or will * run in the future. Due to this a driver must not create locking that prevents * unmap to complete while iommufd_access_destroy() is running. */ void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova, unsigned long length) { struct iommufd_ioas *ioas = container_of(iopt, struct iommufd_ioas, iopt); struct iommufd_access *access; unsigned long index; xa_lock(&ioas->iopt.access_list); xa_for_each(&ioas->iopt.access_list, index, access) { if (!iommufd_lock_obj(&access->obj)) continue; xa_unlock(&ioas->iopt.access_list); access->ops->unmap(access->data, iova, length); iommufd_put_object(access->ictx, &access->obj); xa_lock(&ioas->iopt.access_list); } xa_unlock(&ioas->iopt.access_list); } /** * iommufd_access_unpin_pages() - Undo iommufd_access_pin_pages * @access: IOAS access to act on * @iova: Starting IOVA * @length: Number of bytes to access * * Return the struct page's. The caller must stop accessing them before calling * this. The iova/length must exactly match the one provided to access_pages. */ void iommufd_access_unpin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length) { struct iopt_area_contig_iter iter; struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; if (WARN_ON(!length) || WARN_ON(check_add_overflow(iova, length - 1, &last_iova))) return; mutex_lock(&access->ioas_lock); /* * The driver must be doing something wrong if it calls this before an * iommufd_access_attach() or after an iommufd_access_detach(). */ if (WARN_ON(!access->ioas_unpin)) { mutex_unlock(&access->ioas_lock); return; } iopt = &access->ioas_unpin->iopt; down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) iopt_area_remove_access( area, iopt_area_iova_to_index(area, iter.cur_iova), iopt_area_iova_to_index( area, min(last_iova, iopt_area_last_iova(area)))); WARN_ON(!iopt_area_contig_done(&iter)); up_read(&iopt->iova_rwsem); mutex_unlock(&access->ioas_lock); } EXPORT_SYMBOL_NS_GPL(iommufd_access_unpin_pages, IOMMUFD); static bool iopt_area_contig_is_aligned(struct iopt_area_contig_iter *iter) { if (iopt_area_start_byte(iter->area, iter->cur_iova) % PAGE_SIZE) return false; if (!iopt_area_contig_done(iter) && (iopt_area_start_byte(iter->area, iopt_area_last_iova(iter->area)) % PAGE_SIZE) != (PAGE_SIZE - 1)) return false; return true; } static bool check_area_prot(struct iopt_area *area, unsigned int flags) { if (flags & IOMMUFD_ACCESS_RW_WRITE) return area->iommu_prot & IOMMU_WRITE; return area->iommu_prot & IOMMU_READ; } /** * iommufd_access_pin_pages() - Return a list of pages under the iova * @access: IOAS access to act on * @iova: Starting IOVA * @length: Number of bytes to access * @out_pages: Output page list * @flags: IOPMMUFD_ACCESS_RW_* flags * * Reads @length bytes starting at iova and returns the struct page * pointers. * These can be kmap'd by the caller for CPU access. * * The caller must perform iommufd_access_unpin_pages() when done to balance * this. * * This API always requires a page aligned iova. This happens naturally if the * ioas alignment is >= PAGE_SIZE and the iova is PAGE_SIZE aligned. However * smaller alignments have corner cases where this API can fail on otherwise * aligned iova. */ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length, struct page **out_pages, unsigned int flags) { struct iopt_area_contig_iter iter; struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; int rc; /* Driver's ops don't support pin_pages */ if (IS_ENABLED(CONFIG_IOMMUFD_TEST) && WARN_ON(access->iova_alignment != PAGE_SIZE || !access->ops->unmap)) return -EINVAL; if (!length) return -EINVAL; if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; mutex_lock(&access->ioas_lock); if (!access->ioas) { mutex_unlock(&access->ioas_lock); return -ENOENT; } iopt = &access->ioas->iopt; down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); unsigned long last_index = iopt_area_iova_to_index(area, last); unsigned long index = iopt_area_iova_to_index(area, iter.cur_iova); if (area->prevent_access || !iopt_area_contig_is_aligned(&iter)) { rc = -EINVAL; goto err_remove; } if (!check_area_prot(area, flags)) { rc = -EPERM; goto err_remove; } rc = iopt_area_add_access(area, index, last_index, out_pages, flags); if (rc) goto err_remove; out_pages += last_index - index + 1; } if (!iopt_area_contig_done(&iter)) { rc = -ENOENT; goto err_remove; } up_read(&iopt->iova_rwsem); mutex_unlock(&access->ioas_lock); return 0; err_remove: if (iova < iter.cur_iova) { last_iova = iter.cur_iova - 1; iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) iopt_area_remove_access( area, iopt_area_iova_to_index(area, iter.cur_iova), iopt_area_iova_to_index( area, min(last_iova, iopt_area_last_iova(area)))); } up_read(&iopt->iova_rwsem); mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD); /** * iommufd_access_rw - Read or write data under the iova * @access: IOAS access to act on * @iova: Starting IOVA * @data: Kernel buffer to copy to/from * @length: Number of bytes to access * @flags: IOMMUFD_ACCESS_RW_* flags * * Copy kernel to/from data into the range given by IOVA/length. If flags * indicates IOMMUFD_ACCESS_RW_KTHREAD then a large copy can be optimized * by changing it into copy_to/from_user(). */ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, void *data, size_t length, unsigned int flags) { struct iopt_area_contig_iter iter; struct io_pagetable *iopt; struct iopt_area *area; unsigned long last_iova; int rc; if (!length) return -EINVAL; if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; mutex_lock(&access->ioas_lock); if (!access->ioas) { mutex_unlock(&access->ioas_lock); return -ENOENT; } iopt = &access->ioas->iopt; down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); unsigned long bytes = (last - iter.cur_iova) + 1; if (area->prevent_access) { rc = -EINVAL; goto err_out; } if (!check_area_prot(area, flags)) { rc = -EPERM; goto err_out; } rc = iopt_pages_rw_access( area->pages, iopt_area_start_byte(area, iter.cur_iova), data, bytes, flags); if (rc) goto err_out; data += bytes; } if (!iopt_area_contig_done(&iter)) rc = -ENOENT; err_out: up_read(&iopt->iova_rwsem); mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_rw, IOMMUFD); int iommufd_get_hw_info(struct iommufd_ucmd *ucmd) { struct iommu_hw_info *cmd = ucmd->cmd; void __user *user_ptr = u64_to_user_ptr(cmd->data_uptr); const struct iommu_ops *ops; struct iommufd_device *idev; unsigned int data_len; unsigned int copy_len; void *data; int rc; if (cmd->flags || cmd->__reserved) return -EOPNOTSUPP; idev = iommufd_get_device(ucmd, cmd->dev_id); if (IS_ERR(idev)) return PTR_ERR(idev); ops = dev_iommu_ops(idev->dev); if (ops->hw_info) { data = ops->hw_info(idev->dev, &data_len, &cmd->out_data_type); if (IS_ERR(data)) { rc = PTR_ERR(data); goto out_put; } /* * drivers that have hw_info callback should have a unique * iommu_hw_info_type. */ if (WARN_ON_ONCE(cmd->out_data_type == IOMMU_HW_INFO_TYPE_NONE)) { rc = -ENODEV; goto out_free; } } else { cmd->out_data_type = IOMMU_HW_INFO_TYPE_NONE; data_len = 0; data = NULL; } copy_len = min(cmd->data_len, data_len); if (copy_to_user(user_ptr, data, copy_len)) { rc = -EFAULT; goto out_free; } /* * Zero the trailing bytes if the user buffer is bigger than the * data size kernel actually has. */ if (copy_len < cmd->data_len) { if (clear_user(user_ptr + copy_len, cmd->data_len - copy_len)) { rc = -EFAULT; goto out_free; } } /* * We return the length the kernel supports so userspace may know what * the kernel capability is. It could be larger than the input buffer. */ cmd->data_len = data_len; cmd->out_capabilities = 0; if (device_iommu_capable(idev->dev, IOMMU_CAP_DIRTY_TRACKING)) cmd->out_capabilities |= IOMMU_HW_CAP_DIRTY_TRACKING; rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); out_free: kfree(data); out_put: iommufd_put_object(ucmd->ictx, &idev->obj); return rc; } |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _VHOST_H #define _VHOST_H #include <linux/eventfd.h> #include <linux/vhost.h> #include <linux/mm.h> #include <linux/mutex.h> #include <linux/poll.h> #include <linux/file.h> #include <linux/uio.h> #include <linux/virtio_config.h> #include <linux/virtio_ring.h> #include <linux/atomic.h> #include <linux/vhost_iotlb.h> #include <linux/irqbypass.h> struct vhost_work; struct vhost_task; typedef void (*vhost_work_fn_t)(struct vhost_work *work); #define VHOST_WORK_QUEUED 1 struct vhost_work { struct llist_node node; vhost_work_fn_t fn; unsigned long flags; }; struct vhost_worker { struct vhost_task *vtsk; /* Used to serialize device wide flushing with worker swapping. */ struct mutex mutex; struct llist_head work_list; u64 kcov_handle; u32 id; int attachment_cnt; }; /* Poll a file (eventfd or socket) */ /* Note: there's nothing vhost specific about this structure. */ struct vhost_poll { poll_table table; wait_queue_head_t *wqh; wait_queue_entry_t wait; struct vhost_work work; __poll_t mask; struct vhost_dev *dev; struct vhost_virtqueue *vq; }; void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn, __poll_t mask, struct vhost_dev *dev, struct vhost_virtqueue *vq); int vhost_poll_start(struct vhost_poll *poll, struct file *file); void vhost_poll_stop(struct vhost_poll *poll); void vhost_poll_queue(struct vhost_poll *poll); void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn); void vhost_dev_flush(struct vhost_dev *dev); struct vhost_log { u64 addr; u64 len; }; enum vhost_uaddr_type { VHOST_ADDR_DESC = 0, VHOST_ADDR_AVAIL = 1, VHOST_ADDR_USED = 2, VHOST_NUM_ADDRS = 3, }; struct vhost_vring_call { struct eventfd_ctx *ctx; struct irq_bypass_producer producer; }; /* The virtqueue structure describes a queue attached to a device. */ struct vhost_virtqueue { struct vhost_dev *dev; struct vhost_worker __rcu *worker; /* The actual ring of buffers. */ struct mutex mutex; unsigned int num; vring_desc_t __user *desc; vring_avail_t __user *avail; vring_used_t __user *used; const struct vhost_iotlb_map *meta_iotlb[VHOST_NUM_ADDRS]; struct file *kick; struct vhost_vring_call call_ctx; struct eventfd_ctx *error_ctx; struct eventfd_ctx *log_ctx; struct vhost_poll poll; /* The routine to call when the Guest pings us, or timeout. */ vhost_work_fn_t handle_kick; /* Last available index we saw. * Values are limited to 0x7fff, and the high bit is used as * a wrap counter when using VIRTIO_F_RING_PACKED. */ u16 last_avail_idx; /* Caches available index value from user. */ u16 avail_idx; /* Last index we used. * Values are limited to 0x7fff, and the high bit is used as * a wrap counter when using VIRTIO_F_RING_PACKED. */ u16 last_used_idx; /* Used flags */ u16 used_flags; /* Last used index value we have signalled on */ u16 signalled_used; /* Last used index value we have signalled on */ bool signalled_used_valid; /* Log writes to used structure. */ bool log_used; u64 log_addr; struct iovec iov[UIO_MAXIOV]; struct iovec iotlb_iov[64]; struct iovec *indirect; struct vring_used_elem *heads; /* Protected by virtqueue mutex. */ struct vhost_iotlb *umem; struct vhost_iotlb *iotlb; void *private_data; u64 acked_features; u64 acked_backend_features; /* Log write descriptors */ void __user *log_base; struct vhost_log *log; struct iovec log_iov[64]; /* Ring endianness. Defaults to legacy native endianness. * Set to true when starting a modern virtio device. */ bool is_le; #ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY /* Ring endianness requested by userspace for cross-endian support. */ bool user_be; #endif u32 busyloop_timeout; }; struct vhost_msg_node { union { struct vhost_msg msg; struct vhost_msg_v2 msg_v2; }; struct vhost_virtqueue *vq; struct list_head node; }; struct vhost_dev { struct mm_struct *mm; struct mutex mutex; struct vhost_virtqueue **vqs; int nvqs; struct eventfd_ctx *log_ctx; struct vhost_iotlb *umem; struct vhost_iotlb *iotlb; spinlock_t iotlb_lock; struct list_head read_list; struct list_head pending_list; wait_queue_head_t wait; int iov_limit; int weight; int byte_weight; struct xarray worker_xa; bool use_worker; int (*msg_handler)(struct vhost_dev *dev, u32 asid, struct vhost_iotlb_msg *msg); }; bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int total_len); void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int nvqs, int iov_limit, int weight, int byte_weight, bool use_worker, int (*msg_handler)(struct vhost_dev *dev, u32 asid, struct vhost_iotlb_msg *msg)); long vhost_dev_set_owner(struct vhost_dev *dev); bool vhost_dev_has_owner(struct vhost_dev *dev); long vhost_dev_check_owner(struct vhost_dev *); struct vhost_iotlb *vhost_dev_reset_owner_prepare(void); void vhost_dev_reset_owner(struct vhost_dev *dev, struct vhost_iotlb *iotlb); void vhost_dev_cleanup(struct vhost_dev *); void vhost_dev_stop(struct vhost_dev *); long vhost_dev_ioctl(struct vhost_dev *, unsigned int ioctl, void __user *argp); long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *argp); long vhost_worker_ioctl(struct vhost_dev *dev, unsigned int ioctl, void __user *argp); bool vhost_vq_access_ok(struct vhost_virtqueue *vq); bool vhost_log_access_ok(struct vhost_dev *); void vhost_clear_msg(struct vhost_dev *dev); int vhost_get_vq_desc(struct vhost_virtqueue *, struct iovec iov[], unsigned int iov_size, unsigned int *out_num, unsigned int *in_num, struct vhost_log *log, unsigned int *log_num); void vhost_discard_vq_desc(struct vhost_virtqueue *, int n); void vhost_vq_flush(struct vhost_virtqueue *vq); bool vhost_vq_work_queue(struct vhost_virtqueue *vq, struct vhost_work *work); bool vhost_vq_has_work(struct vhost_virtqueue *vq); bool vhost_vq_is_setup(struct vhost_virtqueue *vq); int vhost_vq_init_access(struct vhost_virtqueue *); int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len); int vhost_add_used_n(struct vhost_virtqueue *, struct vring_used_elem *heads, unsigned count); void vhost_add_used_and_signal(struct vhost_dev *, struct vhost_virtqueue *, unsigned int id, int len); void vhost_add_used_and_signal_n(struct vhost_dev *, struct vhost_virtqueue *, struct vring_used_elem *heads, unsigned count); void vhost_signal(struct vhost_dev *, struct vhost_virtqueue *); void vhost_disable_notify(struct vhost_dev *, struct vhost_virtqueue *); bool vhost_vq_avail_empty(struct vhost_dev *, struct vhost_virtqueue *); bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *); int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log, unsigned int log_num, u64 len, struct iovec *iov, int count); int vq_meta_prefetch(struct vhost_virtqueue *vq); struct vhost_msg_node *vhost_new_msg(struct vhost_virtqueue *vq, int type); void vhost_enqueue_msg(struct vhost_dev *dev, struct list_head *head, struct vhost_msg_node *node); struct vhost_msg_node *vhost_dequeue_msg(struct vhost_dev *dev, struct list_head *head); void vhost_set_backend_features(struct vhost_dev *dev, u64 features); __poll_t vhost_chr_poll(struct file *file, struct vhost_dev *dev, poll_table *wait); ssize_t vhost_chr_read_iter(struct vhost_dev *dev, struct iov_iter *to, int noblock); ssize_t vhost_chr_write_iter(struct vhost_dev *dev, struct iov_iter *from); int vhost_init_device_iotlb(struct vhost_dev *d); void vhost_iotlb_map_free(struct vhost_iotlb *iotlb, struct vhost_iotlb_map *map); #define vq_err(vq, fmt, ...) do { \ pr_debug(pr_fmt(fmt), ##__VA_ARGS__); \ if ((vq)->error_ctx) \ eventfd_signal((vq)->error_ctx);\ } while (0) enum { VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | (1ULL << VIRTIO_RING_F_EVENT_IDX) | (1ULL << VHOST_F_LOG_ALL) | (1ULL << VIRTIO_F_ANY_LAYOUT) | (1ULL << VIRTIO_F_VERSION_1) }; /** * vhost_vq_set_backend - Set backend. * * @vq Virtqueue. * @private_data The private data. * * Context: Need to call with vq->mutex acquired. */ static inline void vhost_vq_set_backend(struct vhost_virtqueue *vq, void *private_data) { vq->private_data = private_data; } /** * vhost_vq_get_backend - Get backend. * * @vq Virtqueue. * * Context: Need to call with vq->mutex acquired. * Return: Private data previously set with vhost_vq_set_backend. */ static inline void *vhost_vq_get_backend(struct vhost_virtqueue *vq) { return vq->private_data; } static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit) { return vq->acked_features & (1ULL << bit); } static inline bool vhost_backend_has_feature(struct vhost_virtqueue *vq, int bit) { return vq->acked_backend_features & (1ULL << bit); } #ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY static inline bool vhost_is_little_endian(struct vhost_virtqueue *vq) { return vq->is_le; } #else static inline bool vhost_is_little_endian(struct vhost_virtqueue *vq) { return virtio_legacy_is_little_endian() || vq->is_le; } #endif /* Memory accessors */ static inline u16 vhost16_to_cpu(struct vhost_virtqueue *vq, __virtio16 val) { return __virtio16_to_cpu(vhost_is_little_endian(vq), val); } static inline __virtio16 cpu_to_vhost16(struct vhost_virtqueue *vq, u16 val) { return __cpu_to_virtio16(vhost_is_little_endian(vq), val); } static inline u32 vhost32_to_cpu(struct vhost_virtqueue *vq, __virtio32 val) { return __virtio32_to_cpu(vhost_is_little_endian(vq), val); } static inline __virtio32 cpu_to_vhost32(struct vhost_virtqueue *vq, u32 val) { return __cpu_to_virtio32(vhost_is_little_endian(vq), val); } static inline u64 vhost64_to_cpu(struct vhost_virtqueue *vq, __virtio64 val) { return __virtio64_to_cpu(vhost_is_little_endian(vq), val); } static inline __virtio64 cpu_to_vhost64(struct vhost_virtqueue *vq, u64 val) { return __cpu_to_virtio64(vhost_is_little_endian(vq), val); } #endif |
4 4 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 | // SPDX-License-Identifier: GPL-2.0-only /* * Copyright 2008 by Karsten Keil <kkeil@novell.com> */ #include <linux/slab.h> #include <linux/types.h> #include <linux/stddef.h> #include <linux/module.h> #include <linux/spinlock.h> #include <linux/mISDNif.h> #include "core.h" static u_int debug; MODULE_AUTHOR("Karsten Keil"); MODULE_LICENSE("GPL"); module_param(debug, uint, S_IRUGO | S_IWUSR); static u64 device_ids; #define MAX_DEVICE_ID 63 static LIST_HEAD(Bprotocols); static DEFINE_RWLOCK(bp_lock); static void mISDN_dev_release(struct device *dev) { /* nothing to do: the device is part of its parent's data structure */ } static ssize_t id_show(struct device *dev, struct device_attribute *attr, char *buf) { struct mISDNdevice *mdev = dev_to_mISDN(dev); if (!mdev) return -ENODEV; return sprintf(buf, "%d\n", mdev->id); } static DEVICE_ATTR_RO(id); static ssize_t nrbchan_show(struct device *dev, struct device_attribute *attr, char *buf) { struct mISDNdevice *mdev = dev_to_mISDN(dev); if (!mdev) return -ENODEV; return sprintf(buf, "%d\n", mdev->nrbchan); } static DEVICE_ATTR_RO(nrbchan); static ssize_t d_protocols_show(struct device *dev, struct device_attribute *attr, char *buf) { struct mISDNdevice *mdev = dev_to_mISDN(dev); if (!mdev) return -ENODEV; return sprintf(buf, "%d\n", mdev->Dprotocols); } static DEVICE_ATTR_RO(d_protocols); static ssize_t b_protocols_show(struct device *dev, struct device_attribute *attr, char *buf) { struct mISDNdevice *mdev = dev_to_mISDN(dev); if (!mdev) return -ENODEV; return sprintf(buf, "%d\n", mdev->Bprotocols | get_all_Bprotocols()); } static DEVICE_ATTR_RO(b_protocols); static ssize_t protocol_show(struct device *dev, struct device_attribute *attr, char *buf) { struct mISDNdevice *mdev = dev_to_mISDN(dev); if (!mdev) return -ENODEV; return sprintf(buf, "%d\n", mdev->D.protocol); } static DEVICE_ATTR_RO(protocol); static ssize_t name_show(struct device *dev, struct device_attribute *attr, char *buf) { strcpy(buf, dev_name(dev)); return strlen(buf); } static DEVICE_ATTR_RO(name); #if 0 /* hangs */ static ssize_t name_set(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { int err = 0; char *out = kmalloc(count + 1, GFP_KERNEL); if (!out) return -ENOMEM; memcpy(out, buf, count); if (count && out[count - 1] == '\n') out[--count] = 0; if (count) err = device_rename(dev, out); kfree(out); return (err < 0) ? err : count; } static DEVICE_ATTR_RW(name); #endif static ssize_t channelmap_show(struct device *dev, struct device_attribute *attr, char *buf) { struct mISDNdevice *mdev = dev_to_mISDN(dev); char *bp = buf; int i; for (i = 0; i <= mdev->nrbchan; i++) *bp++ = test_channelmap(i, mdev->channelmap) ? '1' : '0'; return bp - buf; } static DEVICE_ATTR_RO(channelmap); static struct attribute *mISDN_attrs[] = { &dev_attr_id.attr, &dev_attr_d_protocols.attr, &dev_attr_b_protocols.attr, &dev_attr_protocol.attr, &dev_attr_channelmap.attr, &dev_attr_nrbchan.attr, &dev_attr_name.attr, NULL, }; ATTRIBUTE_GROUPS(mISDN); static int mISDN_uevent(const struct device *dev, struct kobj_uevent_env *env) { const struct mISDNdevice *mdev = dev_to_mISDN(dev); if (!mdev) return 0; if (add_uevent_var(env, "nchans=%d", mdev->nrbchan)) return -ENOMEM; return 0; } static struct class mISDN_class = { .name = "mISDN", .dev_uevent = mISDN_uevent, .dev_groups = mISDN_groups, .dev_release = mISDN_dev_release, }; static int _get_mdevice(struct device *dev, const void *id) { struct mISDNdevice *mdev = dev_to_mISDN(dev); if (!mdev) return 0; if (mdev->id != *(const u_int *)id) return 0; return 1; } struct mISDNdevice *get_mdevice(u_int id) { return dev_to_mISDN(class_find_device(&mISDN_class, NULL, &id, _get_mdevice)); } static int _get_mdevice_count(struct device *dev, void *cnt) { *(int *)cnt += 1; return 0; } int get_mdevice_count(void) { int cnt = 0; class_for_each_device(&mISDN_class, NULL, &cnt, _get_mdevice_count); return cnt; } static int get_free_devid(void) { u_int i; for (i = 0; i <= MAX_DEVICE_ID; i++) if (!test_and_set_bit(i, (u_long *)&device_ids)) break; if (i > MAX_DEVICE_ID) return -EBUSY; return i; } int mISDN_register_device(struct mISDNdevice *dev, struct device *parent, char *name) { int err; err = get_free_devid(); if (err < 0) return err; dev->id = err; device_initialize(&dev->dev); if (name && name[0]) dev_set_name(&dev->dev, "%s", name); else dev_set_name(&dev->dev, "mISDN%d", dev->id); if (debug & DEBUG_CORE) printk(KERN_DEBUG "mISDN_register %s %d\n", dev_name(&dev->dev), dev->id); dev->dev.class = &mISDN_class; err = create_stack(dev); if (err) goto error1; dev->dev.platform_data = dev; dev->dev.parent = parent; dev_set_drvdata(&dev->dev, dev); err = device_add(&dev->dev); if (err) goto error3; return 0; error3: delete_stack(dev); error1: put_device(&dev->dev); return err; } EXPORT_SYMBOL(mISDN_register_device); void mISDN_unregister_device(struct mISDNdevice *dev) { if (debug & DEBUG_CORE) printk(KERN_DEBUG "mISDN_unregister %s %d\n", dev_name(&dev->dev), dev->id); /* sysfs_remove_link(&dev->dev.kobj, "device"); */ device_del(&dev->dev); dev_set_drvdata(&dev->dev, NULL); test_and_clear_bit(dev->id, (u_long *)&device_ids); delete_stack(dev); put_device(&dev->dev); } EXPORT_SYMBOL(mISDN_unregister_device); u_int get_all_Bprotocols(void) { struct Bprotocol *bp; u_int m = 0; read_lock(&bp_lock); list_for_each_entry(bp, &Bprotocols, list) m |= bp->Bprotocols; read_unlock(&bp_lock); return m; } struct Bprotocol * get_Bprotocol4mask(u_int m) { struct Bprotocol *bp; read_lock(&bp_lock); list_for_each_entry(bp, &Bprotocols, list) if (bp->Bprotocols & m) { read_unlock(&bp_lock); return bp; } read_unlock(&bp_lock); return NULL; } struct Bprotocol * get_Bprotocol4id(u_int id) { u_int m; if (id < ISDN_P_B_START || id > 63) { printk(KERN_WARNING "%s id not in range %d\n", __func__, id); return NULL; } m = 1 << (id & ISDN_P_B_MASK); return get_Bprotocol4mask(m); } int mISDN_register_Bprotocol(struct Bprotocol *bp) { u_long flags; struct Bprotocol *old; if (debug & DEBUG_CORE) printk(KERN_DEBUG "%s: %s/%x\n", __func__, bp->name, bp->Bprotocols); old = get_Bprotocol4mask(bp->Bprotocols); if (old) { printk(KERN_WARNING "register duplicate protocol old %s/%x new %s/%x\n", old->name, old->Bprotocols, bp->name, bp->Bprotocols); return -EBUSY; } write_lock_irqsave(&bp_lock, flags); list_add_tail(&bp->list, &Bprotocols); write_unlock_irqrestore(&bp_lock, flags); return 0; } EXPORT_SYMBOL(mISDN_register_Bprotocol); void mISDN_unregister_Bprotocol(struct Bprotocol *bp) { u_long flags; if (debug & DEBUG_CORE) printk(KERN_DEBUG "%s: %s/%x\n", __func__, bp->name, bp->Bprotocols); write_lock_irqsave(&bp_lock, flags); list_del(&bp->list); write_unlock_irqrestore(&bp_lock, flags); } EXPORT_SYMBOL(mISDN_unregister_Bprotocol); static const char *msg_no_channel = "<no channel>"; static const char *msg_no_stack = "<no stack>"; static const char *msg_no_stackdev = "<no stack device>"; const char *mISDNDevName4ch(struct mISDNchannel *ch) { if (!ch) return msg_no_channel; if (!ch->st) return msg_no_stack; if (!ch->st->dev) return msg_no_stackdev; return dev_name(&ch->st->dev->dev); }; EXPORT_SYMBOL(mISDNDevName4ch); static int mISDNInit(void) { int err; printk(KERN_INFO "Modular ISDN core version %d.%d.%d\n", MISDN_MAJOR_VERSION, MISDN_MINOR_VERSION, MISDN_RELEASE); mISDN_init_clock(&debug); mISDN_initstack(&debug); err = class_register(&mISDN_class); if (err) goto error1; err = mISDN_inittimer(&debug); if (err) goto error2; err = Isdnl1_Init(&debug); if (err) goto error3; err = Isdnl2_Init(&debug); if (err) goto error4; err = misdn_sock_init(&debug); if (err) goto error5; return 0; error5: Isdnl2_cleanup(); error4: Isdnl1_cleanup(); error3: mISDN_timer_cleanup(); error2: class_unregister(&mISDN_class); error1: return err; } static void mISDN_cleanup(void) { misdn_sock_cleanup(); Isdnl2_cleanup(); Isdnl1_cleanup(); mISDN_timer_cleanup(); class_unregister(&mISDN_class); printk(KERN_DEBUG "mISDNcore unloaded\n"); } module_init(mISDNInit); module_exit(mISDN_cleanup); |
2 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 | /* SPDX-License-Identifier: GPL-2.0-only */ /* * Stream Parser * * Copyright (c) 2016 Tom Herbert <tom@herbertland.com> */ #ifndef __NET_STRPARSER_H_ #define __NET_STRPARSER_H_ #include <linux/skbuff.h> #include <net/sock.h> #define STRP_STATS_ADD(stat, count) ((stat) += (count)) #define STRP_STATS_INCR(stat) ((stat)++) struct strp_stats { unsigned long long msgs; unsigned long long bytes; unsigned int mem_fail; unsigned int need_more_hdr; unsigned int msg_too_big; unsigned int msg_timeouts; unsigned int bad_hdr_len; }; struct strp_aggr_stats { unsigned long long msgs; unsigned long long bytes; unsigned int mem_fail; unsigned int need_more_hdr; unsigned int msg_too_big; unsigned int msg_timeouts; unsigned int bad_hdr_len; unsigned int aborts; unsigned int interrupted; unsigned int unrecov_intr; }; struct strparser; /* Callbacks are called with lock held for the attached socket */ struct strp_callbacks { int (*parse_msg)(struct strparser *strp, struct sk_buff *skb); void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb); int (*read_sock_done)(struct strparser *strp, int err); void (*abort_parser)(struct strparser *strp, int err); void (*lock)(struct strparser *strp); void (*unlock)(struct strparser *strp); }; struct strp_msg { int full_len; int offset; }; struct _strp_msg { /* Internal cb structure. struct strp_msg must be first for passing * to upper layer. */ struct strp_msg strp; int accum_len; }; struct sk_skb_cb { #define SK_SKB_CB_PRIV_LEN 20 unsigned char data[SK_SKB_CB_PRIV_LEN]; /* align strp on cache line boundary within skb->cb[] */ unsigned char pad[4]; struct _strp_msg strp; /* strp users' data follows */ struct tls_msg { u8 control; } tls; /* temp_reg is a temporary register used for bpf_convert_data_end_access * when dst_reg == src_reg. */ u64 temp_reg; }; static inline struct strp_msg *strp_msg(struct sk_buff *skb) { return (struct strp_msg *)((void *)skb->cb + offsetof(struct sk_skb_cb, strp)); } /* Structure for an attached lower socket */ struct strparser { struct sock *sk; u32 stopped : 1; u32 paused : 1; u32 aborted : 1; u32 interrupted : 1; u32 unrecov_intr : 1; struct sk_buff **skb_nextp; struct sk_buff *skb_head; unsigned int need_bytes; struct delayed_work msg_timer_work; struct work_struct work; struct strp_stats stats; struct strp_callbacks cb; }; /* Must be called with lock held for attached socket */ static inline void strp_pause(struct strparser *strp) { strp->paused = 1; } /* May be called without holding lock for attached socket */ void strp_unpause(struct strparser *strp); /* Must be called with process lock held (lock_sock) */ void __strp_unpause(struct strparser *strp); static inline void save_strp_stats(struct strparser *strp, struct strp_aggr_stats *agg_stats) { /* Save psock statistics in the mux when psock is being unattached. */ #define SAVE_PSOCK_STATS(_stat) (agg_stats->_stat += \ strp->stats._stat) SAVE_PSOCK_STATS(msgs); SAVE_PSOCK_STATS(bytes); SAVE_PSOCK_STATS(mem_fail); SAVE_PSOCK_STATS(need_more_hdr); SAVE_PSOCK_STATS(msg_too_big); SAVE_PSOCK_STATS(msg_timeouts); SAVE_PSOCK_STATS(bad_hdr_len); #undef SAVE_PSOCK_STATS if (strp->aborted) agg_stats->aborts++; if (strp->interrupted) agg_stats->interrupted++; if (strp->unrecov_intr) agg_stats->unrecov_intr++; } static inline void aggregate_strp_stats(struct strp_aggr_stats *stats, struct strp_aggr_stats *agg_stats) { #define SAVE_PSOCK_STATS(_stat) (agg_stats->_stat += stats->_stat) SAVE_PSOCK_STATS(msgs); SAVE_PSOCK_STATS(bytes); SAVE_PSOCK_STATS(mem_fail); SAVE_PSOCK_STATS(need_more_hdr); SAVE_PSOCK_STATS(msg_too_big); SAVE_PSOCK_STATS(msg_timeouts); SAVE_PSOCK_STATS(bad_hdr_len); SAVE_PSOCK_STATS(aborts); SAVE_PSOCK_STATS(interrupted); SAVE_PSOCK_STATS(unrecov_intr); #undef SAVE_PSOCK_STATS } void strp_done(struct strparser *strp); void strp_stop(struct strparser *strp); void strp_check_rcv(struct strparser *strp); int strp_init(struct strparser *strp, struct sock *sk, const struct strp_callbacks *cb); void strp_data_ready(struct strparser *strp); int strp_process(struct strparser *strp, struct sk_buff *orig_skb, unsigned int orig_offset, size_t orig_len, size_t max_msg_size, long timeo); #endif /* __NET_STRPARSER_H_ */ |
1 1 6 6 3 1 2 1 1 1 3 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 | // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (c) 2014 Arturo Borrero Gonzalez <arturo@debian.org> */ #include <linux/kernel.h> #include <linux/init.h> #include <linux/module.h> #include <linux/netlink.h> #include <linux/netfilter.h> #include <linux/netfilter/nf_tables.h> #include <net/netfilter/nf_tables.h> #include <net/netfilter/nf_nat.h> #include <net/netfilter/nf_nat_masquerade.h> struct nft_masq { u32 flags; u8 sreg_proto_min; u8 sreg_proto_max; }; static const struct nla_policy nft_masq_policy[NFTA_MASQ_MAX + 1] = { [NFTA_MASQ_FLAGS] = NLA_POLICY_MASK(NLA_BE32, NF_NAT_RANGE_MASK), [NFTA_MASQ_REG_PROTO_MIN] = { .type = NLA_U32 }, [NFTA_MASQ_REG_PROTO_MAX] = { .type = NLA_U32 }, }; static int nft_masq_validate(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nft_data **data) { int err; err = nft_chain_validate_dependency(ctx->chain, NFT_CHAIN_T_NAT); if (err < 0) return err; return nft_chain_validate_hooks(ctx->chain, (1 << NF_INET_POST_ROUTING)); } static int nft_masq_init(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nlattr * const tb[]) { u32 plen = sizeof_field(struct nf_nat_range, min_proto.all); struct nft_masq *priv = nft_expr_priv(expr); int err; if (tb[NFTA_MASQ_FLAGS]) priv->flags = ntohl(nla_get_be32(tb[NFTA_MASQ_FLAGS])); if (tb[NFTA_MASQ_REG_PROTO_MIN]) { err = nft_parse_register_load(tb[NFTA_MASQ_REG_PROTO_MIN], &priv->sreg_proto_min, plen); if (err < 0) return err; if (tb[NFTA_MASQ_REG_PROTO_MAX]) { err = nft_parse_register_load(tb[NFTA_MASQ_REG_PROTO_MAX], &priv->sreg_proto_max, plen); if (err < 0) return err; } else { priv->sreg_proto_max = priv->sreg_proto_min; } } return nf_ct_netns_get(ctx->net, ctx->family); } static int nft_masq_dump(struct sk_buff *skb, const struct nft_expr *expr, bool reset) { const struct nft_masq *priv = nft_expr_priv(expr); if (priv->flags != 0 && nla_put_be32(skb, NFTA_MASQ_FLAGS, htonl(priv->flags))) goto nla_put_failure; if (priv->sreg_proto_min) { if (nft_dump_register(skb, NFTA_MASQ_REG_PROTO_MIN, priv->sreg_proto_min) || nft_dump_register(skb, NFTA_MASQ_REG_PROTO_MAX, priv->sreg_proto_max)) goto nla_put_failure; } return 0; nla_put_failure: return -1; } static void nft_masq_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) { const struct nft_masq *priv = nft_expr_priv(expr); struct nf_nat_range2 range; memset(&range, 0, sizeof(range)); range.flags = priv->flags; if (priv->sreg_proto_min) { range.min_proto.all = (__force __be16) nft_reg_load16(®s->data[priv->sreg_proto_min]); range.max_proto.all = (__force __be16) nft_reg_load16(®s->data[priv->sreg_proto_max]); } switch (nft_pf(pkt)) { case NFPROTO_IPV4: regs->verdict.code = nf_nat_masquerade_ipv4(pkt->skb, nft_hook(pkt), &range, nft_out(pkt)); break; #ifdef CONFIG_NF_TABLES_IPV6 case NFPROTO_IPV6: regs->verdict.code = nf_nat_masquerade_ipv6(pkt->skb, &range, nft_out(pkt)); break; #endif default: WARN_ON_ONCE(1); break; } } static void nft_masq_ipv4_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr) { nf_ct_netns_put(ctx->net, NFPROTO_IPV4); } static struct nft_expr_type nft_masq_ipv4_type; static const struct nft_expr_ops nft_masq_ipv4_ops = { .type = &nft_masq_ipv4_type, .size = NFT_EXPR_SIZE(sizeof(struct nft_masq)), .eval = nft_masq_eval, .init = nft_masq_init, .destroy = nft_masq_ipv4_destroy, .dump = nft_masq_dump, .validate = nft_masq_validate, .reduce = NFT_REDUCE_READONLY, }; static struct nft_expr_type nft_masq_ipv4_type __read_mostly = { .family = NFPROTO_IPV4, .name = "masq", .ops = &nft_masq_ipv4_ops, .policy = nft_masq_policy, .maxattr = NFTA_MASQ_MAX, .owner = THIS_MODULE, }; #ifdef CONFIG_NF_TABLES_IPV6 static void nft_masq_ipv6_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr) { nf_ct_netns_put(ctx->net, NFPROTO_IPV6); } static struct nft_expr_type nft_masq_ipv6_type; static const struct nft_expr_ops nft_masq_ipv6_ops = { .type = &nft_masq_ipv6_type, .size = NFT_EXPR_SIZE(sizeof(struct nft_masq)), .eval = nft_masq_eval, .init = nft_masq_init, .destroy = nft_masq_ipv6_destroy, .dump = nft_masq_dump, .validate = nft_masq_validate, .reduce = NFT_REDUCE_READONLY, }; static struct nft_expr_type nft_masq_ipv6_type __read_mostly = { .family = NFPROTO_IPV6, .name = "masq", .ops = &nft_masq_ipv6_ops, .policy = nft_masq_policy, .maxattr = NFTA_MASQ_MAX, .owner = THIS_MODULE, }; static int __init nft_masq_module_init_ipv6(void) { return nft_register_expr(&nft_masq_ipv6_type); } static void nft_masq_module_exit_ipv6(void) { nft_unregister_expr(&nft_masq_ipv6_type); } #else static inline int nft_masq_module_init_ipv6(void) { return 0; } static inline void nft_masq_module_exit_ipv6(void) {} #endif #ifdef CONFIG_NF_TABLES_INET static void nft_masq_inet_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr) { nf_ct_netns_put(ctx->net, NFPROTO_INET); } static struct nft_expr_type nft_masq_inet_type; static const struct nft_expr_ops nft_masq_inet_ops = { .type = &nft_masq_inet_type, .size = NFT_EXPR_SIZE(sizeof(struct nft_masq)), .eval = nft_masq_eval, .init = nft_masq_init, .destroy = nft_masq_inet_destroy, .dump = nft_masq_dump, .validate = nft_masq_validate, .reduce = NFT_REDUCE_READONLY, }; static struct nft_expr_type nft_masq_inet_type __read_mostly = { .family = NFPROTO_INET, .name = "masq", .ops = &nft_masq_inet_ops, .policy = nft_masq_policy, .maxattr = NFTA_MASQ_MAX, .owner = THIS_MODULE, }; static int __init nft_masq_module_init_inet(void) { return nft_register_expr(&nft_masq_inet_type); } static void nft_masq_module_exit_inet(void) { nft_unregister_expr(&nft_masq_inet_type); } #else static inline int nft_masq_module_init_inet(void) { return 0; } static inline void nft_masq_module_exit_inet(void) {} #endif static int __init nft_masq_module_init(void) { int ret; ret = nft_masq_module_init_ipv6(); if (ret < 0) return ret; ret = nft_masq_module_init_inet(); if (ret < 0) { nft_masq_module_exit_ipv6(); return ret; } ret = nft_register_expr(&nft_masq_ipv4_type); if (ret < 0) { nft_masq_module_exit_inet(); nft_masq_module_exit_ipv6(); return ret; } ret = nf_nat_masquerade_inet_register_notifiers(); if (ret < 0) { nft_masq_module_exit_ipv6(); nft_masq_module_exit_inet(); nft_unregister_expr(&nft_masq_ipv4_type); return ret; } return ret; } static void __exit nft_masq_module_exit(void) { nft_masq_module_exit_ipv6(); nft_masq_module_exit_inet(); nft_unregister_expr(&nft_masq_ipv4_type); nf_nat_masquerade_inet_unregister_notifiers(); } module_init(nft_masq_module_init); module_exit(nft_masq_module_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Arturo Borrero Gonzalez <arturo@debian.org>"); MODULE_ALIAS_NFT_EXPR("masq"); MODULE_DESCRIPTION("Netfilter nftables masquerade expression support"); |
16 15 1 6 8 8 1 7 7 2 2 2 2 1 1 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 | // SPDX-License-Identifier: GPL-2.0-or-later /* * IPV6 GSO/GRO offload support * Linux INET6 implementation * * UDPv6 GSO support */ #include <linux/skbuff.h> #include <linux/netdevice.h> #include <linux/indirect_call_wrapper.h> #include <net/protocol.h> #include <net/ipv6.h> #include <net/udp.h> #include <net/ip6_checksum.h> #include "ip6_offload.h" #include <net/gro.h> #include <net/gso.h> static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb, netdev_features_t features) { struct sk_buff *segs = ERR_PTR(-EINVAL); unsigned int mss; unsigned int unfrag_ip6hlen, unfrag_len; struct frag_hdr *fptr; u8 *packet_start, *prevhdr; u8 nexthdr; u8 frag_hdr_sz = sizeof(struct frag_hdr); __wsum csum; int tnl_hlen; int err; if (skb->encapsulation && skb_shinfo(skb)->gso_type & (SKB_GSO_UDP_TUNNEL|SKB_GSO_UDP_TUNNEL_CSUM)) segs = skb_udp_tunnel_segment(skb, features, true); else { const struct ipv6hdr *ipv6h; struct udphdr *uh; if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_UDP | SKB_GSO_UDP_L4))) goto out; if (!pskb_may_pull(skb, sizeof(struct udphdr))) goto out; if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) return __udp_gso_segment(skb, features, true); mss = skb_shinfo(skb)->gso_size; if (unlikely(skb->len <= mss)) goto out; /* Do software UFO. Complete and fill in the UDP checksum as HW cannot * do checksum of UDP packets sent as multiple IP fragments. */ uh = udp_hdr(skb); ipv6h = ipv6_hdr(skb); uh->check = 0; csum = skb_checksum(skb, 0, skb->len, 0); uh->check = udp_v6_check(skb->len, &ipv6h->saddr, &ipv6h->daddr, csum); if (uh->check == 0) uh->check = CSUM_MANGLED_0; skb->ip_summed = CHECKSUM_UNNECESSARY; /* If there is no outer header we can fake a checksum offload * due to the fact that we have already done the checksum in * software prior to segmenting the frame. */ if (!skb->encap_hdr_csum) features |= NETIF_F_HW_CSUM; /* Check if there is enough headroom to insert fragment header. */ tnl_hlen = skb_tnl_header_len(skb); if (skb->mac_header < (tnl_hlen + frag_hdr_sz)) { if (gso_pskb_expand_head(skb, tnl_hlen + frag_hdr_sz)) goto out; } /* Find the unfragmentable header and shift it left by frag_hdr_sz * bytes to insert fragment header. */ err = ip6_find_1stfragopt(skb, &prevhdr); if (err < 0) return ERR_PTR(err); unfrag_ip6hlen = err; nexthdr = *prevhdr; *prevhdr = NEXTHDR_FRAGMENT; unfrag_len = (skb_network_header(skb) - skb_mac_header(skb)) + unfrag_ip6hlen + tnl_hlen; packet_start = (u8 *) skb->head + SKB_GSO_CB(skb)->mac_offset; memmove(packet_start-frag_hdr_sz, packet_start, unfrag_len); SKB_GSO_CB(skb)->mac_offset -= frag_hdr_sz; skb->mac_header -= frag_hdr_sz; skb->network_header -= frag_hdr_sz; fptr = (struct frag_hdr *)(skb_network_header(skb) + unfrag_ip6hlen); fptr->nexthdr = nexthdr; fptr->reserved = 0; fptr->identification = ipv6_proxy_select_ident(dev_net(skb->dev), skb); /* Fragment the skb. ipv6 header and the remaining fields of the * fragment header are updated in ipv6_gso_segment() */ segs = skb_segment(skb, features); } out: return segs; } static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport, __be16 dport) { const struct ipv6hdr *iph = skb_gro_network_header(skb); struct net *net = dev_net(skb->dev); int iif, sdif; inet6_get_iif_sdif(skb, &iif, &sdif); return __udp6_lib_lookup(net, &iph->saddr, sport, &iph->daddr, dport, iif, sdif, net->ipv4.udp_table, NULL); } INDIRECT_CALLABLE_SCOPE struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb) { struct udphdr *uh = udp_gro_udphdr(skb); struct sock *sk = NULL; struct sk_buff *pp; if (unlikely(!uh)) goto flush; /* Don't bother verifying checksum if we're going to flush anyway. */ if (NAPI_GRO_CB(skb)->flush) goto skip; if (skb_gro_checksum_validate_zero_check(skb, IPPROTO_UDP, uh->check, ip6_gro_compute_pseudo)) goto flush; else if (uh->check) skb_gro_checksum_try_convert(skb, IPPROTO_UDP, ip6_gro_compute_pseudo); skip: NAPI_GRO_CB(skb)->is_ipv6 = 1; if (static_branch_unlikely(&udpv6_encap_needed_key)) sk = udp6_gro_lookup_skb(skb, uh->source, uh->dest); pp = udp_gro_receive(head, skb, uh, sk); return pp; flush: NAPI_GRO_CB(skb)->flush = 1; return NULL; } INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff) { const struct ipv6hdr *ipv6h = ipv6_hdr(skb); struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); /* do fraglist only if there is no outer UDP encap (or we already processed it) */ if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { uh->len = htons(skb->len - nhoff); skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count; if (skb->ip_summed == CHECKSUM_UNNECESSARY) { if (skb->csum_level < SKB_MAX_CSUM_LEVEL) skb->csum_level++; } else { skb->ip_summed = CHECKSUM_UNNECESSARY; skb->csum_level = 0; } return 0; } if (uh->check) uh->check = ~udp_v6_check(skb->len - nhoff, &ipv6h->saddr, &ipv6h->daddr, 0); return udp_gro_complete(skb, nhoff, udp6_lib_lookup_skb); } int __init udpv6_offload_init(void) { net_hotdata.udpv6_offload = (struct net_offload) { .callbacks = { .gso_segment = udp6_ufo_fragment, .gro_receive = udp6_gro_receive, .gro_complete = udp6_gro_complete, }, }; return inet6_add_offload(&net_hotdata.udpv6_offload, IPPROTO_UDP); } int udpv6_offload_exit(void) { return inet6_del_offload(&net_hotdata.udpv6_offload, IPPROTO_UDP); } |
9 10 10 5 5 5 5 5 5 5 5 5 5 5 5 5 3 3 1 3 1 1 1 1 1 1 1 1 1 1 1 1 5 4 2 2 5 4 1 5 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 | /* inflate.c -- zlib decompression * Copyright (C) 1995-2005 Mark Adler * For conditions of distribution and use, see copyright notice in zlib.h * * Based on zlib 1.2.3 but modified for the Linux Kernel by * Richard Purdie <richard@openedhand.com> * * Changes mainly for static instead of dynamic memory allocation * */ #include <linux/zutil.h> #include "inftrees.h" #include "inflate.h" #include "inffast.h" #include "infutil.h" /* architecture-specific bits */ #ifdef CONFIG_ZLIB_DFLTCC # include "../zlib_dfltcc/dfltcc_inflate.h" #else #define INFLATE_RESET_HOOK(strm) do {} while (0) #define INFLATE_TYPEDO_HOOK(strm, flush) do {} while (0) #define INFLATE_NEED_UPDATEWINDOW(strm) 1 #define INFLATE_NEED_CHECKSUM(strm) 1 #endif int zlib_inflate_workspacesize(void) { return sizeof(struct inflate_workspace); } int zlib_inflateReset(z_streamp strm) { struct inflate_state *state; if (strm == NULL || strm->state == NULL) return Z_STREAM_ERROR; state = (struct inflate_state *)strm->state; strm->total_in = strm->total_out = state->total = 0; strm->msg = NULL; strm->adler = 1; /* to support ill-conceived Java test suite */ state->mode = HEAD; state->last = 0; state->havedict = 0; state->dmax = 32768U; state->hold = 0; state->bits = 0; state->lencode = state->distcode = state->next = state->codes; /* Initialise Window */ state->wsize = 1U << state->wbits; state->write = 0; state->whave = 0; INFLATE_RESET_HOOK(strm); return Z_OK; } int zlib_inflateInit2(z_streamp strm, int windowBits) { struct inflate_state *state; if (strm == NULL) return Z_STREAM_ERROR; strm->msg = NULL; /* in case we return an error */ state = &WS(strm)->inflate_state; strm->state = (struct internal_state *)state; if (windowBits < 0) { state->wrap = 0; windowBits = -windowBits; } else { state->wrap = (windowBits >> 4) + 1; } if (windowBits < 8 || windowBits > 15) { return Z_STREAM_ERROR; } state->wbits = (unsigned)windowBits; #ifdef CONFIG_ZLIB_DFLTCC /* * DFLTCC requires the window to be page aligned. * Thus, we overallocate and take the aligned portion of the buffer. */ state->window = PTR_ALIGN(&WS(strm)->working_window[0], PAGE_SIZE); #else state->window = &WS(strm)->working_window[0]; #endif return zlib_inflateReset(strm); } /* Return state with length and distance decoding tables and index sizes set to fixed code decoding. This returns fixed tables from inffixed.h. */ static void zlib_fixedtables(struct inflate_state *state) { # include "inffixed.h" state->lencode = lenfix; state->lenbits = 9; state->distcode = distfix; state->distbits = 5; } /* Update the window with the last wsize (normally 32K) bytes written before returning. This is only called when a window is already in use, or when output has been written during this inflate call, but the end of the deflate stream has not been reached yet. It is also called to window dictionary data when a dictionary is loaded. Providing output buffers larger than 32K to inflate() should provide a speed advantage, since only the last 32K of output is copied to the sliding window upon return from inflate(), and since all distances after the first 32K of output will fall in the output data, making match copies simpler and faster. The advantage may be dependent on the size of the processor's data caches. */ static void zlib_updatewindow(z_streamp strm, unsigned out) { struct inflate_state *state; unsigned copy, dist; state = (struct inflate_state *)strm->state; /* copy state->wsize or less output bytes into the circular window */ copy = out - strm->avail_out; if (copy >= state->wsize) { memcpy(state->window, strm->next_out - state->wsize, state->wsize); state->write = 0; state->whave = state->wsize; } else { dist = state->wsize - state->write; if (dist > copy) dist = copy; memcpy(state->window + state->write, strm->next_out - copy, dist); copy -= dist; if (copy) { memcpy(state->window, strm->next_out - copy, copy); state->write = copy; state->whave = state->wsize; } else { state->write += dist; if (state->write == state->wsize) state->write = 0; if (state->whave < state->wsize) state->whave += dist; } } } /* * At the end of a Deflate-compressed PPP packet, we expect to have seen * a `stored' block type value but not the (zero) length bytes. */ /* Returns true if inflate is currently at the end of a block generated by Z_SYNC_FLUSH or Z_FULL_FLUSH. This function is used by one PPP implementation to provide an additional safety check. PPP uses Z_SYNC_FLUSH but removes the length bytes of the resulting empty stored block. When decompressing, PPP checks that at the end of input packet, inflate is waiting for these length bytes. */ static int zlib_inflateSyncPacket(z_streamp strm) { struct inflate_state *state; if (strm == NULL || strm->state == NULL) return Z_STREAM_ERROR; state = (struct inflate_state *)strm->state; if (state->mode == STORED && state->bits == 0) { state->mode = TYPE; return Z_OK; } return Z_DATA_ERROR; } /* Macros for inflate(): */ /* check function to use adler32() for zlib or crc32() for gzip */ #define UPDATE(check, buf, len) zlib_adler32(check, buf, len) /* Load registers with state in inflate() for speed */ #define LOAD() \ do { \ put = strm->next_out; \ left = strm->avail_out; \ next = strm->next_in; \ have = strm->avail_in; \ hold = state->hold; \ bits = state->bits; \ } while (0) /* Restore state from registers in inflate() */ #define RESTORE() \ do { \ strm->next_out = put; \ strm->avail_out = left; \ strm->next_in = next; \ strm->avail_in = have; \ state->hold = hold; \ state->bits = bits; \ } while (0) /* Clear the input bit accumulator */ #define INITBITS() \ do { \ hold = 0; \ bits = 0; \ } while (0) /* Get a byte of input into the bit accumulator, or return from inflate() if there is no input available. */ #define PULLBYTE() \ do { \ if (have == 0) goto inf_leave; \ have--; \ hold += (unsigned long)(*next++) << bits; \ bits += 8; \ } while (0) /* Assure that there are at least n bits in the bit accumulator. If there is not enough available input to do that, then return from inflate(). */ #define NEEDBITS(n) \ do { \ while (bits < (unsigned)(n)) \ PULLBYTE(); \ } while (0) /* Return the low n bits of the bit accumulator (n < 16) */ #define BITS(n) \ ((unsigned)hold & ((1U << (n)) - 1)) /* Remove n bits from the bit accumulator */ #define DROPBITS(n) \ do { \ hold >>= (n); \ bits -= (unsigned)(n); \ } while (0) /* Remove zero to seven bits as needed to go to a byte boundary */ #define BYTEBITS() \ do { \ hold >>= bits & 7; \ bits -= bits & 7; \ } while (0) /* inflate() uses a state machine to process as much input data and generate as much output data as possible before returning. The state machine is structured roughly as follows: for (;;) switch (state) { ... case STATEn: if (not enough input data or output space to make progress) return; ... make progress ... state = STATEm; break; ... } so when inflate() is called again, the same case is attempted again, and if the appropriate resources are provided, the machine proceeds to the next state. The NEEDBITS() macro is usually the way the state evaluates whether it can proceed or should return. NEEDBITS() does the return if the requested bits are not available. The typical use of the BITS macros is: NEEDBITS(n); ... do something with BITS(n) ... DROPBITS(n); where NEEDBITS(n) either returns from inflate() if there isn't enough input left to load n bits into the accumulator, or it continues. BITS(n) gives the low n bits in the accumulator. When done, DROPBITS(n) drops the low n bits off the accumulator. INITBITS() clears the accumulator and sets the number of available bits to zero. BYTEBITS() discards just enough bits to put the accumulator on a byte boundary. After BYTEBITS() and a NEEDBITS(8), then BITS(8) would return the next byte in the stream. NEEDBITS(n) uses PULLBYTE() to get an available byte of input, or to return if there is no input available. The decoding of variable length codes uses PULLBYTE() directly in order to pull just enough bytes to decode the next code, and no more. Some states loop until they get enough input, making sure that enough state information is maintained to continue the loop where it left off if NEEDBITS() returns in the loop. For example, want, need, and keep would all have to actually be part of the saved state in case NEEDBITS() returns: case STATEw: while (want < need) { NEEDBITS(n); keep[want++] = BITS(n); DROPBITS(n); } state = STATEx; case STATEx: As shown above, if the next state is also the next case, then the break is omitted. A state may also return if there is not enough output space available to complete that state. Those states are copying stored data, writing a literal byte, and copying a matching string. When returning, a "goto inf_leave" is used to update the total counters, update the check value, and determine whether any progress has been made during that inflate() call in order to return the proper return code. Progress is defined as a change in either strm->avail_in or strm->avail_out. When there is a window, goto inf_leave will update the window with the last output written. If a goto inf_leave occurs in the middle of decompression and there is no window currently, goto inf_leave will create one and copy output to the window for the next call of inflate(). In this implementation, the flush parameter of inflate() only affects the return code (per zlib.h). inflate() always writes as much as possible to strm->next_out, given the space available and the provided input--the effect documented in zlib.h of Z_SYNC_FLUSH. Furthermore, inflate() always defers the allocation of and copying into a sliding window until necessary, which provides the effect documented in zlib.h for Z_FINISH when the entire input stream available. So the only thing the flush parameter actually does is: when flush is set to Z_FINISH, inflate() cannot return Z_OK. Instead it will return Z_BUF_ERROR if it has not reached the end of the stream. */ int zlib_inflate(z_streamp strm, int flush) { struct inflate_state *state; const unsigned char *next; /* next input */ unsigned char *put; /* next output */ unsigned have, left; /* available input and output */ unsigned long hold; /* bit buffer */ unsigned bits; /* bits in bit buffer */ unsigned in, out; /* save starting available input and output */ unsigned copy; /* number of stored or match bytes to copy */ unsigned char *from; /* where to copy match bytes from */ code this; /* current decoding table entry */ code last; /* parent table entry */ unsigned len; /* length to copy for repeats, bits to drop */ int ret; /* return code */ static const unsigned short order[19] = /* permutation of code lengths */ {16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15}; /* Do not check for strm->next_out == NULL here as ppc zImage inflates to strm->next_out = 0 */ if (strm == NULL || strm->state == NULL || (strm->next_in == NULL && strm->avail_in != 0)) return Z_STREAM_ERROR; state = (struct inflate_state *)strm->state; if (state->mode == TYPE) state->mode = TYPEDO; /* skip check */ LOAD(); in = have; out = left; ret = Z_OK; for (;;) switch (state->mode) { case HEAD: if (state->wrap == 0) { state->mode = TYPEDO; break; } NEEDBITS(16); if ( ((BITS(8) << 8) + (hold >> 8)) % 31) { strm->msg = (char *)"incorrect header check"; state->mode = BAD; break; } if (BITS(4) != Z_DEFLATED) { strm->msg = (char *)"unknown compression method"; state->mode = BAD; break; } DROPBITS(4); len = BITS(4) + 8; if (len > state->wbits) { strm->msg = (char *)"invalid window size"; state->mode = BAD; break; } state->dmax = 1U << len; strm->adler = state->check = zlib_adler32(0L, NULL, 0); state->mode = hold & 0x200 ? DICTID : TYPE; INITBITS(); break; case DICTID: NEEDBITS(32); strm->adler = state->check = REVERSE(hold); INITBITS(); state->mode = DICT; fallthrough; case DICT: if (state->havedict == 0) { RESTORE(); return Z_NEED_DICT; } strm->adler = state->check = zlib_adler32(0L, NULL, 0); state->mode = TYPE; fallthrough; case TYPE: if (flush == Z_BLOCK) goto inf_leave; fallthrough; case TYPEDO: INFLATE_TYPEDO_HOOK(strm, flush); if (state->last) { BYTEBITS(); state->mode = CHECK; break; } NEEDBITS(3); state->last = BITS(1); DROPBITS(1); switch (BITS(2)) { case 0: /* stored block */ state->mode = STORED; break; case 1: /* fixed block */ zlib_fixedtables(state); state->mode = LEN; /* decode codes */ break; case 2: /* dynamic block */ state->mode = TABLE; break; case 3: strm->msg = (char *)"invalid block type"; state->mode = BAD; } DROPBITS(2); break; case STORED: BYTEBITS(); /* go to byte boundary */ NEEDBITS(32); if ((hold & 0xffff) != ((hold >> 16) ^ 0xffff)) { strm->msg = (char *)"invalid stored block lengths"; state->mode = BAD; break; } state->length = (unsigned)hold & 0xffff; INITBITS(); state->mode = COPY; fallthrough; case COPY: copy = state->length; if (copy) { if (copy > have) copy = have; if (copy > left) copy = left; if (copy == 0) goto inf_leave; memcpy(put, next, copy); have -= copy; next += copy; left -= copy; put += copy; state->length -= copy; break; } state->mode = TYPE; break; case TABLE: NEEDBITS(14); state->nlen = BITS(5) + 257; DROPBITS(5); state->ndist = BITS(5) + 1; DROPBITS(5); state->ncode = BITS(4) + 4; DROPBITS(4); #ifndef PKZIP_BUG_WORKAROUND if (state->nlen > 286 || state->ndist > 30) { strm->msg = (char *)"too many length or distance symbols"; state->mode = BAD; break; } #endif state->have = 0; state->mode = LENLENS; fallthrough; case LENLENS: while (state->have < state->ncode) { NEEDBITS(3); state->lens[order[state->have++]] = (unsigned short)BITS(3); DROPBITS(3); } while (state->have < 19) state->lens[order[state->have++]] = 0; state->next = state->codes; state->lencode = (code const *)(state->next); state->lenbits = 7; ret = zlib_inflate_table(CODES, state->lens, 19, &(state->next), &(state->lenbits), state->work); if (ret) { strm->msg = (char *)"invalid code lengths set"; state->mode = BAD; break; } state->have = 0; state->mode = CODELENS; fallthrough; case CODELENS: while (state->have < state->nlen + state->ndist) { for (;;) { this = state->lencode[BITS(state->lenbits)]; if ((unsigned)(this.bits) <= bits) break; PULLBYTE(); } if (this.val < 16) { NEEDBITS(this.bits); DROPBITS(this.bits); state->lens[state->have++] = this.val; } else { if (this.val == 16) { NEEDBITS(this.bits + 2); DROPBITS(this.bits); if (state->have == 0) { strm->msg = (char *)"invalid bit length repeat"; state->mode = BAD; break; } len = state->lens[state->have - 1]; copy = 3 + BITS(2); DROPBITS(2); } else if (this.val == 17) { NEEDBITS(this.bits + 3); DROPBITS(this.bits); len = 0; copy = 3 + BITS(3); DROPBITS(3); } else { NEEDBITS(this.bits + 7); DROPBITS(this.bits); len = 0; copy = 11 + BITS(7); DROPBITS(7); } if (state->have + copy > state->nlen + state->ndist) { strm->msg = (char *)"invalid bit length repeat"; state->mode = BAD; break; } while (copy--) state->lens[state->have++] = (unsigned short)len; } } /* handle error breaks in while */ if (state->mode == BAD) break; /* build code tables */ state->next = state->codes; state->lencode = (code const *)(state->next); state->lenbits = 9; ret = zlib_inflate_table(LENS, state->lens, state->nlen, &(state->next), &(state->lenbits), state->work); if (ret) { strm->msg = (char *)"invalid literal/lengths set"; state->mode = BAD; break; } state->distcode = (code const *)(state->next); state->distbits = 6; ret = zlib_inflate_table(DISTS, state->lens + state->nlen, state->ndist, &(state->next), &(state->distbits), state->work); if (ret) { strm->msg = (char *)"invalid distances set"; state->mode = BAD; break; } state->mode = LEN; fallthrough; case LEN: if (have >= 6 && left >= 258) { RESTORE(); inflate_fast(strm, out); LOAD(); break; } for (;;) { this = state->lencode[BITS(state->lenbits)]; if ((unsigned)(this.bits) <= bits) break; PULLBYTE(); } if (this.op && (this.op & 0xf0) == 0) { last = this; for (;;) { this = state->lencode[last.val + (BITS(last.bits + last.op) >> last.bits)]; if ((unsigned)(last.bits + this.bits) <= bits) break; PULLBYTE(); } DROPBITS(last.bits); } DROPBITS(this.bits); state->length = (unsigned)this.val; if ((int)(this.op) == 0) { state->mode = LIT; break; } if (this.op & 32) { state->mode = TYPE; break; } if (this.op & 64) { strm->msg = (char *)"invalid literal/length code"; state->mode = BAD; break; } state->extra = (unsigned)(this.op) & 15; state->mode = LENEXT; fallthrough; case LENEXT: if (state->extra) { NEEDBITS(state->extra); state->length += BITS(state->extra); DROPBITS(state->extra); } state->mode = DIST; fallthrough; case DIST: for (;;) { this = state->distcode[BITS(state->distbits)]; if ((unsigned)(this.bits) <= bits) break; PULLBYTE(); } if ((this.op & 0xf0) == 0) { last = this; for (;;) { this = state->distcode[last.val + (BITS(last.bits + last.op) >> last.bits)]; if ((unsigned)(last.bits + this.bits) <= bits) break; PULLBYTE(); } DROPBITS(last.bits); } DROPBITS(this.bits); if (this.op & 64) { strm->msg = (char *)"invalid distance code"; state->mode = BAD; break; } state->offset = (unsigned)this.val; state->extra = (unsigned)(this.op) & 15; state->mode = DISTEXT; fallthrough; case DISTEXT: if (state->extra) { NEEDBITS(state->extra); state->offset += BITS(state->extra); DROPBITS(state->extra); } #ifdef INFLATE_STRICT if (state->offset > state->dmax) { strm->msg = (char *)"invalid distance too far back"; state->mode = BAD; break; } #endif if (state->offset > state->whave + out - left) { strm->msg = (char *)"invalid distance too far back"; state->mode = BAD; break; } state->mode = MATCH; fallthrough; case MATCH: if (left == 0) goto inf_leave; copy = out - left; if (state->offset > copy) { /* copy from window */ copy = state->offset - copy; if (copy > state->write) { copy -= state->write; from = state->window + (state->wsize - copy); } else from = state->window + (state->write - copy); if (copy > state->length) copy = state->length; } else { /* copy from output */ from = put - state->offset; copy = state->length; } if (copy > left) copy = left; left -= copy; state->length -= copy; do { *put++ = *from++; } while (--copy); if (state->length == 0) state->mode = LEN; break; case LIT: if (left == 0) goto inf_leave; *put++ = (unsigned char)(state->length); left--; state->mode = LEN; break; case CHECK: if (state->wrap) { NEEDBITS(32); out -= left; strm->total_out += out; state->total += out; if (INFLATE_NEED_CHECKSUM(strm) && out) strm->adler = state->check = UPDATE(state->check, put - out, out); out = left; if (( REVERSE(hold)) != state->check) { strm->msg = (char *)"incorrect data check"; state->mode = BAD; break; } INITBITS(); } state->mode = DONE; fallthrough; case DONE: ret = Z_STREAM_END; goto inf_leave; case BAD: ret = Z_DATA_ERROR; goto inf_leave; case MEM: return Z_MEM_ERROR; case SYNC: default: return Z_STREAM_ERROR; } /* Return from inflate(), updating the total counts and the check value. If there was no progress during the inflate() call, return a buffer error. Call zlib_updatewindow() to create and/or update the window state. */ inf_leave: RESTORE(); if (INFLATE_NEED_UPDATEWINDOW(strm) && (state->wsize || (state->mode < CHECK && out != strm->avail_out))) zlib_updatewindow(strm, out); in -= strm->avail_in; out -= strm->avail_out; strm->total_in += in; strm->total_out += out; state->total += out; if (INFLATE_NEED_CHECKSUM(strm) && state->wrap && out) strm->adler = state->check = UPDATE(state->check, strm->next_out - out, out); strm->data_type = state->bits + (state->last ? 64 : 0) + (state->mode == TYPE ? 128 : 0); if (flush == Z_PACKET_FLUSH && ret == Z_OK && strm->avail_out != 0 && strm->avail_in == 0) return zlib_inflateSyncPacket(strm); if (((in == 0 && out == 0) || flush == Z_FINISH) && ret == Z_OK) ret = Z_BUF_ERROR; return ret; } int zlib_inflateEnd(z_streamp strm) { if (strm == NULL || strm->state == NULL) return Z_STREAM_ERROR; return Z_OK; } /* * This subroutine adds the data at next_in/avail_in to the output history * without performing any output. The output buffer must be "caught up"; * i.e. no pending output but this should always be the case. The state must * be waiting on the start of a block (i.e. mode == TYPE or HEAD). On exit, * the output will also be caught up, and the checksum will have been updated * if need be. */ int zlib_inflateIncomp(z_stream *z) { struct inflate_state *state = (struct inflate_state *)z->state; Byte *saved_no = z->next_out; uInt saved_ao = z->avail_out; if (state->mode != TYPE && state->mode != HEAD) return Z_DATA_ERROR; /* Setup some variables to allow misuse of updateWindow */ z->avail_out = 0; z->next_out = (unsigned char*)z->next_in + z->avail_in; zlib_updatewindow(z, z->avail_in); /* Restore saved variables */ z->avail_out = saved_ao; z->next_out = saved_no; z->adler = state->check = UPDATE(state->check, z->next_in, z->avail_in); z->total_out += z->avail_in; z->total_in += z->avail_in; z->next_in += z->avail_in; state->total += z->avail_in; z->avail_in = 0; return Z_OK; } |
5 5 5 4 4 4 3 1 4 3 2 1 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 | // SPDX-License-Identifier: GPL-2.0-only /* * LED Class Core * * Copyright (C) 2005 John Lenz <lenz@cs.wisc.edu> * Copyright (C) 2005-2007 Richard Purdie <rpurdie@openedhand.com> */ #include <linux/ctype.h> #include <linux/device.h> #include <linux/err.h> #include <linux/init.h> #include <linux/kernel.h> #include <linux/leds.h> #include <linux/list.h> #include <linux/module.h> #include <linux/property.h> #include <linux/slab.h> #include <linux/spinlock.h> #include <linux/timer.h> #include <uapi/linux/uleds.h> #include <linux/of.h> #include "leds.h" static DEFINE_MUTEX(leds_lookup_lock); static LIST_HEAD(leds_lookup_list); static ssize_t brightness_show(struct device *dev, struct device_attribute *attr, char *buf) { struct led_classdev *led_cdev = dev_get_drvdata(dev); /* no lock needed for this */ led_update_brightness(led_cdev); return sprintf(buf, "%u\n", led_cdev->brightness); } static ssize_t brightness_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t size) { struct led_classdev *led_cdev = dev_get_drvdata(dev); unsigned long state; ssize_t ret; mutex_lock(&led_cdev->led_access); if (led_sysfs_is_disabled(led_cdev)) { ret = -EBUSY; goto unlock; } ret = kstrtoul(buf, 10, &state); if (ret) goto unlock; if (state == LED_OFF) led_trigger_remove(led_cdev); led_set_brightness(led_cdev, state); flush_work(&led_cdev->set_brightness_work); ret = size; unlock: mutex_unlock(&led_cdev->led_access); return ret; } static DEVICE_ATTR_RW(brightness); static ssize_t max_brightness_show(struct device *dev, struct device_attribute *attr, char *buf) { struct led_classdev *led_cdev = dev_get_drvdata(dev); return sprintf(buf, "%u\n", led_cdev->max_brightness); } static DEVICE_ATTR_RO(max_brightness); #ifdef CONFIG_LEDS_TRIGGERS static BIN_ATTR(trigger, 0644, led_trigger_read, led_trigger_write, 0); static struct bin_attribute *led_trigger_bin_attrs[] = { &bin_attr_trigger, NULL, }; static const struct attribute_group led_trigger_group = { .bin_attrs = led_trigger_bin_attrs, }; #endif static struct attribute *led_class_attrs[] = { &dev_attr_brightness.attr, &dev_attr_max_brightness.attr, NULL, }; static const struct attribute_group led_group = { .attrs = led_class_attrs, }; static const struct attribute_group *led_groups[] = { &led_group, #ifdef CONFIG_LEDS_TRIGGERS &led_trigger_group, #endif NULL, }; #ifdef CONFIG_LEDS_BRIGHTNESS_HW_CHANGED static ssize_t brightness_hw_changed_show(struct device *dev, struct device_attribute *attr, char *buf) { struct led_classdev *led_cdev = dev_get_drvdata(dev); if (led_cdev->brightness_hw_changed == -1) return -ENODATA; return sprintf(buf, "%u\n", led_cdev->brightness_hw_changed); } static DEVICE_ATTR_RO(brightness_hw_changed); static int led_add_brightness_hw_changed(struct led_classdev *led_cdev) { struct device *dev = led_cdev->dev; int ret; ret = device_create_file(dev, &dev_attr_brightness_hw_changed); if (ret) { dev_err(dev, "Error creating brightness_hw_changed\n"); return ret; } led_cdev->brightness_hw_changed_kn = sysfs_get_dirent(dev->kobj.sd, "brightness_hw_changed"); if (!led_cdev->brightness_hw_changed_kn) { dev_err(dev, "Error getting brightness_hw_changed kn\n"); device_remove_file(dev, &dev_attr_brightness_hw_changed); return -ENXIO; } return 0; } static void led_remove_brightness_hw_changed(struct led_classdev *led_cdev) { sysfs_put(led_cdev->brightness_hw_changed_kn); device_remove_file(led_cdev->dev, &dev_attr_brightness_hw_changed); } void led_classdev_notify_brightness_hw_changed(struct led_classdev *led_cdev, unsigned int brightness) { if (WARN_ON(!led_cdev->brightness_hw_changed_kn)) return; led_cdev->brightness_hw_changed = brightness; sysfs_notify_dirent(led_cdev->brightness_hw_changed_kn); } EXPORT_SYMBOL_GPL(led_classdev_notify_brightness_hw_changed); #else static int led_add_brightness_hw_changed(struct led_classdev *led_cdev) { return 0; } static void led_remove_brightness_hw_changed(struct led_classdev *led_cdev) { } #endif /** * led_classdev_suspend - suspend an led_classdev. * @led_cdev: the led_classdev to suspend. */ void led_classdev_suspend(struct led_classdev *led_cdev) { led_cdev->flags |= LED_SUSPENDED; led_set_brightness_nopm(led_cdev, 0); flush_work(&led_cdev->set_brightness_work); } EXPORT_SYMBOL_GPL(led_classdev_suspend); /** * led_classdev_resume - resume an led_classdev. * @led_cdev: the led_classdev to resume. */ void led_classdev_resume(struct led_classdev *led_cdev) { led_set_brightness_nopm(led_cdev, led_cdev->brightness); if (led_cdev->flash_resume) led_cdev->flash_resume(led_cdev); led_cdev->flags &= ~LED_SUSPENDED; } EXPORT_SYMBOL_GPL(led_classdev_resume); #ifdef CONFIG_PM_SLEEP static int led_suspend(struct device *dev) { struct led_classdev *led_cdev = dev_get_drvdata(dev); if (led_cdev->flags & LED_CORE_SUSPENDRESUME) led_classdev_suspend(led_cdev); return 0; } static int led_resume(struct device *dev) { struct led_classdev *led_cdev = dev_get_drvdata(dev); if (led_cdev->flags & LED_CORE_SUSPENDRESUME) led_classdev_resume(led_cdev); return 0; } #endif static SIMPLE_DEV_PM_OPS(leds_class_dev_pm_ops, led_suspend, led_resume); static struct led_classdev *led_module_get(struct device *led_dev) { struct led_classdev *led_cdev; if (!led_dev) return ERR_PTR(-EPROBE_DEFER); led_cdev = dev_get_drvdata(led_dev); if (!try_module_get(led_cdev->dev->parent->driver->owner)) { put_device(led_cdev->dev); return ERR_PTR(-ENODEV); } return led_cdev; } static const struct class leds_class = { .name = "leds", .dev_groups = led_groups, .pm = &leds_class_dev_pm_ops, }; /** * of_led_get() - request a LED device via the LED framework * @np: device node to get the LED device from * @index: the index of the LED * * Returns the LED device parsed from the phandle specified in the "leds" * property of a device tree node or a negative error-code on failure. */ struct led_classdev *of_led_get(struct device_node *np, int index) { struct device *led_dev; struct device_node *led_node; led_node = of_parse_phandle(np, "leds", index); if (!led_node) return ERR_PTR(-ENOENT); led_dev = class_find_device_by_of_node(&leds_class, led_node); of_node_put(led_node); put_device(led_dev); return led_module_get(led_dev); } EXPORT_SYMBOL_GPL(of_led_get); /** * led_put() - release a LED device * @led_cdev: LED device */ void led_put(struct led_classdev *led_cdev) { module_put(led_cdev->dev->parent->driver->owner); put_device(led_cdev->dev); } EXPORT_SYMBOL_GPL(led_put); static void devm_led_release(struct device *dev, void *res) { struct led_classdev **p = res; led_put(*p); } static struct led_classdev *__devm_led_get(struct device *dev, struct led_classdev *led) { struct led_classdev **dr; dr = devres_alloc(devm_led_release, sizeof(struct led_classdev *), GFP_KERNEL); if (!dr) { led_put(led); return ERR_PTR(-ENOMEM); } *dr = led; devres_add(dev, dr); return led; } /** * devm_of_led_get - Resource-managed request of a LED device * @dev: LED consumer * @index: index of the LED to obtain in the consumer * * The device node of the device is parse to find the request LED device. * The LED device returned from this function is automatically released * on driver detach. * * @return a pointer to a LED device or ERR_PTR(errno) on failure. */ struct led_classdev *__must_check devm_of_led_get(struct device *dev, int index) { struct led_classdev *led; if (!dev) return ERR_PTR(-EINVAL); led = of_led_get(dev->of_node, index); if (IS_ERR(led)) return led; return __devm_led_get(dev, led); } EXPORT_SYMBOL_GPL(devm_of_led_get); /** * led_get() - request a LED device via the LED framework * @dev: device for which to get the LED device * @con_id: name of the LED from the device's point of view * * @return a pointer to a LED device or ERR_PTR(errno) on failure. */ struct led_classdev *led_get(struct device *dev, char *con_id) { struct led_lookup_data *lookup; const char *provider = NULL; struct device *led_dev; mutex_lock(&leds_lookup_lock); list_for_each_entry(lookup, &leds_lookup_list, list) { if (!strcmp(lookup->dev_id, dev_name(dev)) && !strcmp(lookup->con_id, con_id)) { provider = kstrdup_const(lookup->provider, GFP_KERNEL); break; } } mutex_unlock(&leds_lookup_lock); if (!provider) return ERR_PTR(-ENOENT); led_dev = class_find_device_by_name(&leds_class, provider); kfree_const(provider); return led_module_get(led_dev); } EXPORT_SYMBOL_GPL(led_get); /** * devm_led_get() - request a LED device via the LED framework * @dev: device for which to get the LED device * @con_id: name of the LED from the device's point of view * * The LED device returned from this function is automatically released * on driver detach. * * @return a pointer to a LED device or ERR_PTR(errno) on failure. */ struct led_classdev *devm_led_get(struct device *dev, char *con_id) { struct led_classdev *led; led = led_get(dev, con_id); if (IS_ERR(led)) return led; return __devm_led_get(dev, led); } EXPORT_SYMBOL_GPL(devm_led_get); /** * led_add_lookup() - Add a LED lookup table entry * @led_lookup: the lookup table entry to add * * Add a LED lookup table entry. On systems without devicetree the lookup table * is used by led_get() to find LEDs. */ void led_add_lookup(struct led_lookup_data *led_lookup) { mutex_lock(&leds_lookup_lock); list_add_tail(&led_lookup->list, &leds_lookup_list); mutex_unlock(&leds_lookup_lock); } EXPORT_SYMBOL_GPL(led_add_lookup); /** * led_remove_lookup() - Remove a LED lookup table entry * @led_lookup: the lookup table entry to remove */ void led_remove_lookup(struct led_lookup_data *led_lookup) { mutex_lock(&leds_lookup_lock); list_del(&led_lookup->list); mutex_unlock(&leds_lookup_lock); } EXPORT_SYMBOL_GPL(led_remove_lookup); /** * devm_of_led_get_optional - Resource-managed request of an optional LED device * @dev: LED consumer * @index: index of the LED to obtain in the consumer * * The device node of the device is parsed to find the requested LED device. * The LED device returned from this function is automatically released * on driver detach. * * @return a pointer to a LED device, ERR_PTR(errno) on failure and NULL if the * led was not found. */ struct led_classdev *__must_check devm_of_led_get_optional(struct device *dev, int index) { struct led_classdev *led; led = devm_of_led_get(dev, index); if (IS_ERR(led) && PTR_ERR(led) == -ENOENT) return NULL; return led; } EXPORT_SYMBOL_GPL(devm_of_led_get_optional); static int led_classdev_next_name(const char *init_name, char *name, size_t len) { unsigned int i = 0; int ret = 0; struct device *dev; strscpy(name, init_name, len); while ((ret < len) && (dev = class_find_device_by_name(&leds_class, name))) { put_device(dev); ret = snprintf(name, len, "%s_%u", init_name, ++i); } if (ret >= len) return -ENOMEM; return i; } /** * led_classdev_register_ext - register a new object of led_classdev class * with init data. * * @parent: parent of LED device * @led_cdev: the led_classdev structure for this device. * @init_data: LED class device initialization data */ int led_classdev_register_ext(struct device *parent, struct led_classdev *led_cdev, struct led_init_data *init_data) { char composed_name[LED_MAX_NAME_SIZE]; char final_name[LED_MAX_NAME_SIZE]; const char *proposed_name = composed_name; int ret; if (init_data) { if (init_data->devname_mandatory && !init_data->devicename) { dev_err(parent, "Mandatory device name is missing"); return -EINVAL; } ret = led_compose_name(parent, init_data, composed_name); if (ret < 0) return ret; if (init_data->fwnode) { fwnode_property_read_string(init_data->fwnode, "linux,default-trigger", &led_cdev->default_trigger); if (fwnode_property_present(init_data->fwnode, "retain-state-shutdown")) led_cdev->flags |= LED_RETAIN_AT_SHUTDOWN; fwnode_property_read_u32(init_data->fwnode, "max-brightness", &led_cdev->max_brightness); if (fwnode_property_present(init_data->fwnode, "color")) fwnode_property_read_u32(init_data->fwnode, "color", &led_cdev->color); } } else { proposed_name = led_cdev->name; } ret = led_classdev_next_name(proposed_name, final_name, sizeof(final_name)); if (ret < 0) return ret; if (led_cdev->color >= LED_COLOR_ID_MAX) dev_warn(parent, "LED %s color identifier out of range\n", final_name); mutex_init(&led_cdev->led_access); mutex_lock(&led_cdev->led_access); led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, led_cdev, led_cdev->groups, "%s", final_name); if (IS_ERR(led_cdev->dev)) { mutex_unlock(&led_cdev->led_access); return PTR_ERR(led_cdev->dev); } if (init_data && init_data->fwnode) device_set_node(led_cdev->dev, init_data->fwnode); if (ret) dev_warn(parent, "Led %s renamed to %s due to name collision", proposed_name, dev_name(led_cdev->dev)); if (led_cdev->flags & LED_BRIGHT_HW_CHANGED) { ret = led_add_brightness_hw_changed(led_cdev); if (ret) { device_unregister(led_cdev->dev); led_cdev->dev = NULL; mutex_unlock(&led_cdev->led_access); return ret; } } led_cdev->work_flags = 0; #ifdef CONFIG_LEDS_TRIGGERS init_rwsem(&led_cdev->trigger_lock); #endif #ifdef CONFIG_LEDS_BRIGHTNESS_HW_CHANGED led_cdev->brightness_hw_changed = -1; #endif /* add to the list of leds */ down_write(&leds_list_lock); list_add_tail(&led_cdev->node, &leds_list); up_write(&leds_list_lock); if (!led_cdev->max_brightness) led_cdev->max_brightness = LED_FULL; led_update_brightness(led_cdev); led_init_core(led_cdev); #ifdef CONFIG_LEDS_TRIGGERS /* * If no default trigger was given and hw_control_trigger is set, * make it the default trigger. */ if (!led_cdev->default_trigger && led_cdev->hw_control_trigger) led_cdev->default_trigger = led_cdev->hw_control_trigger; led_trigger_set_default(led_cdev); #endif mutex_unlock(&led_cdev->led_access); dev_dbg(parent, "Registered led device: %s\n", led_cdev->name); return 0; } EXPORT_SYMBOL_GPL(led_classdev_register_ext); /** * led_classdev_unregister - unregisters a object of led_properties class. * @led_cdev: the led device to unregister * * Unregisters a previously registered via led_classdev_register object. */ void led_classdev_unregister(struct led_classdev *led_cdev) { if (IS_ERR_OR_NULL(led_cdev->dev)) return; #ifdef CONFIG_LEDS_TRIGGERS down_write(&led_cdev->trigger_lock); if (led_cdev->trigger) led_trigger_set(led_cdev, NULL); up_write(&led_cdev->trigger_lock); #endif led_cdev->flags |= LED_UNREGISTERING; /* Stop blinking */ led_stop_software_blink(led_cdev); if (!(led_cdev->flags & LED_RETAIN_AT_SHUTDOWN)) led_set_brightness(led_cdev, LED_OFF); flush_work(&led_cdev->set_brightness_work); if (led_cdev->flags & LED_BRIGHT_HW_CHANGED) led_remove_brightness_hw_changed(led_cdev); device_unregister(led_cdev->dev); down_write(&leds_list_lock); list_del(&led_cdev->node); up_write(&leds_list_lock); mutex_destroy(&led_cdev->led_access); } EXPORT_SYMBOL_GPL(led_classdev_unregister); static void devm_led_classdev_release(struct device *dev, void *res) { led_classdev_unregister(*(struct led_classdev **)res); } /** * devm_led_classdev_register_ext - resource managed led_classdev_register_ext() * * @parent: parent of LED device * @led_cdev: the led_classdev structure for this device. * @init_data: LED class device initialization data */ int devm_led_classdev_register_ext(struct device *parent, struct led_classdev *led_cdev, struct led_init_data *init_data) { struct led_classdev **dr; int rc; dr = devres_alloc(devm_led_classdev_release, sizeof(*dr), GFP_KERNEL); if (!dr) return -ENOMEM; rc = led_classdev_register_ext(parent, led_cdev, init_data); if (rc) { devres_free(dr); return rc; } *dr = led_cdev; devres_add(parent, dr); return 0; } EXPORT_SYMBOL_GPL(devm_led_classdev_register_ext); static int devm_led_classdev_match(struct device *dev, void *res, void *data) { struct led_classdev **p = res; if (WARN_ON(!p || !*p)) return 0; return *p == data; } /** * devm_led_classdev_unregister() - resource managed led_classdev_unregister() * @dev: The device to unregister. * @led_cdev: the led_classdev structure for this device. */ void devm_led_classdev_unregister(struct device *dev, struct led_classdev *led_cdev) { WARN_ON(devres_release(dev, devm_led_classdev_release, devm_led_classdev_match, led_cdev)); } EXPORT_SYMBOL_GPL(devm_led_classdev_unregister); static int __init leds_init(void) { return class_register(&leds_class); } static void __exit leds_exit(void) { class_unregister(&leds_class); } subsys_initcall(leds_init); module_exit(leds_exit); MODULE_AUTHOR("John Lenz, Richard Purdie"); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("LED Class Interface"); |
4 4 4 1 1 1 5 1 4 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 | // SPDX-License-Identifier: GPL-2.0 #include <linux/fs.h> #include <linux/sched.h> #include <linux/slab.h> #include "internal.h" #include "mount.h" static DEFINE_SPINLOCK(pin_lock); void pin_remove(struct fs_pin *pin) { spin_lock(&pin_lock); hlist_del_init(&pin->m_list); hlist_del_init(&pin->s_list); spin_unlock(&pin_lock); spin_lock_irq(&pin->wait.lock); pin->done = 1; wake_up_locked(&pin->wait); spin_unlock_irq(&pin->wait.lock); } void pin_insert(struct fs_pin *pin, struct vfsmount *m) { spin_lock(&pin_lock); hlist_add_head(&pin->s_list, &m->mnt_sb->s_pins); hlist_add_head(&pin->m_list, &real_mount(m)->mnt_pins); spin_unlock(&pin_lock); } void pin_kill(struct fs_pin *p) { wait_queue_entry_t wait; if (!p) { rcu_read_unlock(); return; } init_wait(&wait); spin_lock_irq(&p->wait.lock); if (likely(!p->done)) { p->done = -1; spin_unlock_irq(&p->wait.lock); rcu_read_unlock(); p->kill(p); return; } if (p->done > 0) { spin_unlock_irq(&p->wait.lock); rcu_read_unlock(); return; } __add_wait_queue(&p->wait, &wait); while (1) { set_current_state(TASK_UNINTERRUPTIBLE); spin_unlock_irq(&p->wait.lock); rcu_read_unlock(); schedule(); rcu_read_lock(); if (likely(list_empty(&wait.entry))) break; /* OK, we know p couldn't have been freed yet */ spin_lock_irq(&p->wait.lock); if (p->done > 0) { spin_unlock_irq(&p->wait.lock); break; } } rcu_read_unlock(); } void mnt_pin_kill(struct mount *m) { while (1) { struct hlist_node *p; rcu_read_lock(); p = READ_ONCE(m->mnt_pins.first); if (!p) { rcu_read_unlock(); break; } pin_kill(hlist_entry(p, struct fs_pin, m_list)); } } void group_pin_kill(struct hlist_head *p) { while (1) { struct hlist_node *q; rcu_read_lock(); q = READ_ONCE(p->first); if (!q) { rcu_read_unlock(); break; } pin_kill(hlist_entry(q, struct fs_pin, s_list)); } } |
1 5 3 2 2 3 5 1 4 3 1 3 10 9 1 9 5 1 1 3 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 | // SPDX-License-Identifier: GPL-2.0-or-later /* * vimc-capture.c Virtual Media Controller Driver * * Copyright (C) 2015-2017 Helen Koike <helen.fornazier@gmail.com> */ #include <media/v4l2-ioctl.h> #include <media/videobuf2-core.h> #include <media/videobuf2-dma-contig.h> #include <media/videobuf2-vmalloc.h> #include "vimc-common.h" #include "vimc-streamer.h" struct vimc_capture_device { struct vimc_ent_device ved; struct video_device vdev; struct v4l2_pix_format format; struct vb2_queue queue; struct list_head buf_list; /* * NOTE: in a real driver, a spin lock must be used to access the * queue because the frames are generated from a hardware interruption * and the isr is not allowed to sleep. * Even if it is not necessary a spinlock in the vimc driver, we * use it here as a code reference */ spinlock_t qlock; struct mutex lock; u32 sequence; struct vimc_stream stream; struct media_pad pad; }; static const struct v4l2_pix_format fmt_default = { .width = 640, .height = 480, .pixelformat = V4L2_PIX_FMT_RGB24, .field = V4L2_FIELD_NONE, .colorspace = V4L2_COLORSPACE_SRGB, }; struct vimc_capture_buffer { /* * struct vb2_v4l2_buffer must be the first element * the videobuf2 framework will allocate this struct based on * buf_struct_size and use the first sizeof(struct vb2_buffer) bytes of * memory as a vb2_buffer */ struct vb2_v4l2_buffer vb2; struct list_head list; }; static int vimc_capture_querycap(struct file *file, void *priv, struct v4l2_capability *cap) { strscpy(cap->driver, VIMC_PDEV_NAME, sizeof(cap->driver)); strscpy(cap->card, KBUILD_MODNAME, sizeof(cap->card)); snprintf(cap->bus_info, sizeof(cap->bus_info), "platform:%s", VIMC_PDEV_NAME); return 0; } static void vimc_capture_get_format(struct vimc_ent_device *ved, struct v4l2_pix_format *fmt) { struct vimc_capture_device *vcapture = container_of(ved, struct vimc_capture_device, ved); *fmt = vcapture->format; } static int vimc_capture_g_fmt_vid_cap(struct file *file, void *priv, struct v4l2_format *f) { struct vimc_capture_device *vcapture = video_drvdata(file); f->fmt.pix = vcapture->format; return 0; } static int vimc_capture_try_fmt_vid_cap(struct file *file, void *priv, struct v4l2_format *f) { struct v4l2_pix_format *format = &f->fmt.pix; const struct vimc_pix_map *vpix; format->width = clamp_t(u32, format->width, VIMC_FRAME_MIN_WIDTH, VIMC_FRAME_MAX_WIDTH) & ~1; format->height = clamp_t(u32, format->height, VIMC_FRAME_MIN_HEIGHT, VIMC_FRAME_MAX_HEIGHT) & ~1; /* Don't accept a pixelformat that is not on the table */ vpix = vimc_pix_map_by_pixelformat(format->pixelformat); if (!vpix) { format->pixelformat = fmt_default.pixelformat; vpix = vimc_pix_map_by_pixelformat(format->pixelformat); } /* TODO: Add support for custom bytesperline values */ format->bytesperline = format->width * vpix->bpp; format->sizeimage = format->bytesperline * format->height; if (format->field == V4L2_FIELD_ANY) format->field = fmt_default.field; vimc_colorimetry_clamp(format); if (format->colorspace == V4L2_COLORSPACE_DEFAULT) format->colorspace = fmt_default.colorspace; return 0; } static int vimc_capture_s_fmt_vid_cap(struct file *file, void *priv, struct v4l2_format *f) { struct vimc_capture_device *vcapture = video_drvdata(file); int ret; /* Do not change the format while stream is on */ if (vb2_is_busy(&vcapture->queue)) return -EBUSY; ret = vimc_capture_try_fmt_vid_cap(file, priv, f); if (ret) return ret; dev_dbg(vcapture->ved.dev, "%s: format update: " "old:%dx%d (0x%x, %d, %d, %d, %d) " "new:%dx%d (0x%x, %d, %d, %d, %d)\n", vcapture->vdev.name, /* old */ vcapture->format.width, vcapture->format.height, vcapture->format.pixelformat, vcapture->format.colorspace, vcapture->format.quantization, vcapture->format.xfer_func, vcapture->format.ycbcr_enc, /* new */ f->fmt.pix.width, f->fmt.pix.height, f->fmt.pix.pixelformat, f->fmt.pix.colorspace, f->fmt.pix.quantization, f->fmt.pix.xfer_func, f->fmt.pix.ycbcr_enc); vcapture->format = f->fmt.pix; return 0; } static int vimc_capture_enum_fmt_vid_cap(struct file *file, void *priv, struct v4l2_fmtdesc *f) { const struct vimc_pix_map *vpix; if (f->mbus_code) { if (f->index > 0) return -EINVAL; vpix = vimc_pix_map_by_code(f->mbus_code); } else { vpix = vimc_pix_map_by_index(f->index); } if (!vpix) return -EINVAL; f->pixelformat = vpix->pixelformat; return 0; } static int vimc_capture_enum_framesizes(struct file *file, void *fh, struct v4l2_frmsizeenum *fsize) { const struct vimc_pix_map *vpix; if (fsize->index) return -EINVAL; /* Only accept code in the pix map table */ vpix = vimc_pix_map_by_code(fsize->pixel_format); if (!vpix) return -EINVAL; fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS; fsize->stepwise.min_width = VIMC_FRAME_MIN_WIDTH; fsize->stepwise.max_width = VIMC_FRAME_MAX_WIDTH; fsize->stepwise.min_height = VIMC_FRAME_MIN_HEIGHT; fsize->stepwise.max_height = VIMC_FRAME_MAX_HEIGHT; fsize->stepwise.step_width = 1; fsize->stepwise.step_height = 1; return 0; } static const struct v4l2_file_operations vimc_capture_fops = { .owner = THIS_MODULE, .open = v4l2_fh_open, .release = vb2_fop_release, .read = vb2_fop_read, .poll = vb2_fop_poll, .unlocked_ioctl = video_ioctl2, .mmap = vb2_fop_mmap, }; static const struct v4l2_ioctl_ops vimc_capture_ioctl_ops = { .vidioc_querycap = vimc_capture_querycap, .vidioc_g_fmt_vid_cap = vimc_capture_g_fmt_vid_cap, .vidioc_s_fmt_vid_cap = vimc_capture_s_fmt_vid_cap, .vidioc_try_fmt_vid_cap = vimc_capture_try_fmt_vid_cap, .vidioc_enum_fmt_vid_cap = vimc_capture_enum_fmt_vid_cap, .vidioc_enum_framesizes = vimc_capture_enum_framesizes, .vidioc_reqbufs = vb2_ioctl_reqbufs, .vidioc_create_bufs = vb2_ioctl_create_bufs, .vidioc_prepare_buf = vb2_ioctl_prepare_buf, .vidioc_querybuf = vb2_ioctl_querybuf, .vidioc_qbuf = vb2_ioctl_qbuf, .vidioc_dqbuf = vb2_ioctl_dqbuf, .vidioc_expbuf = vb2_ioctl_expbuf, .vidioc_streamon = vb2_ioctl_streamon, .vidioc_streamoff = vb2_ioctl_streamoff, }; static void vimc_capture_return_all_buffers(struct vimc_capture_device *vcapture, enum vb2_buffer_state state) { struct vimc_capture_buffer *vbuf, *node; spin_lock(&vcapture->qlock); list_for_each_entry_safe(vbuf, node, &vcapture->buf_list, list) { list_del(&vbuf->list); vb2_buffer_done(&vbuf->vb2.vb2_buf, state); } spin_unlock(&vcapture->qlock); } static int vimc_capture_start_streaming(struct vb2_queue *vq, unsigned int count) { struct vimc_capture_device *vcapture = vb2_get_drv_priv(vq); int ret; vcapture->sequence = 0; /* Start the media pipeline */ ret = video_device_pipeline_start(&vcapture->vdev, &vcapture->stream.pipe); if (ret) { vimc_capture_return_all_buffers(vcapture, VB2_BUF_STATE_QUEUED); return ret; } ret = vimc_streamer_s_stream(&vcapture->stream, &vcapture->ved, 1); if (ret) { video_device_pipeline_stop(&vcapture->vdev); vimc_capture_return_all_buffers(vcapture, VB2_BUF_STATE_QUEUED); return ret; } return 0; } /* * Stop the stream engine. Any remaining buffers in the stream queue are * dequeued and passed on to the vb2 framework marked as STATE_ERROR. */ static void vimc_capture_stop_streaming(struct vb2_queue *vq) { struct vimc_capture_device *vcapture = vb2_get_drv_priv(vq); vimc_streamer_s_stream(&vcapture->stream, &vcapture->ved, 0); /* Stop the media pipeline */ video_device_pipeline_stop(&vcapture->vdev); /* Release all active buffers */ vimc_capture_return_all_buffers(vcapture, VB2_BUF_STATE_ERROR); } static void vimc_capture_buf_queue(struct vb2_buffer *vb2_buf) { struct vimc_capture_device *vcapture = vb2_get_drv_priv(vb2_buf->vb2_queue); struct vimc_capture_buffer *buf = container_of(vb2_buf, struct vimc_capture_buffer, vb2.vb2_buf); spin_lock(&vcapture->qlock); list_add_tail(&buf->list, &vcapture->buf_list); spin_unlock(&vcapture->qlock); } static int vimc_capture_queue_setup(struct vb2_queue *vq, unsigned int *nbuffers, unsigned int *nplanes, unsigned int sizes[], struct device *alloc_devs[]) { struct vimc_capture_device *vcapture = vb2_get_drv_priv(vq); if (*nplanes) return sizes[0] < vcapture->format.sizeimage ? -EINVAL : 0; /* We don't support multiplanes for now */ *nplanes = 1; sizes[0] = vcapture->format.sizeimage; return 0; } static int vimc_capture_buffer_prepare(struct vb2_buffer *vb) { struct vimc_capture_device *vcapture = vb2_get_drv_priv(vb->vb2_queue); unsigned long size = vcapture->format.sizeimage; if (vb2_plane_size(vb, 0) < size) { dev_err(vcapture->ved.dev, "%s: buffer too small (%lu < %lu)\n", vcapture->vdev.name, vb2_plane_size(vb, 0), size); return -EINVAL; } return 0; } static const struct vb2_ops vimc_capture_qops = { .start_streaming = vimc_capture_start_streaming, .stop_streaming = vimc_capture_stop_streaming, .buf_queue = vimc_capture_buf_queue, .queue_setup = vimc_capture_queue_setup, .buf_prepare = vimc_capture_buffer_prepare, /* * Since q->lock is set we can use the standard * vb2_ops_wait_prepare/finish helper functions. */ .wait_prepare = vb2_ops_wait_prepare, .wait_finish = vb2_ops_wait_finish, }; static const struct media_entity_operations vimc_capture_mops = { .link_validate = vimc_vdev_link_validate, }; static void vimc_capture_release(struct vimc_ent_device *ved) { struct vimc_capture_device *vcapture = container_of(ved, struct vimc_capture_device, ved); media_entity_cleanup(vcapture->ved.ent); kfree(vcapture); } static void vimc_capture_unregister(struct vimc_ent_device *ved) { struct vimc_capture_device *vcapture = container_of(ved, struct vimc_capture_device, ved); vb2_video_unregister_device(&vcapture->vdev); } static void *vimc_capture_process_frame(struct vimc_ent_device *ved, const void *frame) { struct vimc_capture_device *vcapture = container_of(ved, struct vimc_capture_device, ved); struct vimc_capture_buffer *vimc_buf; void *vbuf; spin_lock(&vcapture->qlock); /* Get the first entry of the list */ vimc_buf = list_first_entry_or_null(&vcapture->buf_list, typeof(*vimc_buf), list); if (!vimc_buf) { spin_unlock(&vcapture->qlock); return ERR_PTR(-EAGAIN); } /* Remove this entry from the list */ list_del(&vimc_buf->list); spin_unlock(&vcapture->qlock); /* Fill the buffer */ vimc_buf->vb2.vb2_buf.timestamp = ktime_get_ns(); vimc_buf->vb2.sequence = vcapture->sequence++; vimc_buf->vb2.field = vcapture->format.field; vbuf = vb2_plane_vaddr(&vimc_buf->vb2.vb2_buf, 0); memcpy(vbuf, frame, vcapture->format.sizeimage); /* Set it as ready */ vb2_set_plane_payload(&vimc_buf->vb2.vb2_buf, 0, vcapture->format.sizeimage); vb2_buffer_done(&vimc_buf->vb2.vb2_buf, VB2_BUF_STATE_DONE); return NULL; } static struct vimc_ent_device *vimc_capture_add(struct vimc_device *vimc, const char *vcfg_name) { struct v4l2_device *v4l2_dev = &vimc->v4l2_dev; const struct vimc_pix_map *vpix; struct vimc_capture_device *vcapture; struct video_device *vdev; struct vb2_queue *q; int ret; /* Allocate the vimc_capture_device struct */ vcapture = kzalloc(sizeof(*vcapture), GFP_KERNEL); if (!vcapture) return ERR_PTR(-ENOMEM); /* Initialize the media entity */ vcapture->vdev.entity.name = vcfg_name; vcapture->vdev.entity.function = MEDIA_ENT_F_IO_V4L; vcapture->pad.flags = MEDIA_PAD_FL_SINK; ret = media_entity_pads_init(&vcapture->vdev.entity, 1, &vcapture->pad); if (ret) goto err_free_vcapture; /* Initialize the lock */ mutex_init(&vcapture->lock); /* Initialize the vb2 queue */ q = &vcapture->queue; q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE; q->io_modes = VB2_MMAP | VB2_DMABUF; if (vimc_allocator == VIMC_ALLOCATOR_VMALLOC) q->io_modes |= VB2_USERPTR; q->drv_priv = vcapture; q->buf_struct_size = sizeof(struct vimc_capture_buffer); q->ops = &vimc_capture_qops; q->mem_ops = vimc_allocator == VIMC_ALLOCATOR_DMA_CONTIG ? &vb2_dma_contig_memops : &vb2_vmalloc_memops; q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC; q->min_queued_buffers = 2; q->lock = &vcapture->lock; q->dev = v4l2_dev->dev; ret = vb2_queue_init(q); if (ret) { dev_err(vimc->mdev.dev, "%s: vb2 queue init failed (err=%d)\n", vcfg_name, ret); goto err_clean_m_ent; } /* Initialize buffer list and its lock */ INIT_LIST_HEAD(&vcapture->buf_list); spin_lock_init(&vcapture->qlock); /* Set default frame format */ vcapture->format = fmt_default; vpix = vimc_pix_map_by_pixelformat(vcapture->format.pixelformat); vcapture->format.bytesperline = vcapture->format.width * vpix->bpp; vcapture->format.sizeimage = vcapture->format.bytesperline * vcapture->format.height; /* Fill the vimc_ent_device struct */ vcapture->ved.ent = &vcapture->vdev.entity; vcapture->ved.process_frame = vimc_capture_process_frame; vcapture->ved.vdev_get_format = vimc_capture_get_format; vcapture->ved.dev = vimc->mdev.dev; /* Initialize the video_device struct */ vdev = &vcapture->vdev; vdev->device_caps = V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_STREAMING | V4L2_CAP_IO_MC; vdev->entity.ops = &vimc_capture_mops; vdev->release = video_device_release_empty; vdev->fops = &vimc_capture_fops; vdev->ioctl_ops = &vimc_capture_ioctl_ops; vdev->lock = &vcapture->lock; vdev->queue = q; vdev->v4l2_dev = v4l2_dev; vdev->vfl_dir = VFL_DIR_RX; strscpy(vdev->name, vcfg_name, sizeof(vdev->name)); video_set_drvdata(vdev, &vcapture->ved); /* Register the video_device with the v4l2 and the media framework */ ret = video_register_device(vdev, VFL_TYPE_VIDEO, -1); if (ret) { dev_err(vimc->mdev.dev, "%s: video register failed (err=%d)\n", vcapture->vdev.name, ret); goto err_clean_m_ent; } return &vcapture->ved; err_clean_m_ent: media_entity_cleanup(&vcapture->vdev.entity); err_free_vcapture: kfree(vcapture); return ERR_PTR(ret); } struct vimc_ent_type vimc_capture_type = { .add = vimc_capture_add, .unregister = vimc_capture_unregister, .release = vimc_capture_release }; |
6 30 30 62 62 3 64 66 60 67 66 63 3252 1 1 1 1 1 1 1 1 1 7 8 8 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 | // SPDX-License-Identifier: GPL-2.0 /* * Copyright(C) 2005-2006, Thomas Gleixner <tglx@linutronix.de> * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007 Timesys Corp., Thomas Gleixner * * NOHZ implementation for low and high resolution timers * * Started by: Thomas Gleixner and Ingo Molnar */ #include <linux/cpu.h> #include <linux/err.h> #include <linux/hrtimer.h> #include <linux/interrupt.h> #include <linux/kernel_stat.h> #include <linux/percpu.h> #include <linux/nmi.h> #include <linux/profile.h> #include <linux/sched/signal.h> #include <linux/sched/clock.h> #include <linux/sched/stat.h> #include <linux/sched/nohz.h> #include <linux/sched/loadavg.h> #include <linux/module.h> #include <linux/irq_work.h> #include <linux/posix-timers.h> #include <linux/context_tracking.h> #include <linux/mm.h> #include <asm/irq_regs.h> #include "tick-internal.h" #include <trace/events/timer.h> /* * Per-CPU nohz control structure */ static DEFINE_PER_CPU(struct tick_sched, tick_cpu_sched); struct tick_sched *tick_get_tick_sched(int cpu) { return &per_cpu(tick_cpu_sched, cpu); } /* * The time when the last jiffy update happened. Write access must hold * jiffies_lock and jiffies_seq. tick_nohz_next_event() needs to get a * consistent view of jiffies and last_jiffies_update. */ static ktime_t last_jiffies_update; /* * Must be called with interrupts disabled ! */ static void tick_do_update_jiffies64(ktime_t now) { unsigned long ticks = 1; ktime_t delta, nextp; /* * 64-bit can do a quick check without holding the jiffies lock and * without looking at the sequence count. The smp_load_acquire() * pairs with the update done later in this function. * * 32-bit cannot do that because the store of 'tick_next_period' * consists of two 32-bit stores, and the first store could be * moved by the CPU to a random point in the future. */ if (IS_ENABLED(CONFIG_64BIT)) { if (ktime_before(now, smp_load_acquire(&tick_next_period))) return; } else { unsigned int seq; /* * Avoid contention on 'jiffies_lock' and protect the quick * check with the sequence count. */ do { seq = read_seqcount_begin(&jiffies_seq); nextp = tick_next_period; } while (read_seqcount_retry(&jiffies_seq, seq)); if (ktime_before(now, nextp)) return; } /* Quick check failed, i.e. update is required. */ raw_spin_lock(&jiffies_lock); /* * Re-evaluate with the lock held. Another CPU might have done the * update already. */ if (ktime_before(now, tick_next_period)) { raw_spin_unlock(&jiffies_lock); return; } write_seqcount_begin(&jiffies_seq); delta = ktime_sub(now, tick_next_period); if (unlikely(delta >= TICK_NSEC)) { /* Slow path for long idle sleep times */ s64 incr = TICK_NSEC; ticks += ktime_divns(delta, incr); last_jiffies_update = ktime_add_ns(last_jiffies_update, incr * ticks); } else { last_jiffies_update = ktime_add_ns(last_jiffies_update, TICK_NSEC); } /* Advance jiffies to complete the 'jiffies_seq' protected job */ jiffies_64 += ticks; /* Keep the tick_next_period variable up to date */ nextp = ktime_add_ns(last_jiffies_update, TICK_NSEC); if (IS_ENABLED(CONFIG_64BIT)) { /* * Pairs with smp_load_acquire() in the lockless quick * check above, and ensures that the update to 'jiffies_64' is * not reordered vs. the store to 'tick_next_period', neither * by the compiler nor by the CPU. */ smp_store_release(&tick_next_period, nextp); } else { /* * A plain store is good enough on 32-bit, as the quick check * above is protected by the sequence count. */ tick_next_period = nextp; } /* * Release the sequence count. calc_global_load() below is not * protected by it, but 'jiffies_lock' needs to be held to prevent * concurrent invocations. */ write_seqcount_end(&jiffies_seq); calc_global_load(); raw_spin_unlock(&jiffies_lock); update_wall_time(); } /* * Initialize and return retrieve the jiffies update. */ static ktime_t tick_init_jiffy_update(void) { ktime_t period; raw_spin_lock(&jiffies_lock); write_seqcount_begin(&jiffies_seq); /* Have we started the jiffies update yet ? */ if (last_jiffies_update == 0) { u32 rem; /* * Ensure that the tick is aligned to a multiple of * TICK_NSEC. */ div_u64_rem(tick_next_period, TICK_NSEC, &rem); if (rem) tick_next_period += TICK_NSEC - rem; last_jiffies_update = tick_next_period; } period = last_jiffies_update; write_seqcount_end(&jiffies_seq); raw_spin_unlock(&jiffies_lock); return period; } static inline int tick_sched_flag_test(struct tick_sched *ts, unsigned long flag) { return !!(ts->flags & flag); } static inline void tick_sched_flag_set(struct tick_sched *ts, unsigned long flag) { lockdep_assert_irqs_disabled(); ts->flags |= flag; } static inline void tick_sched_flag_clear(struct tick_sched *ts, unsigned long flag) { lockdep_assert_irqs_disabled(); ts->flags &= ~flag; } #define MAX_STALLED_JIFFIES 5 static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now) { int cpu = smp_processor_id(); /* * Check if the do_timer duty was dropped. We don't care about * concurrency: This happens only when the CPU in charge went * into a long sleep. If two CPUs happen to assign themselves to * this duty, then the jiffies update is still serialized by * 'jiffies_lock'. * * If nohz_full is enabled, this should not happen because the * 'tick_do_timer_cpu' CPU never relinquishes. */ if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) { #ifdef CONFIG_NO_HZ_FULL WARN_ON_ONCE(tick_nohz_full_running); #endif tick_do_timer_cpu = cpu; } /* Check if jiffies need an update */ if (tick_do_timer_cpu == cpu) tick_do_update_jiffies64(now); /* * If the jiffies update stalled for too long (timekeeper in stop_machine() * or VMEXIT'ed for several msecs), force an update. */ if (ts->last_tick_jiffies != jiffies) { ts->stalled_jiffies = 0; ts->last_tick_jiffies = READ_ONCE(jiffies); } else { if (++ts->stalled_jiffies == MAX_STALLED_JIFFIES) { tick_do_update_jiffies64(now); ts->stalled_jiffies = 0; ts->last_tick_jiffies = READ_ONCE(jiffies); } } if (tick_sched_flag_test(ts, TS_FLAG_INIDLE)) ts->got_idle_tick = 1; } static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs) { /* * When we are idle and the tick is stopped, we have to touch * the watchdog as we might not schedule for a really long * time. This happens on completely idle SMP systems while * waiting on the login prompt. We also increment the "start of * idle" jiffy stamp so the idle accounting adjustment we do * when we go busy again does not account too many ticks. */ if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && tick_sched_flag_test(ts, TS_FLAG_STOPPED)) { touch_softlockup_watchdog_sched(); if (is_idle_task(current)) ts->idle_jiffies++; /* * In case the current tick fired too early past its expected * expiration, make sure we don't bypass the next clock reprogramming * to the same deadline. */ ts->next_tick = 0; } update_process_times(user_mode(regs)); profile_tick(CPU_PROFILING); } /* * We rearm the timer until we get disabled by the idle code. * Called with interrupts disabled. */ static enum hrtimer_restart tick_nohz_handler(struct hrtimer *timer) { struct tick_sched *ts = container_of(timer, struct tick_sched, sched_timer); struct pt_regs *regs = get_irq_regs(); ktime_t now = ktime_get(); tick_sched_do_timer(ts, now); /* * Do not call when we are not in IRQ context and have * no valid 'regs' pointer */ if (regs) tick_sched_handle(ts, regs); else ts->next_tick = 0; /* * In dynticks mode, tick reprogram is deferred: * - to the idle task if in dynticks-idle * - to IRQ exit if in full-dynticks. */ if (unlikely(tick_sched_flag_test(ts, TS_FLAG_STOPPED))) return HRTIMER_NORESTART; hrtimer_forward(timer, now, TICK_NSEC); return HRTIMER_RESTART; } static void tick_sched_timer_cancel(struct tick_sched *ts) { if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES)) hrtimer_cancel(&ts->sched_timer); else if (tick_sched_flag_test(ts, TS_FLAG_NOHZ)) tick_program_event(KTIME_MAX, 1); } #ifdef CONFIG_NO_HZ_FULL cpumask_var_t tick_nohz_full_mask; EXPORT_SYMBOL_GPL(tick_nohz_full_mask); bool tick_nohz_full_running; EXPORT_SYMBOL_GPL(tick_nohz_full_running); static atomic_t tick_dep_mask; static bool check_tick_dependency(atomic_t *dep) { int val = atomic_read(dep); if (val & TICK_DEP_MASK_POSIX_TIMER) { trace_tick_stop(0, TICK_DEP_MASK_POSIX_TIMER); return true; } if (val & TICK_DEP_MASK_PERF_EVENTS) { trace_tick_stop(0, TICK_DEP_MASK_PERF_EVENTS); return true; } if (val & TICK_DEP_MASK_SCHED) { trace_tick_stop(0, TICK_DEP_MASK_SCHED); return true; } if (val & TICK_DEP_MASK_CLOCK_UNSTABLE) { trace_tick_stop(0, TICK_DEP_MASK_CLOCK_UNSTABLE); return true; } if (val & TICK_DEP_MASK_RCU) { trace_tick_stop(0, TICK_DEP_MASK_RCU); return true; } if (val & TICK_DEP_MASK_RCU_EXP) { trace_tick_stop(0, TICK_DEP_MASK_RCU_EXP); return true; } return false; } static bool can_stop_full_tick(int cpu, struct tick_sched *ts) { lockdep_assert_irqs_disabled(); if (unlikely(!cpu_online(cpu))) return false; if (check_tick_dependency(&tick_dep_mask)) return false; if (check_tick_dependency(&ts->tick_dep_mask)) return false; if (check_tick_dependency(¤t->tick_dep_mask)) return false; if (check_tick_dependency(¤t->signal->tick_dep_mask)) return false; return true; } static void nohz_full_kick_func(struct irq_work *work) { /* Empty, the tick restart happens on tick_nohz_irq_exit() */ } static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = IRQ_WORK_INIT_HARD(nohz_full_kick_func); /* * Kick this CPU if it's full dynticks in order to force it to * re-evaluate its dependency on the tick and restart it if necessary. * This kick, unlike tick_nohz_full_kick_cpu() and tick_nohz_full_kick_all(), * is NMI safe. */ static void tick_nohz_full_kick(void) { if (!tick_nohz_full_cpu(smp_processor_id())) return; irq_work_queue(this_cpu_ptr(&nohz_full_kick_work)); } /* * Kick the CPU if it's full dynticks in order to force it to * re-evaluate its dependency on the tick and restart it if necessary. */ void tick_nohz_full_kick_cpu(int cpu) { if (!tick_nohz_full_cpu(cpu)) return; irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu); } static void tick_nohz_kick_task(struct task_struct *tsk) { int cpu; /* * If the task is not running, run_posix_cpu_timers() * has nothing to elapse, and an IPI can then be optimized out. * * activate_task() STORE p->tick_dep_mask * STORE p->on_rq * __schedule() (switch to task 'p') smp_mb() (atomic_fetch_or()) * LOCK rq->lock LOAD p->on_rq * smp_mb__after_spin_lock() * tick_nohz_task_switch() * LOAD p->tick_dep_mask */ if (!sched_task_on_rq(tsk)) return; /* * If the task concurrently migrates to another CPU, * we guarantee it sees the new tick dependency upon * schedule. * * set_task_cpu(p, cpu); * STORE p->cpu = @cpu * __schedule() (switch to task 'p') * LOCK rq->lock * smp_mb__after_spin_lock() STORE p->tick_dep_mask * tick_nohz_task_switch() smp_mb() (atomic_fetch_or()) * LOAD p->tick_dep_mask LOAD p->cpu */ cpu = task_cpu(tsk); preempt_disable(); if (cpu_online(cpu)) tick_nohz_full_kick_cpu(cpu); preempt_enable(); } /* * Kick all full dynticks CPUs in order to force these to re-evaluate * their dependency on the tick and restart it if necessary. */ static void tick_nohz_full_kick_all(void) { int cpu; if (!tick_nohz_full_running) return; preempt_disable(); for_each_cpu_and(cpu, tick_nohz_full_mask, cpu_online_mask) tick_nohz_full_kick_cpu(cpu); preempt_enable(); } static void tick_nohz_dep_set_all(atomic_t *dep, enum tick_dep_bits bit) { int prev; prev = atomic_fetch_or(BIT(bit), dep); if (!prev) tick_nohz_full_kick_all(); } /* * Set a global tick dependency. Used by perf events that rely on freq and * unstable clocks. */ void tick_nohz_dep_set(enum tick_dep_bits bit) { tick_nohz_dep_set_all(&tick_dep_mask, bit); } void tick_nohz_dep_clear(enum tick_dep_bits bit) { atomic_andnot(BIT(bit), &tick_dep_mask); } /* * Set per-CPU tick dependency. Used by scheduler and perf events in order to * manage event-throttling. */ void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { int prev; struct tick_sched *ts; ts = per_cpu_ptr(&tick_cpu_sched, cpu); prev = atomic_fetch_or(BIT(bit), &ts->tick_dep_mask); if (!prev) { preempt_disable(); /* Perf needs local kick that is NMI safe */ if (cpu == smp_processor_id()) { tick_nohz_full_kick(); } else { /* Remote IRQ work not NMI-safe */ if (!WARN_ON_ONCE(in_nmi())) tick_nohz_full_kick_cpu(cpu); } preempt_enable(); } } EXPORT_SYMBOL_GPL(tick_nohz_dep_set_cpu); void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { struct tick_sched *ts = per_cpu_ptr(&tick_cpu_sched, cpu); atomic_andnot(BIT(bit), &ts->tick_dep_mask); } EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_cpu); /* * Set a per-task tick dependency. RCU needs this. Also posix CPU timers * in order to elapse per task timers. */ void tick_nohz_dep_set_task(struct task_struct *tsk, enum tick_dep_bits bit) { if (!atomic_fetch_or(BIT(bit), &tsk->tick_dep_mask)) tick_nohz_kick_task(tsk); } EXPORT_SYMBOL_GPL(tick_nohz_dep_set_task); void tick_nohz_dep_clear_task(struct task_struct *tsk, enum tick_dep_bits bit) { atomic_andnot(BIT(bit), &tsk->tick_dep_mask); } EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_task); /* * Set a per-taskgroup tick dependency. Posix CPU timers need this in order to elapse * per process timers. */ void tick_nohz_dep_set_signal(struct task_struct *tsk, enum tick_dep_bits bit) { int prev; struct signal_struct *sig = tsk->signal; prev = atomic_fetch_or(BIT(bit), &sig->tick_dep_mask); if (!prev) { struct task_struct *t; lockdep_assert_held(&tsk->sighand->siglock); __for_each_thread(sig, t) tick_nohz_kick_task(t); } } void tick_nohz_dep_clear_signal(struct signal_struct *sig, enum tick_dep_bits bit) { atomic_andnot(BIT(bit), &sig->tick_dep_mask); } /* * Re-evaluate the need for the tick as we switch the current task. * It might need the tick due to per task/process properties: * perf events, posix CPU timers, ... */ void __tick_nohz_task_switch(void) { struct tick_sched *ts; if (!tick_nohz_full_cpu(smp_processor_id())) return; ts = this_cpu_ptr(&tick_cpu_sched); if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) { if (atomic_read(¤t->tick_dep_mask) || atomic_read(¤t->signal->tick_dep_mask)) tick_nohz_full_kick(); } } /* Get the boot-time nohz CPU list from the kernel parameters. */ void __init tick_nohz_full_setup(cpumask_var_t cpumask) { alloc_bootmem_cpumask_var(&tick_nohz_full_mask); cpumask_copy(tick_nohz_full_mask, cpumask); tick_nohz_full_running = true; } bool tick_nohz_cpu_hotpluggable(unsigned int cpu) { /* * The 'tick_do_timer_cpu' CPU handles housekeeping duty (unbound * timers, workqueues, timekeeping, ...) on behalf of full dynticks * CPUs. It must remain online when nohz full is enabled. */ if (tick_nohz_full_running && tick_do_timer_cpu == cpu) return false; return true; } static int tick_nohz_cpu_down(unsigned int cpu) { return tick_nohz_cpu_hotpluggable(cpu) ? 0 : -EBUSY; } void __init tick_nohz_init(void) { int cpu, ret; if (!tick_nohz_full_running) return; /* * Full dynticks uses IRQ work to drive the tick rescheduling on safe * locking contexts. But then we need IRQ work to raise its own * interrupts to avoid circular dependency on the tick. */ if (!arch_irq_work_has_interrupt()) { pr_warn("NO_HZ: Can't run full dynticks because arch doesn't support IRQ work self-IPIs\n"); cpumask_clear(tick_nohz_full_mask); tick_nohz_full_running = false; return; } if (IS_ENABLED(CONFIG_PM_SLEEP_SMP) && !IS_ENABLED(CONFIG_PM_SLEEP_SMP_NONZERO_CPU)) { cpu = smp_processor_id(); if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) { pr_warn("NO_HZ: Clearing %d from nohz_full range " "for timekeeping\n", cpu); cpumask_clear_cpu(cpu, tick_nohz_full_mask); } } for_each_cpu(cpu, tick_nohz_full_mask) ct_cpu_track_user(cpu); ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "kernel/nohz:predown", NULL, tick_nohz_cpu_down); WARN_ON(ret < 0); pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n", cpumask_pr_args(tick_nohz_full_mask)); } #endif /* #ifdef CONFIG_NO_HZ_FULL */ /* * NOHZ - aka dynamic tick functionality */ #ifdef CONFIG_NO_HZ_COMMON /* * NO HZ enabled ? */ bool tick_nohz_enabled __read_mostly = true; unsigned long tick_nohz_active __read_mostly; /* * Enable / Disable tickless mode */ static int __init setup_tick_nohz(char *str) { return (kstrtobool(str, &tick_nohz_enabled) == 0); } __setup("nohz=", setup_tick_nohz); bool tick_nohz_tick_stopped(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); return tick_sched_flag_test(ts, TS_FLAG_STOPPED); } bool tick_nohz_tick_stopped_cpu(int cpu) { struct tick_sched *ts = per_cpu_ptr(&tick_cpu_sched, cpu); return tick_sched_flag_test(ts, TS_FLAG_STOPPED); } /** * tick_nohz_update_jiffies - update jiffies when idle was interrupted * * Called from interrupt entry when the CPU was idle * * In case the sched_tick was stopped on this CPU, we have to check if jiffies * must be updated. Otherwise an interrupt handler could use a stale jiffy * value. We do this unconditionally on any CPU, as we don't know whether the * CPU, which has the update task assigned, is in a long sleep. */ static void tick_nohz_update_jiffies(ktime_t now) { unsigned long flags; __this_cpu_write(tick_cpu_sched.idle_waketime, now); local_irq_save(flags); tick_do_update_jiffies64(now); local_irq_restore(flags); touch_softlockup_watchdog_sched(); } static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now) { ktime_t delta; if (WARN_ON_ONCE(!tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE))) return; delta = ktime_sub(now, ts->idle_entrytime); write_seqcount_begin(&ts->idle_sleeptime_seq); if (nr_iowait_cpu(smp_processor_id()) > 0) ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta); else ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta); ts->idle_entrytime = now; tick_sched_flag_clear(ts, TS_FLAG_IDLE_ACTIVE); write_seqcount_end(&ts->idle_sleeptime_seq); sched_clock_idle_wakeup_event(); } static void tick_nohz_start_idle(struct tick_sched *ts) { write_seqcount_begin(&ts->idle_sleeptime_seq); ts->idle_entrytime = ktime_get(); tick_sched_flag_set(ts, TS_FLAG_IDLE_ACTIVE); write_seqcount_end(&ts->idle_sleeptime_seq); sched_clock_idle_sleep_event(); } static u64 get_cpu_sleep_time_us(struct tick_sched *ts, ktime_t *sleeptime, bool compute_delta, u64 *last_update_time) { ktime_t now, idle; unsigned int seq; if (!tick_nohz_active) return -1; now = ktime_get(); if (last_update_time) *last_update_time = ktime_to_us(now); do { seq = read_seqcount_begin(&ts->idle_sleeptime_seq); if (tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE) && compute_delta) { ktime_t delta = ktime_sub(now, ts->idle_entrytime); idle = ktime_add(*sleeptime, delta); } else { idle = *sleeptime; } } while (read_seqcount_retry(&ts->idle_sleeptime_seq, seq)); return ktime_to_us(idle); } /** * get_cpu_idle_time_us - get the total idle time of a CPU * @cpu: CPU number to query * @last_update_time: variable to store update time in. Do not update * counters if NULL. * * Return the cumulative idle time (since boot) for a given * CPU, in microseconds. Note that this is partially broken due to * the counter of iowait tasks that can be remotely updated without * any synchronization. Therefore it is possible to observe backward * values within two consecutive reads. * * This time is measured via accounting rather than sampling, * and is as accurate as ktime_get() is. * * This function returns -1 if NOHZ is not enabled. */ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time) { struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); return get_cpu_sleep_time_us(ts, &ts->idle_sleeptime, !nr_iowait_cpu(cpu), last_update_time); } EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); /** * get_cpu_iowait_time_us - get the total iowait time of a CPU * @cpu: CPU number to query * @last_update_time: variable to store update time in. Do not update * counters if NULL. * * Return the cumulative iowait time (since boot) for a given * CPU, in microseconds. Note this is partially broken due to * the counter of iowait tasks that can be remotely updated without * any synchronization. Therefore it is possible to observe backward * values within two consecutive reads. * * This time is measured via accounting rather than sampling, * and is as accurate as ktime_get() is. * * This function returns -1 if NOHZ is not enabled. */ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time) { struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); return get_cpu_sleep_time_us(ts, &ts->iowait_sleeptime, nr_iowait_cpu(cpu), last_update_time); } EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us); static void tick_nohz_restart(struct tick_sched *ts, ktime_t now) { hrtimer_cancel(&ts->sched_timer); hrtimer_set_expires(&ts->sched_timer, ts->last_tick); /* Forward the time to expire in the future */ hrtimer_forward(&ts->sched_timer, now, TICK_NSEC); if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES)) { hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED_HARD); } else { tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1); } /* * Reset to make sure the next tick stop doesn't get fooled by past * cached clock deadline. */ ts->next_tick = 0; } static inline bool local_timer_softirq_pending(void) { return local_softirq_pending() & BIT(TIMER_SOFTIRQ); } /* * Read jiffies and the time when jiffies were updated last */ u64 get_jiffies_update(unsigned long *basej) { unsigned long basejiff; unsigned int seq; u64 basemono; do { seq = read_seqcount_begin(&jiffies_seq); basemono = last_jiffies_update; basejiff = jiffies; } while (read_seqcount_retry(&jiffies_seq, seq)); *basej = basejiff; return basemono; } /** * tick_nohz_next_event() - return the clock monotonic based next event * @ts: pointer to tick_sched struct * @cpu: CPU number * * Return: * *%0 - When the next event is a maximum of TICK_NSEC in the future * and the tick is not stopped yet * *%next_event - Next event based on clock monotonic */ static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu) { u64 basemono, next_tick, delta, expires; unsigned long basejiff; basemono = get_jiffies_update(&basejiff); ts->last_jiffies = basejiff; ts->timer_expires_base = basemono; /* * Keep the periodic tick, when RCU, architecture or irq_work * requests it. * Aside of that, check whether the local timer softirq is * pending. If so, its a bad idea to call get_next_timer_interrupt(), * because there is an already expired timer, so it will request * immediate expiry, which rearms the hardware timer with a * minimal delta, which brings us back to this place * immediately. Lather, rinse and repeat... */ if (rcu_needs_cpu() || arch_needs_cpu() || irq_work_needs_cpu() || local_timer_softirq_pending()) { next_tick = basemono + TICK_NSEC; } else { /* * Get the next pending timer. If high resolution * timers are enabled this only takes the timer wheel * timers into account. If high resolution timers are * disabled this also looks at the next expiring * hrtimer. */ next_tick = get_next_timer_interrupt(basejiff, basemono); ts->next_timer = next_tick; } /* Make sure next_tick is never before basemono! */ if (WARN_ON_ONCE(basemono > next_tick)) next_tick = basemono; /* * If the tick is due in the next period, keep it ticking or * force prod the timer. */ delta = next_tick - basemono; if (delta <= (u64)TICK_NSEC) { /* * We've not stopped the tick yet, and there's a timer in the * next period, so no point in stopping it either, bail. */ if (!tick_sched_flag_test(ts, TS_FLAG_STOPPED)) { ts->timer_expires = 0; goto out; } } /* * If this CPU is the one which had the do_timer() duty last, we limit * the sleep time to the timekeeping 'max_deferment' value. * Otherwise we can sleep as long as we want. */ delta = timekeeping_max_deferment(); if (cpu != tick_do_timer_cpu && (tick_do_timer_cpu != TICK_DO_TIMER_NONE || !tick_sched_flag_test(ts, TS_FLAG_DO_TIMER_LAST))) delta = KTIME_MAX; /* Calculate the next expiry time */ if (delta < (KTIME_MAX - basemono)) expires = basemono + delta; else expires = KTIME_MAX; ts->timer_expires = min_t(u64, expires, next_tick); out: return ts->timer_expires; } static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu) { struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev); unsigned long basejiff = ts->last_jiffies; u64 basemono = ts->timer_expires_base; bool timer_idle = tick_sched_flag_test(ts, TS_FLAG_STOPPED); u64 expires; /* Make sure we won't be trying to stop it twice in a row. */ ts->timer_expires_base = 0; /* * Now the tick should be stopped definitely - so the timer base needs * to be marked idle as well to not miss a newly queued timer. */ expires = timer_base_try_to_set_idle(basejiff, basemono, &timer_idle); if (expires > ts->timer_expires) { /* * This path could only happen when the first timer was removed * between calculating the possible sleep length and now (when * high resolution mode is not active, timer could also be a * hrtimer). * * We have to stick to the original calculated expiry value to * not stop the tick for too long with a shallow C-state (which * was programmed by cpuidle because of an early next expiration * value). */ expires = ts->timer_expires; } /* If the timer base is not idle, retain the not yet stopped tick. */ if (!timer_idle) return; /* * If this CPU is the one which updates jiffies, then give up * the assignment and let it be taken by the CPU which runs * the tick timer next, which might be this CPU as well. If we * don't drop this here, the jiffies might be stale and * do_timer() never gets invoked. Keep track of the fact that it * was the one which had the do_timer() duty last. */ if (cpu == tick_do_timer_cpu) { tick_do_timer_cpu = TICK_DO_TIMER_NONE; tick_sched_flag_set(ts, TS_FLAG_DO_TIMER_LAST); } else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) { tick_sched_flag_clear(ts, TS_FLAG_DO_TIMER_LAST); } /* Skip reprogram of event if it's not changed */ if (tick_sched_flag_test(ts, TS_FLAG_STOPPED) && (expires == ts->next_tick)) { /* Sanity check: make sure clockevent is actually programmed */ if (expires == KTIME_MAX || ts->next_tick == hrtimer_get_expires(&ts->sched_timer)) return; WARN_ON_ONCE(1); printk_once("basemono: %llu ts->next_tick: %llu dev->next_event: %llu timer->active: %d timer->expires: %llu\n", basemono, ts->next_tick, dev->next_event, hrtimer_active(&ts->sched_timer), hrtimer_get_expires(&ts->sched_timer)); } /* * tick_nohz_stop_tick() can be called several times before * tick_nohz_restart_sched_tick() is called. This happens when * interrupts arrive which do not cause a reschedule. In the first * call we save the current tick time, so we can restart the * scheduler tick in tick_nohz_restart_sched_tick(). */ if (!tick_sched_flag_test(ts, TS_FLAG_STOPPED)) { calc_load_nohz_start(); quiet_vmstat(); ts->last_tick = hrtimer_get_expires(&ts->sched_timer); tick_sched_flag_set(ts, TS_FLAG_STOPPED); trace_tick_stop(1, TICK_DEP_MASK_NONE); } ts->next_tick = expires; /* * If the expiration time == KTIME_MAX, then we simply stop * the tick timer. */ if (unlikely(expires == KTIME_MAX)) { tick_sched_timer_cancel(ts); return; } if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES)) { hrtimer_start(&ts->sched_timer, expires, HRTIMER_MODE_ABS_PINNED_HARD); } else { hrtimer_set_expires(&ts->sched_timer, expires); tick_program_event(expires, 1); } } static void tick_nohz_retain_tick(struct tick_sched *ts) { ts->timer_expires_base = 0; } #ifdef CONFIG_NO_HZ_FULL static void tick_nohz_full_stop_tick(struct tick_sched *ts, int cpu) { if (tick_nohz_next_event(ts, cpu)) tick_nohz_stop_tick(ts, cpu); else tick_nohz_retain_tick(ts); } #endif /* CONFIG_NO_HZ_FULL */ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now) { /* Update jiffies first */ tick_do_update_jiffies64(now); /* * Clear the timer idle flag, so we avoid IPIs on remote queueing and * the clock forward checks in the enqueue path: */ timer_clear_idle(); calc_load_nohz_stop(); touch_softlockup_watchdog_sched(); /* Cancel the scheduled timer and restore the tick: */ tick_sched_flag_clear(ts, TS_FLAG_STOPPED); tick_nohz_restart(ts, now); } static void __tick_nohz_full_update_tick(struct tick_sched *ts, ktime_t now) { #ifdef CONFIG_NO_HZ_FULL int cpu = smp_processor_id(); if (can_stop_full_tick(cpu, ts)) tick_nohz_full_stop_tick(ts, cpu); else if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) tick_nohz_restart_sched_tick(ts, now); #endif } static void tick_nohz_full_update_tick(struct tick_sched *ts) { if (!tick_nohz_full_cpu(smp_processor_id())) return; if (!tick_sched_flag_test(ts, TS_FLAG_NOHZ)) return; __tick_nohz_full_update_tick(ts, ktime_get()); } /* * A pending softirq outside an IRQ (or softirq disabled section) context * should be waiting for ksoftirqd to handle it. Therefore we shouldn't * reach this code due to the need_resched() early check in can_stop_idle_tick(). * * However if we are between CPUHP_AP_SMPBOOT_THREADS and CPU_TEARDOWN_CPU on the * cpu_down() process, softirqs can still be raised while ksoftirqd is parked, * triggering the code below, since wakep_softirqd() is ignored. * */ static bool report_idle_softirq(void) { static int ratelimit; unsigned int pending = local_softirq_pending(); if (likely(!pending)) return false; /* Some softirqs claim to be safe against hotplug and ksoftirqd parking */ if (!cpu_active(smp_processor_id())) { pending &= ~SOFTIRQ_HOTPLUG_SAFE_MASK; if (!pending) return false; } if (ratelimit >= 10) return false; /* On RT, softirq handling may be waiting on some lock */ if (local_bh_blocked()) return false; pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n", pending); ratelimit++; return true; } static bool can_stop_idle_tick(int cpu, struct tick_sched *ts) { WARN_ON_ONCE(cpu_is_offline(cpu)); if (unlikely(!tick_sched_flag_test(ts, TS_FLAG_NOHZ))) return false; if (need_resched()) return false; if (unlikely(report_idle_softirq())) return false; if (tick_nohz_full_enabled()) { /* * Keep the tick alive to guarantee timekeeping progression * if there are full dynticks CPUs around */ if (tick_do_timer_cpu == cpu) return false; /* Should not happen for nohz-full */ if (WARN_ON_ONCE(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) return false; } return true; } /** * tick_nohz_idle_stop_tick - stop the idle tick from the idle task * * When the next event is more than a tick into the future, stop the idle tick */ void tick_nohz_idle_stop_tick(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); int cpu = smp_processor_id(); ktime_t expires; /* * If tick_nohz_get_sleep_length() ran tick_nohz_next_event(), the * tick timer expiration time is known already. */ if (ts->timer_expires_base) expires = ts->timer_expires; else if (can_stop_idle_tick(cpu, ts)) expires = tick_nohz_next_event(ts, cpu); else return; ts->idle_calls++; if (expires > 0LL) { int was_stopped = tick_sched_flag_test(ts, TS_FLAG_STOPPED); tick_nohz_stop_tick(ts, cpu); ts->idle_sleeps++; ts->idle_expires = expires; if (!was_stopped && tick_sched_flag_test(ts, TS_FLAG_STOPPED)) { ts->idle_jiffies = ts->last_jiffies; nohz_balance_enter_idle(cpu); } } else { tick_nohz_retain_tick(ts); } } void tick_nohz_idle_retain_tick(void) { tick_nohz_retain_tick(this_cpu_ptr(&tick_cpu_sched)); } /** * tick_nohz_idle_enter - prepare for entering idle on the current CPU * * Called when we start the idle loop. */ void tick_nohz_idle_enter(void) { struct tick_sched *ts; lockdep_assert_irqs_enabled(); local_irq_disable(); ts = this_cpu_ptr(&tick_cpu_sched); WARN_ON_ONCE(ts->timer_expires_base); tick_sched_flag_set(ts, TS_FLAG_INIDLE); tick_nohz_start_idle(ts); local_irq_enable(); } /** * tick_nohz_irq_exit - Notify the tick about IRQ exit * * A timer may have been added/modified/deleted either by the current IRQ, * or by another place using this IRQ as a notification. This IRQ may have * also updated the RCU callback list. These events may require a * re-evaluation of the next tick. Depending on the context: * * 1) If the CPU is idle and no resched is pending, just proceed with idle * time accounting. The next tick will be re-evaluated on the next idle * loop iteration. * * 2) If the CPU is nohz_full: * * 2.1) If there is any tick dependency, restart the tick if stopped. * * 2.2) If there is no tick dependency, (re-)evaluate the next tick and * stop/update it accordingly. */ void tick_nohz_irq_exit(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); if (tick_sched_flag_test(ts, TS_FLAG_INIDLE)) tick_nohz_start_idle(ts); else tick_nohz_full_update_tick(ts); } /** * tick_nohz_idle_got_tick - Check whether or not the tick handler has run */ bool tick_nohz_idle_got_tick(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); if (ts->got_idle_tick) { ts->got_idle_tick = 0; return true; } return false; } /** * tick_nohz_get_next_hrtimer - return the next expiration time for the hrtimer * or the tick, whichever expires first. Note that, if the tick has been * stopped, it returns the next hrtimer. * * Called from power state control code with interrupts disabled */ ktime_t tick_nohz_get_next_hrtimer(void) { return __this_cpu_read(tick_cpu_device.evtdev)->next_event; } /** * tick_nohz_get_sleep_length - return the expected length of the current sleep * @delta_next: duration until the next event if the tick cannot be stopped * * Called from power state control code with interrupts disabled. * * The return value of this function and/or the value returned by it through the * @delta_next pointer can be negative which must be taken into account by its * callers. */ ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next) { struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev); struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); int cpu = smp_processor_id(); /* * The idle entry time is expected to be a sufficient approximation of * the current time at this point. */ ktime_t now = ts->idle_entrytime; ktime_t next_event; WARN_ON_ONCE(!tick_sched_flag_test(ts, TS_FLAG_INIDLE)); *delta_next = ktime_sub(dev->next_event, now); if (!can_stop_idle_tick(cpu, ts)) return *delta_next; next_event = tick_nohz_next_event(ts, cpu); if (!next_event) return *delta_next; /* * If the next highres timer to expire is earlier than 'next_event', the * idle governor needs to know that. */ next_event = min_t(u64, next_event, hrtimer_next_event_without(&ts->sched_timer)); return ktime_sub(next_event, now); } /** * tick_nohz_get_idle_calls_cpu - return the current idle calls counter value * for a particular CPU. * * Called from the schedutil frequency scaling governor in scheduler context. */ unsigned long tick_nohz_get_idle_calls_cpu(int cpu) { struct tick_sched *ts = tick_get_tick_sched(cpu); return ts->idle_calls; } /** * tick_nohz_get_idle_calls - return the current idle calls counter value * * Called from the schedutil frequency scaling governor in scheduler context. */ unsigned long tick_nohz_get_idle_calls(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); return ts->idle_calls; } static void tick_nohz_account_idle_time(struct tick_sched *ts, ktime_t now) { unsigned long ticks; ts->idle_exittime = now; if (vtime_accounting_enabled_this_cpu()) return; /* * We stopped the tick in idle. update_process_times() would miss the * time we slept, as it does only a 1 tick accounting. * Enforce that this is accounted to idle ! */ ticks = jiffies - ts->idle_jiffies; /* * We might be one off. Do not randomly account a huge number of ticks! */ if (ticks && ticks < LONG_MAX) account_idle_ticks(ticks); } void tick_nohz_idle_restart_tick(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) { ktime_t now = ktime_get(); tick_nohz_restart_sched_tick(ts, now); tick_nohz_account_idle_time(ts, now); } } static void tick_nohz_idle_update_tick(struct tick_sched *ts, ktime_t now) { if (tick_nohz_full_cpu(smp_processor_id())) __tick_nohz_full_update_tick(ts, now); else tick_nohz_restart_sched_tick(ts, now); tick_nohz_account_idle_time(ts, now); } /** * tick_nohz_idle_exit - Update the tick upon idle task exit * * When the idle task exits, update the tick depending on the * following situations: * * 1) If the CPU is not in nohz_full mode (most cases), then * restart the tick. * * 2) If the CPU is in nohz_full mode (corner case): * 2.1) If the tick can be kept stopped (no tick dependencies) * then re-evaluate the next tick and try to keep it stopped * as long as possible. * 2.2) If the tick has dependencies, restart the tick. * */ void tick_nohz_idle_exit(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); bool idle_active, tick_stopped; ktime_t now; local_irq_disable(); WARN_ON_ONCE(!tick_sched_flag_test(ts, TS_FLAG_INIDLE)); WARN_ON_ONCE(ts->timer_expires_base); tick_sched_flag_clear(ts, TS_FLAG_INIDLE); idle_active = tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE); tick_stopped = tick_sched_flag_test(ts, TS_FLAG_STOPPED); if (idle_active || tick_stopped) now = ktime_get(); if (idle_active) tick_nohz_stop_idle(ts, now); if (tick_stopped) tick_nohz_idle_update_tick(ts, now); local_irq_enable(); } /* * In low-resolution mode, the tick handler must be implemented directly * at the clockevent level. hrtimer can't be used instead, because its * infrastructure actually relies on the tick itself as a backend in * low-resolution mode (see hrtimer_run_queues()). */ static void tick_nohz_lowres_handler(struct clock_event_device *dev) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); dev->next_event = KTIME_MAX; if (likely(tick_nohz_handler(&ts->sched_timer) == HRTIMER_RESTART)) tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1); } static inline void tick_nohz_activate(struct tick_sched *ts) { if (!tick_nohz_enabled) return; tick_sched_flag_set(ts, TS_FLAG_NOHZ); /* One update is enough */ if (!test_and_set_bit(0, &tick_nohz_active)) timers_update_nohz(); } /** * tick_nohz_switch_to_nohz - switch to NOHZ mode */ static void tick_nohz_switch_to_nohz(void) { if (!tick_nohz_enabled) return; if (tick_switch_to_oneshot(tick_nohz_lowres_handler)) return; /* * Recycle the hrtimer in 'ts', so we can share the * highres code. */ tick_setup_sched_timer(false); } static inline void tick_nohz_irq_enter(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); ktime_t now; if (!tick_sched_flag_test(ts, TS_FLAG_STOPPED | TS_FLAG_IDLE_ACTIVE)) return; now = ktime_get(); if (tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE)) tick_nohz_stop_idle(ts, now); /* * If all CPUs are idle we may need to update a stale jiffies value. * Note nohz_full is a special case: a timekeeper is guaranteed to stay * alive but it might be busy looping with interrupts disabled in some * rare case (typically stop machine). So we must make sure we have a * last resort. */ if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) tick_nohz_update_jiffies(now); } #else static inline void tick_nohz_switch_to_nohz(void) { } static inline void tick_nohz_irq_enter(void) { } static inline void tick_nohz_activate(struct tick_sched *ts) { } #endif /* CONFIG_NO_HZ_COMMON */ /* * Called from irq_enter() to notify about the possible interruption of idle() */ void tick_irq_enter(void) { tick_check_oneshot_broadcast_this_cpu(); tick_nohz_irq_enter(); } static int sched_skew_tick; static int __init skew_tick(char *str) { get_option(&str, &sched_skew_tick); return 0; } early_param("skew_tick", skew_tick); /** * tick_setup_sched_timer - setup the tick emulation timer * @mode: tick_nohz_mode to setup for */ void tick_setup_sched_timer(bool hrtimer) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); /* Emulate tick processing via per-CPU hrtimers: */ hrtimer_init(&ts->sched_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD); if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && hrtimer) { tick_sched_flag_set(ts, TS_FLAG_HIGHRES); ts->sched_timer.function = tick_nohz_handler; } /* Get the next period (per-CPU) */ hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update()); /* Offset the tick to avert 'jiffies_lock' contention. */ if (sched_skew_tick) { u64 offset = TICK_NSEC >> 1; do_div(offset, num_possible_cpus()); offset *= smp_processor_id(); hrtimer_add_expires_ns(&ts->sched_timer, offset); } hrtimer_forward_now(&ts->sched_timer, TICK_NSEC); if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && hrtimer) hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED_HARD); else tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1); tick_nohz_activate(ts); } /* * Shut down the tick and make sure the CPU won't try to retake the timekeeping * duty before disabling IRQs in idle for the last time. */ void tick_sched_timer_dying(int cpu) { struct tick_device *td = &per_cpu(tick_cpu_device, cpu); struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); struct clock_event_device *dev = td->evtdev; ktime_t idle_sleeptime, iowait_sleeptime; unsigned long idle_calls, idle_sleeps; /* This must happen before hrtimers are migrated! */ tick_sched_timer_cancel(ts); /* * If the clockevents doesn't support CLOCK_EVT_STATE_ONESHOT_STOPPED, * make sure not to call low-res tick handler. */ if (tick_sched_flag_test(ts, TS_FLAG_NOHZ)) dev->event_handler = clockevents_handle_noop; idle_sleeptime = ts->idle_sleeptime; iowait_sleeptime = ts->iowait_sleeptime; idle_calls = ts->idle_calls; idle_sleeps = ts->idle_sleeps; memset(ts, 0, sizeof(*ts)); ts->idle_sleeptime = idle_sleeptime; ts->iowait_sleeptime = iowait_sleeptime; ts->idle_calls = idle_calls; ts->idle_sleeps = idle_sleeps; } /* * Async notification about clocksource changes */ void tick_clock_notify(void) { int cpu; for_each_possible_cpu(cpu) set_bit(0, &per_cpu(tick_cpu_sched, cpu).check_clocks); } /* * Async notification about clock event changes */ void tick_oneshot_notify(void) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); set_bit(0, &ts->check_clocks); } /* * Check if a change happened, which makes oneshot possible. * * Called cyclically from the hrtimer softirq (driven by the timer * softirq). 'allow_nohz' signals that we can switch into low-res NOHZ * mode, because high resolution timers are disabled (either compile * or runtime). Called with interrupts disabled. */ int tick_check_oneshot_change(int allow_nohz) { struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched); if (!test_and_clear_bit(0, &ts->check_clocks)) return 0; if (tick_sched_flag_test(ts, TS_FLAG_NOHZ)) return 0; if (!timekeeping_valid_for_hres() || !tick_is_oneshot_available()) return 0; if (!allow_nohz) return 1; tick_nohz_switch_to_nohz(); return 0; } |
28 28 28 6 6 6 6 6 6 6 6 6 6 6 6 1 22 7 18 23 22 1 7 18 20 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 | // SPDX-License-Identifier: GPL-2.0 /* * linux/mm/page_io.c * * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds * * Swap reorganised 29.12.95, * Asynchronous swapping added 30.12.95. Stephen Tweedie * Removed race in async swapping. 14.4.1996. Bruno Haible * Add swap of shared pages through the page cache. 20.2.1998. Stephen Tweedie * Always use brw_page, life becomes simpler. 12 May 1998 Eric Biederman */ #include <linux/mm.h> #include <linux/kernel_stat.h> #include <linux/gfp.h> #include <linux/pagemap.h> #include <linux/swap.h> #include <linux/bio.h> #include <linux/swapops.h> #include <linux/writeback.h> #include <linux/blkdev.h> #include <linux/psi.h> #include <linux/uio.h> #include <linux/sched/task.h> #include <linux/delayacct.h> #include <linux/zswap.h> #include "swap.h" static void __end_swap_bio_write(struct bio *bio) { struct folio *folio = bio_first_folio_all(bio); if (bio->bi_status) { /* * We failed to write the page out to swap-space. * Re-dirty the page in order to avoid it being reclaimed. * Also print a dire warning that things will go BAD (tm) * very quickly. * * Also clear PG_reclaim to avoid folio_rotate_reclaimable() */ folio_mark_dirty(folio); pr_alert_ratelimited("Write-error on swap-device (%u:%u:%llu)\n", MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)), (unsigned long long)bio->bi_iter.bi_sector); folio_clear_reclaim(folio); } folio_end_writeback(folio); } static void end_swap_bio_write(struct bio *bio) { __end_swap_bio_write(bio); bio_put(bio); } static void __end_swap_bio_read(struct bio *bio) { struct folio *folio = bio_first_folio_all(bio); if (bio->bi_status) { pr_alert_ratelimited("Read-error on swap-device (%u:%u:%llu)\n", MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)), (unsigned long long)bio->bi_iter.bi_sector); } else { folio_mark_uptodate(folio); } folio_unlock(folio); } static void end_swap_bio_read(struct bio *bio) { __end_swap_bio_read(bio); bio_put(bio); } int generic_swapfile_activate(struct swap_info_struct *sis, struct file *swap_file, sector_t *span) { struct address_space *mapping = swap_file->f_mapping; struct inode *inode = mapping->host; unsigned blocks_per_page; unsigned long page_no; unsigned blkbits; sector_t probe_block; sector_t last_block; sector_t lowest_block = -1; sector_t highest_block = 0; int nr_extents = 0; int ret; blkbits = inode->i_blkbits; blocks_per_page = PAGE_SIZE >> blkbits; /* * Map all the blocks into the extent tree. This code doesn't try * to be very smart. */ probe_block = 0; page_no = 0; last_block = i_size_read(inode) >> blkbits; while ((probe_block + blocks_per_page) <= last_block && page_no < sis->max) { unsigned block_in_page; sector_t first_block; cond_resched(); first_block = probe_block; ret = bmap(inode, &first_block); if (ret || !first_block) goto bad_bmap; /* * It must be PAGE_SIZE aligned on-disk */ if (first_block & (blocks_per_page - 1)) { probe_block++; goto reprobe; } for (block_in_page = 1; block_in_page < blocks_per_page; block_in_page++) { sector_t block; block = probe_block + block_in_page; ret = bmap(inode, &block); if (ret || !block) goto bad_bmap; if (block != first_block + block_in_page) { /* Discontiguity */ probe_block++; goto reprobe; } } first_block >>= (PAGE_SHIFT - blkbits); if (page_no) { /* exclude the header page */ if (first_block < lowest_block) lowest_block = first_block; if (first_block > highest_block) highest_block = first_block; } /* * We found a PAGE_SIZE-length, PAGE_SIZE-aligned run of blocks */ ret = add_swap_extent(sis, page_no, 1, first_block); if (ret < 0) goto out; nr_extents += ret; page_no++; probe_block += blocks_per_page; reprobe: continue; } ret = nr_extents; *span = 1 + highest_block - lowest_block; if (page_no == 0) page_no = 1; /* force Empty message */ sis->max = page_no; sis->pages = page_no - 1; sis->highest_bit = page_no - 1; out: return ret; bad_bmap: pr_err("swapon: swapfile has holes\n"); ret = -EINVAL; goto out; } /* * We may have stale swap cache pages in memory: notice * them here and get rid of the unnecessary final write. */ int swap_writepage(struct page *page, struct writeback_control *wbc) { struct folio *folio = page_folio(page); int ret; if (folio_free_swap(folio)) { folio_unlock(folio); return 0; } /* * Arch code may have to preserve more data than just the page * contents, e.g. memory tags. */ ret = arch_prepare_to_swap(&folio->page); if (ret) { folio_mark_dirty(folio); folio_unlock(folio); return ret; } if (zswap_store(folio)) { folio_start_writeback(folio); folio_unlock(folio); folio_end_writeback(folio); return 0; } if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) { folio_mark_dirty(folio); return AOP_WRITEPAGE_ACTIVATE; } __swap_writepage(folio, wbc); return 0; } static inline void count_swpout_vm_event(struct folio *folio) { #ifdef CONFIG_TRANSPARENT_HUGEPAGE if (unlikely(folio_test_pmd_mappable(folio))) { count_memcg_folio_events(folio, THP_SWPOUT, 1); count_vm_event(THP_SWPOUT); } #endif count_vm_events(PSWPOUT, folio_nr_pages(folio)); } #if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP) static void bio_associate_blkg_from_page(struct bio *bio, struct folio *folio) { struct cgroup_subsys_state *css; struct mem_cgroup *memcg; memcg = folio_memcg(folio); if (!memcg) return; rcu_read_lock(); css = cgroup_e_css(memcg->css.cgroup, &io_cgrp_subsys); bio_associate_blkg_from_css(bio, css); rcu_read_unlock(); } #else #define bio_associate_blkg_from_page(bio, folio) do { } while (0) #endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */ struct swap_iocb { struct kiocb iocb; struct bio_vec bvec[SWAP_CLUSTER_MAX]; int pages; int len; }; static mempool_t *sio_pool; int sio_pool_init(void) { if (!sio_pool) { mempool_t *pool = mempool_create_kmalloc_pool( SWAP_CLUSTER_MAX, sizeof(struct swap_iocb)); if (cmpxchg(&sio_pool, NULL, pool)) mempool_destroy(pool); } if (!sio_pool) return -ENOMEM; return 0; } static void sio_write_complete(struct kiocb *iocb, long ret) { struct swap_iocb *sio = container_of(iocb, struct swap_iocb, iocb); struct page *page = sio->bvec[0].bv_page; int p; if (ret != sio->len) { /* * In the case of swap-over-nfs, this can be a * temporary failure if the system has limited * memory for allocating transmit buffers. * Mark the page dirty and avoid * folio_rotate_reclaimable but rate-limit the * messages but do not flag PageError like * the normal direct-to-bio case as it could * be temporary. */ pr_err_ratelimited("Write error %ld on dio swapfile (%llu)\n", ret, page_file_offset(page)); for (p = 0; p < sio->pages; p++) { page = sio->bvec[p].bv_page; set_page_dirty(page); ClearPageReclaim(page); } } for (p = 0; p < sio->pages; p++) end_page_writeback(sio->bvec[p].bv_page); mempool_free(sio, sio_pool); } static void swap_writepage_fs(struct folio *folio, struct writeback_control *wbc) { struct swap_iocb *sio = NULL; struct swap_info_struct *sis = swp_swap_info(folio->swap); struct file *swap_file = sis->swap_file; loff_t pos = folio_file_pos(folio); count_swpout_vm_event(folio); folio_start_writeback(folio); folio_unlock(folio); if (wbc->swap_plug) sio = *wbc->swap_plug; if (sio) { if (sio->iocb.ki_filp != swap_file || sio->iocb.ki_pos + sio->len != pos) { swap_write_unplug(sio); sio = NULL; } } if (!sio) { sio = mempool_alloc(sio_pool, GFP_NOIO); init_sync_kiocb(&sio->iocb, swap_file); sio->iocb.ki_complete = sio_write_complete; sio->iocb.ki_pos = pos; sio->pages = 0; sio->len = 0; } bvec_set_folio(&sio->bvec[sio->pages], folio, folio_size(folio), 0); sio->len += folio_size(folio); sio->pages += 1; if (sio->pages == ARRAY_SIZE(sio->bvec) || !wbc->swap_plug) { swap_write_unplug(sio); sio = NULL; } if (wbc->swap_plug) *wbc->swap_plug = sio; } static void swap_writepage_bdev_sync(struct folio *folio, struct writeback_control *wbc, struct swap_info_struct *sis) { struct bio_vec bv; struct bio bio; bio_init(&bio, sis->bdev, &bv, 1, REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc)); bio.bi_iter.bi_sector = swap_folio_sector(folio); bio_add_folio_nofail(&bio, folio, folio_size(folio), 0); bio_associate_blkg_from_page(&bio, folio); count_swpout_vm_event(folio); folio_start_writeback(folio); folio_unlock(folio); submit_bio_wait(&bio); __end_swap_bio_write(&bio); } static void swap_writepage_bdev_async(struct folio *folio, struct writeback_control *wbc, struct swap_info_struct *sis) { struct bio *bio; bio = bio_alloc(sis->bdev, 1, REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc), GFP_NOIO); bio->bi_iter.bi_sector = swap_folio_sector(folio); bio->bi_end_io = end_swap_bio_write; bio_add_folio_nofail(bio, folio, folio_size(folio), 0); bio_associate_blkg_from_page(bio, folio); count_swpout_vm_event(folio); folio_start_writeback(folio); folio_unlock(folio); submit_bio(bio); } void __swap_writepage(struct folio *folio, struct writeback_control *wbc) { struct swap_info_struct *sis = swp_swap_info(folio->swap); VM_BUG_ON_FOLIO(!folio_test_swapcache(folio), folio); /* * ->flags can be updated non-atomicially (scan_swap_map_slots), * but that will never affect SWP_FS_OPS, so the data_race * is safe. */ if (data_race(sis->flags & SWP_FS_OPS)) swap_writepage_fs(folio, wbc); else if (sis->flags & SWP_SYNCHRONOUS_IO) swap_writepage_bdev_sync(folio, wbc, sis); else swap_writepage_bdev_async(folio, wbc, sis); } void swap_write_unplug(struct swap_iocb *sio) { struct iov_iter from; struct address_space *mapping = sio->iocb.ki_filp->f_mapping; int ret; iov_iter_bvec(&from, ITER_SOURCE, sio->bvec, sio->pages, sio->len); ret = mapping->a_ops->swap_rw(&sio->iocb, &from); if (ret != -EIOCBQUEUED) sio_write_complete(&sio->iocb, ret); } static void sio_read_complete(struct kiocb *iocb, long ret) { struct swap_iocb *sio = container_of(iocb, struct swap_iocb, iocb); int p; if (ret == sio->len) { for (p = 0; p < sio->pages; p++) { struct folio *folio = page_folio(sio->bvec[p].bv_page); folio_mark_uptodate(folio); folio_unlock(folio); } count_vm_events(PSWPIN, sio->pages); } else { for (p = 0; p < sio->pages; p++) { struct folio *folio = page_folio(sio->bvec[p].bv_page); folio_unlock(folio); } pr_alert_ratelimited("Read-error on swap-device\n"); } mempool_free(sio, sio_pool); } static void swap_read_folio_fs(struct folio *folio, struct swap_iocb **plug) { struct swap_info_struct *sis = swp_swap_info(folio->swap); struct swap_iocb *sio = NULL; loff_t pos = folio_file_pos(folio); if (plug) sio = *plug; if (sio) { if (sio->iocb.ki_filp != sis->swap_file || sio->iocb.ki_pos + sio->len != pos) { swap_read_unplug(sio); sio = NULL; } } if (!sio) { sio = mempool_alloc(sio_pool, GFP_KERNEL); init_sync_kiocb(&sio->iocb, sis->swap_file); sio->iocb.ki_pos = pos; sio->iocb.ki_complete = sio_read_complete; sio->pages = 0; sio->len = 0; } bvec_set_folio(&sio->bvec[sio->pages], folio, folio_size(folio), 0); sio->len += folio_size(folio); sio->pages += 1; if (sio->pages == ARRAY_SIZE(sio->bvec) || !plug) { swap_read_unplug(sio); sio = NULL; } if (plug) *plug = sio; } static void swap_read_folio_bdev_sync(struct folio *folio, struct swap_info_struct *sis) { struct bio_vec bv; struct bio bio; bio_init(&bio, sis->bdev, &bv, 1, REQ_OP_READ); bio.bi_iter.bi_sector = swap_folio_sector(folio); bio_add_folio_nofail(&bio, folio, folio_size(folio), 0); /* * Keep this task valid during swap readpage because the oom killer may * attempt to access it in the page fault retry time check. */ get_task_struct(current); count_vm_event(PSWPIN); submit_bio_wait(&bio); __end_swap_bio_read(&bio); put_task_struct(current); } static void swap_read_folio_bdev_async(struct folio *folio, struct swap_info_struct *sis) { struct bio *bio; bio = bio_alloc(sis->bdev, 1, REQ_OP_READ, GFP_KERNEL); bio->bi_iter.bi_sector = swap_folio_sector(folio); bio->bi_end_io = end_swap_bio_read; bio_add_folio_nofail(bio, folio, folio_size(folio), 0); count_vm_event(PSWPIN); submit_bio(bio); } void swap_read_folio(struct folio *folio, bool synchronous, struct swap_iocb **plug) { struct swap_info_struct *sis = swp_swap_info(folio->swap); bool workingset = folio_test_workingset(folio); unsigned long pflags; bool in_thrashing; VM_BUG_ON_FOLIO(!folio_test_swapcache(folio) && !synchronous, folio); VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_FOLIO(folio_test_uptodate(folio), folio); /* * Count submission time as memory stall and delay. When the device * is congested, or the submitting cgroup IO-throttled, submission * can be a significant part of overall IO time. */ if (workingset) { delayacct_thrashing_start(&in_thrashing); psi_memstall_enter(&pflags); } delayacct_swapin_start(); if (zswap_load(folio)) { folio_mark_uptodate(folio); folio_unlock(folio); } else if (data_race(sis->flags & SWP_FS_OPS)) { swap_read_folio_fs(folio, plug); } else if (synchronous || (sis->flags & SWP_SYNCHRONOUS_IO)) { swap_read_folio_bdev_sync(folio, sis); } else { swap_read_folio_bdev_async(folio, sis); } if (workingset) { delayacct_thrashing_end(&in_thrashing); psi_memstall_leave(&pflags); } delayacct_swapin_end(); } void __swap_read_unplug(struct swap_iocb *sio) { struct iov_iter from; struct address_space *mapping = sio->iocb.ki_filp->f_mapping; int ret; iov_iter_bvec(&from, ITER_DEST, sio->bvec, sio->pages, sio->len); ret = mapping->a_ops->swap_rw(&sio->iocb, &from); if (ret != -EIOCBQUEUED) sio_read_complete(&sio->iocb, ret); } |
8 1 1 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Copyright (C) 2011 Instituto Nokia de Tecnologia * * Authors: * Aloisio Almeida Jr <aloisio.almeida@openbossa.org> * Lauro Ramos Venancio <lauro.venancio@openbossa.org> */ #include <linux/nfc.h> #include <linux/module.h> #include "nfc.h" static DEFINE_RWLOCK(proto_tab_lock); static const struct nfc_protocol *proto_tab[NFC_SOCKPROTO_MAX]; static int nfc_sock_create(struct net *net, struct socket *sock, int proto, int kern) { int rc = -EPROTONOSUPPORT; if (net != &init_net) return -EAFNOSUPPORT; if (proto < 0 || proto >= NFC_SOCKPROTO_MAX) return -EINVAL; read_lock(&proto_tab_lock); if (proto_tab[proto] && try_module_get(proto_tab[proto]->owner)) { rc = proto_tab[proto]->create(net, sock, proto_tab[proto], kern); module_put(proto_tab[proto]->owner); } read_unlock(&proto_tab_lock); return rc; } static const struct net_proto_family nfc_sock_family_ops = { .owner = THIS_MODULE, .family = PF_NFC, .create = nfc_sock_create, }; int nfc_proto_register(const struct nfc_protocol *nfc_proto) { int rc; if (nfc_proto->id < 0 || nfc_proto->id >= NFC_SOCKPROTO_MAX) return -EINVAL; rc = proto_register(nfc_proto->proto, 0); if (rc) return rc; write_lock(&proto_tab_lock); if (proto_tab[nfc_proto->id]) rc = -EBUSY; else proto_tab[nfc_proto->id] = nfc_proto; write_unlock(&proto_tab_lock); if (rc) proto_unregister(nfc_proto->proto); return rc; } EXPORT_SYMBOL(nfc_proto_register); void nfc_proto_unregister(const struct nfc_protocol *nfc_proto) { write_lock(&proto_tab_lock); proto_tab[nfc_proto->id] = NULL; write_unlock(&proto_tab_lock); proto_unregister(nfc_proto->proto); } EXPORT_SYMBOL(nfc_proto_unregister); int __init af_nfc_init(void) { return sock_register(&nfc_sock_family_ops); } void __exit af_nfc_exit(void) { sock_unregister(PF_NFC); } |
11 9 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 | // SPDX-License-Identifier: GPL-2.0-only /* * T10 Data Integrity Field CRC16 calculation * * Copyright (c) 2007 Oracle Corporation. All rights reserved. * Written by Martin K. Petersen <martin.petersen@oracle.com> */ #include <linux/types.h> #include <linux/module.h> #include <linux/crc-t10dif.h> #include <linux/err.h> #include <linux/init.h> #include <crypto/hash.h> #include <crypto/algapi.h> #include <linux/static_key.h> #include <linux/notifier.h> static struct crypto_shash __rcu *crct10dif_tfm; static DEFINE_STATIC_KEY_TRUE(crct10dif_fallback); static DEFINE_MUTEX(crc_t10dif_mutex); static struct work_struct crct10dif_rehash_work; static int crc_t10dif_notify(struct notifier_block *self, unsigned long val, void *data) { struct crypto_alg *alg = data; if (val != CRYPTO_MSG_ALG_LOADED || strcmp(alg->cra_name, CRC_T10DIF_STRING)) return NOTIFY_DONE; schedule_work(&crct10dif_rehash_work); return NOTIFY_OK; } static void crc_t10dif_rehash(struct work_struct *work) { struct crypto_shash *new, *old; mutex_lock(&crc_t10dif_mutex); old = rcu_dereference_protected(crct10dif_tfm, lockdep_is_held(&crc_t10dif_mutex)); new = crypto_alloc_shash(CRC_T10DIF_STRING, 0, 0); if (IS_ERR(new)) { mutex_unlock(&crc_t10dif_mutex); return; } rcu_assign_pointer(crct10dif_tfm, new); mutex_unlock(&crc_t10dif_mutex); if (old) { synchronize_rcu(); crypto_free_shash(old); } else { static_branch_disable(&crct10dif_fallback); } } static struct notifier_block crc_t10dif_nb = { .notifier_call = crc_t10dif_notify, }; __u16 crc_t10dif_update(__u16 crc, const unsigned char *buffer, size_t len) { struct { struct shash_desc shash; __u16 crc; } desc; int err; if (static_branch_unlikely(&crct10dif_fallback)) return crc_t10dif_generic(crc, buffer, len); rcu_read_lock(); desc.shash.tfm = rcu_dereference(crct10dif_tfm); desc.crc = crc; err = crypto_shash_update(&desc.shash, buffer, len); rcu_read_unlock(); BUG_ON(err); return desc.crc; } EXPORT_SYMBOL(crc_t10dif_update); __u16 crc_t10dif(const unsigned char *buffer, size_t len) { return crc_t10dif_update(0, buffer, len); } EXPORT_SYMBOL(crc_t10dif); static int __init crc_t10dif_mod_init(void) { INIT_WORK(&crct10dif_rehash_work, crc_t10dif_rehash); crypto_register_notifier(&crc_t10dif_nb); crc_t10dif_rehash(&crct10dif_rehash_work); return 0; } static void __exit crc_t10dif_mod_fini(void) { crypto_unregister_notifier(&crc_t10dif_nb); cancel_work_sync(&crct10dif_rehash_work); crypto_free_shash(rcu_dereference_protected(crct10dif_tfm, 1)); } module_init(crc_t10dif_mod_init); module_exit(crc_t10dif_mod_fini); static int crc_t10dif_transform_show(char *buffer, const struct kernel_param *kp) { struct crypto_shash *tfm; int len; if (static_branch_unlikely(&crct10dif_fallback)) return sprintf(buffer, "fallback\n"); rcu_read_lock(); tfm = rcu_dereference(crct10dif_tfm); len = snprintf(buffer, PAGE_SIZE, "%s\n", crypto_shash_driver_name(tfm)); rcu_read_unlock(); return len; } module_param_call(transform, NULL, crc_t10dif_transform_show, NULL, 0444); MODULE_DESCRIPTION("T10 DIF CRC calculation (library API)"); MODULE_LICENSE("GPL"); MODULE_SOFTDEP("pre: crct10dif"); |
2 4 1 1 5 5 5 8 7 1 34 36 35 36 8 8 2 4 1 1 2 7 1 6 7 5 5 2 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 | // SPDX-License-Identifier: GPL-2.0-only /* * INET An implementation of the TCP/IP protocol suite for the LINUX * operating system. INET is implemented using the BSD Socket * interface as the means of communication with the user level. * * Implementation of the Transmission Control Protocol(TCP). * * Authors: Ross Biro * Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG> * Mark Evans, <evansmp@uhura.aston.ac.uk> * Corey Minyard <wf-rch!minyard@relay.EU.net> * Florian La Roche, <flla@stud.uni-sb.de> * Charles Hedrick, <hedrick@klinzhai.rutgers.edu> * Linus Torvalds, <torvalds@cs.helsinki.fi> * Alan Cox, <gw4pts@gw4pts.ampr.org> * Matthew Dillon, <dillon@apollo.west.oic.com> * Arnt Gulbrandsen, <agulbra@nvg.unit.no> * Jorge Cwik, <jorge@laser.satlink.net> */ #include <net/tcp.h> #include <net/xfrm.h> #include <net/busy_poll.h> static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win) { if (seq == s_win) return true; if (after(end_seq, s_win) && before(seq, e_win)) return true; return seq == e_win && seq == end_seq; } static enum tcp_tw_status tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw, const struct sk_buff *skb, int mib_idx) { struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); if (!tcp_oow_rate_limited(twsk_net(tw), skb, mib_idx, &tcptw->tw_last_oow_ack_time)) { /* Send ACK. Note, we do not put the bucket, * it will be released by caller. */ return TCP_TW_ACK; } /* We are rate-limiting, so just release the tw sock and drop skb. */ inet_twsk_put(tw); return TCP_TW_SUCCESS; } static void twsk_rcv_nxt_update(struct tcp_timewait_sock *tcptw, u32 seq) { #ifdef CONFIG_TCP_AO struct tcp_ao_info *ao; ao = rcu_dereference(tcptw->ao_info); if (unlikely(ao && seq < tcptw->tw_rcv_nxt)) WRITE_ONCE(ao->rcv_sne, ao->rcv_sne + 1); #endif tcptw->tw_rcv_nxt = seq; } /* * * Main purpose of TIME-WAIT state is to close connection gracefully, * when one of ends sits in LAST-ACK or CLOSING retransmitting FIN * (and, probably, tail of data) and one or more our ACKs are lost. * * What is TIME-WAIT timeout? It is associated with maximal packet * lifetime in the internet, which results in wrong conclusion, that * it is set to catch "old duplicate segments" wandering out of their path. * It is not quite correct. This timeout is calculated so that it exceeds * maximal retransmission timeout enough to allow to lose one (or more) * segments sent by peer and our ACKs. This time may be calculated from RTO. * * When TIME-WAIT socket receives RST, it means that another end * finally closed and we are allowed to kill TIME-WAIT too. * * Second purpose of TIME-WAIT is catching old duplicate segments. * Well, certainly it is pure paranoia, but if we load TIME-WAIT * with this semantics, we MUST NOT kill TIME-WAIT state with RSTs. * * If we invented some more clever way to catch duplicates * (f.e. based on PAWS), we could truncate TIME-WAIT to several RTOs. * * The algorithm below is based on FORMAL INTERPRETATION of RFCs. * When you compare it to RFCs, please, read section SEGMENT ARRIVES * from the very beginning. * * NOTE. With recycling (and later with fin-wait-2) TW bucket * is _not_ stateless. It means, that strictly speaking we must * spinlock it. I do not want! Well, probability of misbehaviour * is ridiculously low and, seems, we could use some mb() tricks * to avoid misread sequence numbers, states etc. --ANK * * We don't need to initialize tmp_out.sack_ok as we don't use the results */ enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, const struct tcphdr *th) { struct tcp_options_received tmp_opt; struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); bool paws_reject = false; tmp_opt.saw_tstamp = 0; if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) { tcp_parse_options(twsk_net(tw), skb, &tmp_opt, 0, NULL); if (tmp_opt.saw_tstamp) { if (tmp_opt.rcv_tsecr) tmp_opt.rcv_tsecr -= tcptw->tw_ts_offset; tmp_opt.ts_recent = tcptw->tw_ts_recent; tmp_opt.ts_recent_stamp = tcptw->tw_ts_recent_stamp; paws_reject = tcp_paws_reject(&tmp_opt, th->rst); } } if (tw->tw_substate == TCP_FIN_WAIT2) { /* Just repeat all the checks of tcp_rcv_state_process() */ /* Out of window, send ACK */ if (paws_reject || !tcp_in_window(TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq, tcptw->tw_rcv_nxt, tcptw->tw_rcv_nxt + tcptw->tw_rcv_wnd)) return tcp_timewait_check_oow_rate_limit( tw, skb, LINUX_MIB_TCPACKSKIPPEDFINWAIT2); if (th->rst) goto kill; if (th->syn && !before(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt)) return TCP_TW_RST; /* Dup ACK? */ if (!th->ack || !after(TCP_SKB_CB(skb)->end_seq, tcptw->tw_rcv_nxt) || TCP_SKB_CB(skb)->end_seq == TCP_SKB_CB(skb)->seq) { inet_twsk_put(tw); return TCP_TW_SUCCESS; } /* New data or FIN. If new data arrive after half-duplex close, * reset. */ if (!th->fin || TCP_SKB_CB(skb)->end_seq != tcptw->tw_rcv_nxt + 1) return TCP_TW_RST; /* FIN arrived, enter true time-wait state. */ tw->tw_substate = TCP_TIME_WAIT; twsk_rcv_nxt_update(tcptw, TCP_SKB_CB(skb)->end_seq); if (tmp_opt.saw_tstamp) { tcptw->tw_ts_recent_stamp = ktime_get_seconds(); tcptw->tw_ts_recent = tmp_opt.rcv_tsval; } inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN); return TCP_TW_ACK; } /* * Now real TIME-WAIT state. * * RFC 1122: * "When a connection is [...] on TIME-WAIT state [...] * [a TCP] MAY accept a new SYN from the remote TCP to * reopen the connection directly, if it: * * (1) assigns its initial sequence number for the new * connection to be larger than the largest sequence * number it used on the previous connection incarnation, * and * * (2) returns to TIME-WAIT state if the SYN turns out * to be an old duplicate". */ if (!paws_reject && (TCP_SKB_CB(skb)->seq == tcptw->tw_rcv_nxt && (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq || th->rst))) { /* In window segment, it may be only reset or bare ack. */ if (th->rst) { /* This is TIME_WAIT assassination, in two flavors. * Oh well... nobody has a sufficient solution to this * protocol bug yet. */ if (!READ_ONCE(twsk_net(tw)->ipv4.sysctl_tcp_rfc1337)) { kill: inet_twsk_deschedule_put(tw); return TCP_TW_SUCCESS; } } else { inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN); } if (tmp_opt.saw_tstamp) { tcptw->tw_ts_recent = tmp_opt.rcv_tsval; tcptw->tw_ts_recent_stamp = ktime_get_seconds(); } inet_twsk_put(tw); return TCP_TW_SUCCESS; } /* Out of window segment. All the segments are ACKed immediately. The only exception is new SYN. We accept it, if it is not old duplicate and we are not in danger to be killed by delayed old duplicates. RFC check is that it has newer sequence number works at rates <40Mbit/sec. However, if paws works, it is reliable AND even more, we even may relax silly seq space cutoff. RED-PEN: we violate main RFC requirement, if this SYN will appear old duplicate (i.e. we receive RST in reply to SYN-ACK), we must return socket to time-wait state. It is not good, but not fatal yet. */ if (th->syn && !th->rst && !th->ack && !paws_reject && (after(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt) || (tmp_opt.saw_tstamp && (s32)(tcptw->tw_ts_recent - tmp_opt.rcv_tsval) < 0))) { u32 isn = tcptw->tw_snd_nxt + 65535 + 2; if (isn == 0) isn++; TCP_SKB_CB(skb)->tcp_tw_isn = isn; return TCP_TW_SYN; } if (paws_reject) __NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED); if (!th->rst) { /* In this case we must reset the TIMEWAIT timer. * * If it is ACKless SYN it may be both old duplicate * and new good SYN with random sequence number <rcv_nxt. * Do not reschedule in the last case. */ if (paws_reject || th->ack) inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN); return tcp_timewait_check_oow_rate_limit( tw, skb, LINUX_MIB_TCPACKSKIPPEDTIMEWAIT); } inet_twsk_put(tw); return TCP_TW_SUCCESS; } EXPORT_SYMBOL(tcp_timewait_state_process); static void tcp_time_wait_init(struct sock *sk, struct tcp_timewait_sock *tcptw) { #ifdef CONFIG_TCP_MD5SIG const struct tcp_sock *tp = tcp_sk(sk); struct tcp_md5sig_key *key; /* * The timewait bucket does not have the key DB from the * sock structure. We just make a quick copy of the * md5 key being used (if indeed we are using one) * so the timewait ack generating code has the key. */ tcptw->tw_md5_key = NULL; if (!static_branch_unlikely(&tcp_md5_needed.key)) return; key = tp->af_specific->md5_lookup(sk, sk); if (key) { tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC); if (!tcptw->tw_md5_key) return; if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) goto out_free; tcp_md5_add_sigpool(); } return; out_free: WARN_ON_ONCE(1); kfree(tcptw->tw_md5_key); tcptw->tw_md5_key = NULL; #endif } /* * Move a socket to time-wait or dead fin-wait-2 state. */ void tcp_time_wait(struct sock *sk, int state, int timeo) { const struct inet_connection_sock *icsk = inet_csk(sk); struct tcp_sock *tp = tcp_sk(sk); struct net *net = sock_net(sk); struct inet_timewait_sock *tw; tw = inet_twsk_alloc(sk, &net->ipv4.tcp_death_row, state); if (tw) { struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); const int rto = (icsk->icsk_rto << 2) - (icsk->icsk_rto >> 1); tw->tw_transparent = inet_test_bit(TRANSPARENT, sk); tw->tw_mark = sk->sk_mark; tw->tw_priority = READ_ONCE(sk->sk_priority); tw->tw_rcv_wscale = tp->rx_opt.rcv_wscale; tcptw->tw_rcv_nxt = tp->rcv_nxt; tcptw->tw_snd_nxt = tp->snd_nxt; tcptw->tw_rcv_wnd = tcp_receive_window(tp); tcptw->tw_ts_recent = tp->rx_opt.ts_recent; tcptw->tw_ts_recent_stamp = tp->rx_opt.ts_recent_stamp; tcptw->tw_ts_offset = tp->tsoffset; tw->tw_usec_ts = tp->tcp_usec_ts; tcptw->tw_last_oow_ack_time = 0; tcptw->tw_tx_delay = tp->tcp_tx_delay; tw->tw_txhash = sk->sk_txhash; #if IS_ENABLED(CONFIG_IPV6) if (tw->tw_family == PF_INET6) { struct ipv6_pinfo *np = inet6_sk(sk); tw->tw_v6_daddr = sk->sk_v6_daddr; tw->tw_v6_rcv_saddr = sk->sk_v6_rcv_saddr; tw->tw_tclass = np->tclass; tw->tw_flowlabel = be32_to_cpu(np->flow_label & IPV6_FLOWLABEL_MASK); tw->tw_ipv6only = sk->sk_ipv6only; } #endif tcp_time_wait_init(sk, tcptw); tcp_ao_time_wait(tcptw, tp); /* Get the TIME_WAIT timeout firing. */ if (timeo < rto) timeo = rto; if (state == TCP_TIME_WAIT) timeo = TCP_TIMEWAIT_LEN; /* tw_timer is pinned, so we need to make sure BH are disabled * in following section, otherwise timer handler could run before * we complete the initialization. */ local_bh_disable(); inet_twsk_schedule(tw, timeo); /* Linkage updates. * Note that access to tw after this point is illegal. */ inet_twsk_hashdance(tw, sk, net->ipv4.tcp_death_row.hashinfo); local_bh_enable(); } else { /* Sorry, if we're out of memory, just CLOSE this * socket up. We've got bigger problems than * non-graceful socket closings. */ NET_INC_STATS(net, LINUX_MIB_TCPTIMEWAITOVERFLOW); } tcp_update_metrics(sk); tcp_done(sk); } EXPORT_SYMBOL(tcp_time_wait); #ifdef CONFIG_TCP_MD5SIG static void tcp_md5_twsk_free_rcu(struct rcu_head *head) { struct tcp_md5sig_key *key; key = container_of(head, struct tcp_md5sig_key, rcu); kfree(key); static_branch_slow_dec_deferred(&tcp_md5_needed); tcp_md5_release_sigpool(); } #endif void tcp_twsk_destructor(struct sock *sk) { #ifdef CONFIG_TCP_MD5SIG if (static_branch_unlikely(&tcp_md5_needed.key)) { struct tcp_timewait_sock *twsk = tcp_twsk(sk); if (twsk->tw_md5_key) call_rcu(&twsk->tw_md5_key->rcu, tcp_md5_twsk_free_rcu); } #endif tcp_ao_destroy_sock(sk, true); } EXPORT_SYMBOL_GPL(tcp_twsk_destructor); void tcp_twsk_purge(struct list_head *net_exit_list, int family) { bool purged_once = false; struct net *net; list_for_each_entry(net, net_exit_list, exit_list) { if (net->ipv4.tcp_death_row.hashinfo->pernet) { /* Even if tw_refcount == 1, we must clean up kernel reqsk */ inet_twsk_purge(net->ipv4.tcp_death_row.hashinfo, family); } else if (!purged_once) { /* The last refcount is decremented in tcp_sk_exit_batch() */ if (refcount_read(&net->ipv4.tcp_death_row.tw_refcount) == 1) continue; inet_twsk_purge(&tcp_hashinfo, family); purged_once = true; } } } EXPORT_SYMBOL_GPL(tcp_twsk_purge); /* Warning : This function is called without sk_listener being locked. * Be sure to read socket fields once, as their value could change under us. */ void tcp_openreq_init_rwin(struct request_sock *req, const struct sock *sk_listener, const struct dst_entry *dst) { struct inet_request_sock *ireq = inet_rsk(req); const struct tcp_sock *tp = tcp_sk(sk_listener); int full_space = tcp_full_space(sk_listener); u32 window_clamp; __u8 rcv_wscale; u32 rcv_wnd; int mss; mss = tcp_mss_clamp(tp, dst_metric_advmss(dst)); window_clamp = READ_ONCE(tp->window_clamp); /* Set this up on the first call only */ req->rsk_window_clamp = window_clamp ? : dst_metric(dst, RTAX_WINDOW); /* limit the window selection if the user enforce a smaller rx buffer */ if (sk_listener->sk_userlocks & SOCK_RCVBUF_LOCK && (req->rsk_window_clamp > full_space || req->rsk_window_clamp == 0)) req->rsk_window_clamp = full_space; rcv_wnd = tcp_rwnd_init_bpf((struct sock *)req); if (rcv_wnd == 0) rcv_wnd = dst_metric(dst, RTAX_INITRWND); else if (full_space < rcv_wnd * mss) full_space = rcv_wnd * mss; /* tcp_full_space because it is guaranteed to be the first packet */ tcp_select_initial_window(sk_listener, full_space, mss - (ireq->tstamp_ok ? TCPOLEN_TSTAMP_ALIGNED : 0), &req->rsk_rcv_wnd, &req->rsk_window_clamp, ireq->wscale_ok, &rcv_wscale, rcv_wnd); ireq->rcv_wscale = rcv_wscale; } EXPORT_SYMBOL(tcp_openreq_init_rwin); static void tcp_ecn_openreq_child(struct tcp_sock *tp, const struct request_sock *req) { tp->ecn_flags = inet_rsk(req)->ecn_ok ? TCP_ECN_OK : 0; } void tcp_ca_openreq_child(struct sock *sk, const struct dst_entry *dst) { struct inet_connection_sock *icsk = inet_csk(sk); u32 ca_key = dst_metric(dst, RTAX_CC_ALGO); bool ca_got_dst = false; if (ca_key != TCP_CA_UNSPEC) { const struct tcp_congestion_ops *ca; rcu_read_lock(); ca = tcp_ca_find_key(ca_key); if (likely(ca && bpf_try_module_get(ca, ca->owner))) { icsk->icsk_ca_dst_locked = tcp_ca_dst_locked(dst); icsk->icsk_ca_ops = ca; ca_got_dst = true; } rcu_read_unlock(); } /* If no valid choice made yet, assign current system default ca. */ if (!ca_got_dst && (!icsk->icsk_ca_setsockopt || !bpf_try_module_get(icsk->icsk_ca_ops, icsk->icsk_ca_ops->owner))) tcp_assign_congestion_control(sk); tcp_set_ca_state(sk, TCP_CA_Open); } EXPORT_SYMBOL_GPL(tcp_ca_openreq_child); static void smc_check_reset_syn_req(const struct tcp_sock *oldtp, struct request_sock *req, struct tcp_sock *newtp) { #if IS_ENABLED(CONFIG_SMC) struct inet_request_sock *ireq; if (static_branch_unlikely(&tcp_have_smc)) { ireq = inet_rsk(req); if (oldtp->syn_smc && !ireq->smc_ok) newtp->syn_smc = 0; } #endif } /* This is not only more efficient than what we used to do, it eliminates * a lot of code duplication between IPv4/IPv6 SYN recv processing. -DaveM * * Actually, we could lots of memory writes here. tp of listening * socket contains all necessary default parameters. */ struct sock *tcp_create_openreq_child(const struct sock *sk, struct request_sock *req, struct sk_buff *skb) { struct sock *newsk = inet_csk_clone_lock(sk, req, GFP_ATOMIC); const struct inet_request_sock *ireq = inet_rsk(req); struct tcp_request_sock *treq = tcp_rsk(req); struct inet_connection_sock *newicsk; const struct tcp_sock *oldtp; struct tcp_sock *newtp; u32 seq; #ifdef CONFIG_TCP_AO struct tcp_ao_key *ao_key; #endif if (!newsk) return NULL; newicsk = inet_csk(newsk); newtp = tcp_sk(newsk); oldtp = tcp_sk(sk); smc_check_reset_syn_req(oldtp, req, newtp); /* Now setup tcp_sock */ newtp->pred_flags = 0; seq = treq->rcv_isn + 1; newtp->rcv_wup = seq; WRITE_ONCE(newtp->copied_seq, seq); WRITE_ONCE(newtp->rcv_nxt, seq); newtp->segs_in = 1; seq = treq->snt_isn + 1; newtp->snd_sml = newtp->snd_una = seq; WRITE_ONCE(newtp->snd_nxt, seq); newtp->snd_up = seq; INIT_LIST_HEAD(&newtp->tsq_node); INIT_LIST_HEAD(&newtp->tsorted_sent_queue); tcp_init_wl(newtp, treq->rcv_isn); minmax_reset(&newtp->rtt_min, tcp_jiffies32, ~0U); newicsk->icsk_ack.lrcvtime = tcp_jiffies32; newtp->lsndtime = tcp_jiffies32; newsk->sk_txhash = READ_ONCE(treq->txhash); newtp->total_retrans = req->num_retrans; tcp_init_xmit_timers(newsk); WRITE_ONCE(newtp->write_seq, newtp->pushed_seq = treq->snt_isn + 1); if (sock_flag(newsk, SOCK_KEEPOPEN)) inet_csk_reset_keepalive_timer(newsk, keepalive_time_when(newtp)); newtp->rx_opt.tstamp_ok = ireq->tstamp_ok; newtp->rx_opt.sack_ok = ireq->sack_ok; newtp->window_clamp = req->rsk_window_clamp; newtp->rcv_ssthresh = req->rsk_rcv_wnd; newtp->rcv_wnd = req->rsk_rcv_wnd; newtp->rx_opt.wscale_ok = ireq->wscale_ok; if (newtp->rx_opt.wscale_ok) { newtp->rx_opt.snd_wscale = ireq->snd_wscale; newtp->rx_opt.rcv_wscale = ireq->rcv_wscale; } else { newtp->rx_opt.snd_wscale = newtp->rx_opt.rcv_wscale = 0; newtp->window_clamp = min(newtp->window_clamp, 65535U); } newtp->snd_wnd = ntohs(tcp_hdr(skb)->window) << newtp->rx_opt.snd_wscale; newtp->max_window = newtp->snd_wnd; if (newtp->rx_opt.tstamp_ok) { newtp->tcp_usec_ts = treq->req_usec_ts; newtp->rx_opt.ts_recent = READ_ONCE(req->ts_recent); newtp->rx_opt.ts_recent_stamp = ktime_get_seconds(); newtp->tcp_header_len = sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED; } else { newtp->tcp_usec_ts = 0; newtp->rx_opt.ts_recent_stamp = 0; newtp->tcp_header_len = sizeof(struct tcphdr); } if (req->num_timeout) { newtp->total_rto = req->num_timeout; newtp->undo_marker = treq->snt_isn; if (newtp->tcp_usec_ts) { newtp->retrans_stamp = treq->snt_synack; newtp->total_rto_time = (u32)(tcp_clock_us() - newtp->retrans_stamp) / USEC_PER_MSEC; } else { newtp->retrans_stamp = div_u64(treq->snt_synack, USEC_PER_SEC / TCP_TS_HZ); newtp->total_rto_time = tcp_clock_ms() - newtp->retrans_stamp; } newtp->total_rto_recoveries = 1; } newtp->tsoffset = treq->ts_off; #ifdef CONFIG_TCP_MD5SIG newtp->md5sig_info = NULL; /*XXX*/ #endif #ifdef CONFIG_TCP_AO newtp->ao_info = NULL; ao_key = treq->af_specific->ao_lookup(sk, req, tcp_rsk(req)->ao_keyid, -1); if (ao_key) newtp->tcp_header_len += tcp_ao_len_aligned(ao_key); #endif if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len) newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len; newtp->rx_opt.mss_clamp = req->mss; tcp_ecn_openreq_child(newtp, req); newtp->fastopen_req = NULL; RCU_INIT_POINTER(newtp->fastopen_rsk, NULL); newtp->bpf_chg_cc_inprogress = 0; tcp_bpf_clone(sk, newsk); __TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS); return newsk; } EXPORT_SYMBOL(tcp_create_openreq_child); /* * Process an incoming packet for SYN_RECV sockets represented as a * request_sock. Normally sk is the listener socket but for TFO it * points to the child socket. * * XXX (TFO) - The current impl contains a special check for ack * validation and inside tcp_v4_reqsk_send_ack(). Can we do better? * * We don't need to initialize tmp_opt.sack_ok as we don't use the results * * Note: If @fastopen is true, this can be called from process context. * Otherwise, this is from BH context. */ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, struct request_sock *req, bool fastopen, bool *req_stolen) { struct tcp_options_received tmp_opt; struct sock *child; const struct tcphdr *th = tcp_hdr(skb); __be32 flg = tcp_flag_word(th) & (TCP_FLAG_RST|TCP_FLAG_SYN|TCP_FLAG_ACK); bool paws_reject = false; bool own_req; tmp_opt.saw_tstamp = 0; if (th->doff > (sizeof(struct tcphdr)>>2)) { tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL); if (tmp_opt.saw_tstamp) { tmp_opt.ts_recent = READ_ONCE(req->ts_recent); if (tmp_opt.rcv_tsecr) tmp_opt.rcv_tsecr -= tcp_rsk(req)->ts_off; /* We do not store true stamp, but it is not required, * it can be estimated (approximately) * from another data. */ tmp_opt.ts_recent_stamp = ktime_get_seconds() - reqsk_timeout(req, TCP_RTO_MAX) / HZ; paws_reject = tcp_paws_reject(&tmp_opt, th->rst); } } /* Check for pure retransmitted SYN. */ if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn && flg == TCP_FLAG_SYN && !paws_reject) { /* * RFC793 draws (Incorrectly! It was fixed in RFC1122) * this case on figure 6 and figure 8, but formal * protocol description says NOTHING. * To be more exact, it says that we should send ACK, * because this segment (at least, if it has no data) * is out of window. * * CONCLUSION: RFC793 (even with RFC1122) DOES NOT * describe SYN-RECV state. All the description * is wrong, we cannot believe to it and should * rely only on common sense and implementation * experience. * * Enforce "SYN-ACK" according to figure 8, figure 6 * of RFC793, fixed by RFC1122. * * Note that even if there is new data in the SYN packet * they will be thrown away too. * * Reset timer after retransmitting SYNACK, similar to * the idea of fast retransmit in recovery. */ if (!tcp_oow_rate_limited(sock_net(sk), skb, LINUX_MIB_TCPACKSKIPPEDSYNRECV, &tcp_rsk(req)->last_oow_ack_time) && !inet_rtx_syn_ack(sk, req)) { unsigned long expires = jiffies; expires += reqsk_timeout(req, TCP_RTO_MAX); if (!fastopen) mod_timer_pending(&req->rsk_timer, expires); else req->rsk_timer.expires = expires; } return NULL; } /* Further reproduces section "SEGMENT ARRIVES" for state SYN-RECEIVED of RFC793. It is broken, however, it does not work only when SYNs are crossed. You would think that SYN crossing is impossible here, since we should have a SYN_SENT socket (from connect()) on our end, but this is not true if the crossed SYNs were sent to both ends by a malicious third party. We must defend against this, and to do that we first verify the ACK (as per RFC793, page 36) and reset if it is invalid. Is this a true full defense? To convince ourselves, let us consider a way in which the ACK test can still pass in this 'malicious crossed SYNs' case. Malicious sender sends identical SYNs (and thus identical sequence numbers) to both A and B: A: gets SYN, seq=7 B: gets SYN, seq=7 By our good fortune, both A and B select the same initial send sequence number of seven :-) A: sends SYN|ACK, seq=7, ack_seq=8 B: sends SYN|ACK, seq=7, ack_seq=8 So we are now A eating this SYN|ACK, ACK test passes. So does sequence test, SYN is truncated, and thus we consider it a bare ACK. If icsk->icsk_accept_queue.rskq_defer_accept, we silently drop this bare ACK. Otherwise, we create an established connection. Both ends (listening sockets) accept the new incoming connection and try to talk to each other. 8-) Note: This case is both harmless, and rare. Possibility is about the same as us discovering intelligent life on another plant tomorrow. But generally, we should (RFC lies!) to accept ACK from SYNACK both here and in tcp_rcv_state_process(). tcp_rcv_state_process() does not, hence, we do not too. Note that the case is absolutely generic: we cannot optimize anything here without violating protocol. All the checks must be made before attempt to create socket. */ /* RFC793 page 36: "If the connection is in any non-synchronized state ... * and the incoming segment acknowledges something not yet * sent (the segment carries an unacceptable ACK) ... * a reset is sent." * * Invalid ACK: reset will be sent by listening socket. * Note that the ACK validity check for a Fast Open socket is done * elsewhere and is checked directly against the child socket rather * than req because user data may have been sent out. */ if ((flg & TCP_FLAG_ACK) && !fastopen && (TCP_SKB_CB(skb)->ack_seq != tcp_rsk(req)->snt_isn + 1)) return sk; /* Also, it would be not so bad idea to check rcv_tsecr, which * is essentially ACK extension and too early or too late values * should cause reset in unsynchronized states. */ /* RFC793: "first check sequence number". */ if (paws_reject || !tcp_in_window(TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq, tcp_rsk(req)->rcv_nxt, tcp_rsk(req)->rcv_nxt + req->rsk_rcv_wnd)) { /* Out of window: send ACK and drop. */ if (!(flg & TCP_FLAG_RST) && !tcp_oow_rate_limited(sock_net(sk), skb, LINUX_MIB_TCPACKSKIPPEDSYNRECV, &tcp_rsk(req)->last_oow_ack_time)) req->rsk_ops->send_ack(sk, skb, req); if (paws_reject) NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSESTABREJECTED); return NULL; } /* In sequence, PAWS is OK. */ /* TODO: We probably should defer ts_recent change once * we take ownership of @req. */ if (tmp_opt.saw_tstamp && !after(TCP_SKB_CB(skb)->seq, tcp_rsk(req)->rcv_nxt)) WRITE_ONCE(req->ts_recent, tmp_opt.rcv_tsval); if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn) { /* Truncate SYN, it is out of window starting at tcp_rsk(req)->rcv_isn + 1. */ flg &= ~TCP_FLAG_SYN; } /* RFC793: "second check the RST bit" and * "fourth, check the SYN bit" */ if (flg & (TCP_FLAG_RST|TCP_FLAG_SYN)) { TCP_INC_STATS(sock_net(sk), TCP_MIB_ATTEMPTFAILS); goto embryonic_reset; } /* ACK sequence verified above, just make sure ACK is * set. If ACK not set, just silently drop the packet. * * XXX (TFO) - if we ever allow "data after SYN", the * following check needs to be removed. */ if (!(flg & TCP_FLAG_ACK)) return NULL; /* For Fast Open no more processing is needed (sk is the * child socket). */ if (fastopen) return sk; /* While TCP_DEFER_ACCEPT is active, drop bare ACK. */ if (req->num_timeout < READ_ONCE(inet_csk(sk)->icsk_accept_queue.rskq_defer_accept) && TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) { inet_rsk(req)->acked = 1; __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPDEFERACCEPTDROP); return NULL; } /* OK, ACK is valid, create big socket and * feed this segment to it. It will repeat all * the tests. THIS SEGMENT MUST MOVE SOCKET TO * ESTABLISHED STATE. If it will be dropped after * socket is created, wait for troubles. */ child = inet_csk(sk)->icsk_af_ops->syn_recv_sock(sk, skb, req, NULL, req, &own_req); if (!child) goto listen_overflow; if (own_req && rsk_drop_req(req)) { reqsk_queue_removed(&inet_csk(req->rsk_listener)->icsk_accept_queue, req); inet_csk_reqsk_queue_drop_and_put(req->rsk_listener, req); return child; } sock_rps_save_rxhash(child, skb); tcp_synack_rtt_meas(child, req); *req_stolen = !own_req; return inet_csk_complete_hashdance(sk, child, req, own_req); listen_overflow: if (sk != req->rsk_listener) __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMIGRATEREQFAILURE); if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_abort_on_overflow)) { inet_rsk(req)->acked = 1; return NULL; } embryonic_reset: if (!(flg & TCP_FLAG_RST)) { /* Received a bad SYN pkt - for TFO We try not to reset * the local connection unless it's really necessary to * avoid becoming vulnerable to outside attack aiming at * resetting legit local connections. */ req->rsk_ops->send_reset(sk, skb); } else if (fastopen) { /* received a valid RST pkt */ reqsk_fastopen_remove(sk, req, true); tcp_reset(sk, skb); } if (!fastopen) { bool unlinked = inet_csk_reqsk_queue_drop(sk, req); if (unlinked) __NET_INC_STATS(sock_net(sk), LINUX_MIB_EMBRYONICRSTS); *req_stolen = !unlinked; } return NULL; } EXPORT_SYMBOL(tcp_check_req); /* * Queue segment on the new socket if the new socket is active, * otherwise we just shortcircuit this and continue with * the new socket. * * For the vast majority of cases child->sk_state will be TCP_SYN_RECV * when entering. But other states are possible due to a race condition * where after __inet_lookup_established() fails but before the listener * locked is obtained, other packets cause the same connection to * be created. */ enum skb_drop_reason tcp_child_process(struct sock *parent, struct sock *child, struct sk_buff *skb) __releases(&((child)->sk_lock.slock)) { enum skb_drop_reason reason = SKB_NOT_DROPPED_YET; int state = child->sk_state; /* record sk_napi_id and sk_rx_queue_mapping of child. */ sk_mark_napi_id_set(child, skb); tcp_segs_in(tcp_sk(child), skb); if (!sock_owned_by_user(child)) { reason = tcp_rcv_state_process(child, skb); /* Wakeup parent, send SIGIO */ if (state == TCP_SYN_RECV && child->sk_state != state) parent->sk_data_ready(parent); } else { /* Alas, it is possible again, because we do lookup * in main socket hash table and lock on listening * socket does not protect us more. */ __sk_add_backlog(child, skb); } bh_unlock_sock(child); sock_put(child); return reason; } EXPORT_SYMBOL(tcp_child_process); |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | /* SPDX-License-Identifier: GPL-2.0 */ /* include/net/dsfield.h - Manipulation of the Differentiated Services field */ /* Written 1998-2000 by Werner Almesberger, EPFL ICA */ #ifndef __NET_DSFIELD_H #define __NET_DSFIELD_H #include <linux/types.h> #include <linux/ip.h> #include <linux/ipv6.h> #include <asm/byteorder.h> static inline __u8 ipv4_get_dsfield(const struct iphdr *iph) { return iph->tos; } static inline __u8 ipv6_get_dsfield(const struct ipv6hdr *ipv6h) { return ntohs(*(__force const __be16 *)ipv6h) >> 4; } static inline void ipv4_change_dsfield(struct iphdr *iph,__u8 mask, __u8 value) { __u32 check = ntohs((__force __be16)iph->check); __u8 dsfield; dsfield = (iph->tos & mask) | value; check += iph->tos; if ((check+1) >> 16) check = (check+1) & 0xffff; check -= dsfield; check += check >> 16; /* adjust carry */ iph->check = (__force __sum16)htons(check); iph->tos = dsfield; } static inline void ipv6_change_dsfield(struct ipv6hdr *ipv6h,__u8 mask, __u8 value) { __be16 *p = (__force __be16 *)ipv6h; *p = (*p & htons((((u16)mask << 4) | 0xf00f))) | htons((u16)value << 4); } #endif |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 | // SPDX-License-Identifier: GPL-2.0 /* * Media device request objects * * Copyright 2018 Cisco Systems, Inc. and/or its affiliates. All rights reserved. * Copyright (C) 2018 Intel Corporation * * Author: Hans Verkuil <hans.verkuil@cisco.com> * Author: Sakari Ailus <sakari.ailus@linux.intel.com> */ #ifndef MEDIA_REQUEST_H #define MEDIA_REQUEST_H #include <linux/list.h> #include <linux/slab.h> #include <linux/spinlock.h> #include <linux/refcount.h> #include <media/media-device.h> /** * enum media_request_state - media request state * * @MEDIA_REQUEST_STATE_IDLE: Idle * @MEDIA_REQUEST_STATE_VALIDATING: Validating the request, no state changes * allowed * @MEDIA_REQUEST_STATE_QUEUED: Queued * @MEDIA_REQUEST_STATE_COMPLETE: Completed, the request is done * @MEDIA_REQUEST_STATE_CLEANING: Cleaning, the request is being re-inited * @MEDIA_REQUEST_STATE_UPDATING: The request is being updated, i.e. * request objects are being added, * modified or removed * @NR_OF_MEDIA_REQUEST_STATE: The number of media request states, used * internally for sanity check purposes */ enum media_request_state { MEDIA_REQUEST_STATE_IDLE, MEDIA_REQUEST_STATE_VALIDATING, MEDIA_REQUEST_STATE_QUEUED, MEDIA_REQUEST_STATE_COMPLETE, MEDIA_REQUEST_STATE_CLEANING, MEDIA_REQUEST_STATE_UPDATING, NR_OF_MEDIA_REQUEST_STATE, }; struct media_request_object; /** * struct media_request - Media device request * @mdev: Media device this request belongs to * @kref: Reference count * @debug_str: Prefix for debug messages (process name:fd) * @state: The state of the request * @updating_count: count the number of request updates that are in progress * @access_count: count the number of request accesses that are in progress * @objects: List of @struct media_request_object request objects * @num_incomplete_objects: The number of incomplete objects in the request * @poll_wait: Wait queue for poll * @lock: Serializes access to this struct */ struct media_request { struct media_device *mdev; struct kref kref; char debug_str[TASK_COMM_LEN + 11]; enum media_request_state state; unsigned int updating_count; unsigned int access_count; struct list_head objects; unsigned int num_incomplete_objects; wait_queue_head_t poll_wait; spinlock_t lock; }; #ifdef CONFIG_MEDIA_CONTROLLER /** * media_request_lock_for_access - Lock the request to access its objects * * @req: The media request * * Use before accessing a completed request. A reference to the request must * be held during the access. This usually takes place automatically through * a file handle. Use @media_request_unlock_for_access when done. */ static inline int __must_check media_request_lock_for_access(struct media_request *req) { unsigned long flags; int ret = -EBUSY; spin_lock_irqsave(&req->lock, flags); if (req->state == MEDIA_REQUEST_STATE_COMPLETE) { req->access_count++; ret = 0; } spin_unlock_irqrestore(&req->lock, flags); return ret; } /** * media_request_unlock_for_access - Unlock a request previously locked for * access * * @req: The media request * * Unlock a request that has previously been locked using * @media_request_lock_for_access. */ static inline void media_request_unlock_for_access(struct media_request *req) { unsigned long flags; spin_lock_irqsave(&req->lock, flags); if (!WARN_ON(!req->access_count)) req->access_count--; spin_unlock_irqrestore(&req->lock, flags); } /** * media_request_lock_for_update - Lock the request for updating its objects * * @req: The media request * * Use before updating a request, i.e. adding, modifying or removing a request * object in it. A reference to the request must be held during the update. This * usually takes place automatically through a file handle. Use * @media_request_unlock_for_update when done. */ static inline int __must_check media_request_lock_for_update(struct media_request *req) { unsigned long flags; int ret = 0; spin_lock_irqsave(&req->lock, flags); if (req->state == MEDIA_REQUEST_STATE_IDLE || req->state == MEDIA_REQUEST_STATE_UPDATING) { req->state = MEDIA_REQUEST_STATE_UPDATING; req->updating_count++; } else { ret = -EBUSY; } spin_unlock_irqrestore(&req->lock, flags); return ret; } /** * media_request_unlock_for_update - Unlock a request previously locked for * update * * @req: The media request * * Unlock a request that has previously been locked using * @media_request_lock_for_update. */ static inline void media_request_unlock_for_update(struct media_request *req) { unsigned long flags; spin_lock_irqsave(&req->lock, flags); WARN_ON(req->updating_count <= 0); if (!--req->updating_count) req->state = MEDIA_REQUEST_STATE_IDLE; spin_unlock_irqrestore(&req->lock, flags); } /** * media_request_get - Get the media request * * @req: The media request * * Get the media request. */ static inline void media_request_get(struct media_request *req) { kref_get(&req->kref); } /** * media_request_put - Put the media request * * @req: The media request * * Put the media request. The media request will be released * when the refcount reaches 0. */ void media_request_put(struct media_request *req); /** * media_request_get_by_fd - Get a media request by fd * * @mdev: Media device this request belongs to * @request_fd: The file descriptor of the request * * Get the request represented by @request_fd that is owned * by the media device. * * Return a -EBADR error pointer if requests are not supported * by this driver. Return -EINVAL if the request was not found. * Return the pointer to the request if found: the caller will * have to call @media_request_put when it finished using the * request. */ struct media_request * media_request_get_by_fd(struct media_device *mdev, int request_fd); /** * media_request_alloc - Allocate the media request * * @mdev: Media device this request belongs to * @alloc_fd: Store the request's file descriptor in this int * * Allocated the media request and put the fd in @alloc_fd. */ int media_request_alloc(struct media_device *mdev, int *alloc_fd); #else static inline void media_request_get(struct media_request *req) { } static inline void media_request_put(struct media_request *req) { } static inline struct media_request * media_request_get_by_fd(struct media_device *mdev, int request_fd) { return ERR_PTR(-EBADR); } #endif /** * struct media_request_object_ops - Media request object operations * @prepare: Validate and prepare the request object, optional. * @unprepare: Unprepare the request object, optional. * @queue: Queue the request object, optional. * @unbind: Unbind the request object, optional. * @release: Release the request object, required. */ struct media_request_object_ops { int (*prepare)(struct media_request_object *object); void (*unprepare)(struct media_request_object *object); void (*queue)(struct media_request_object *object); void (*unbind)(struct media_request_object *object); void (*release)(struct media_request_object *object); }; /** * struct media_request_object - An opaque object that belongs to a media * request * * @ops: object's operations * @priv: object's priv pointer * @req: the request this object belongs to (can be NULL) * @list: List entry of the object for @struct media_request * @kref: Reference count of the object, acquire before releasing req->lock * @completed: If true, then this object was completed. * * An object related to the request. This struct is always embedded in * another struct that contains the actual data for this request object. */ struct media_request_object { const struct media_request_object_ops *ops; void *priv; struct media_request *req; struct list_head list; struct kref kref; bool completed; }; #ifdef CONFIG_MEDIA_CONTROLLER /** * media_request_object_get - Get a media request object * * @obj: The object * * Get a media request object. */ static inline void media_request_object_get(struct media_request_object *obj) { kref_get(&obj->kref); } /** * media_request_object_put - Put a media request object * * @obj: The object * * Put a media request object. Once all references are gone, the * object's memory is released. */ void media_request_object_put(struct media_request_object *obj); /** * media_request_object_find - Find an object in a request * * @req: The media request * @ops: Find an object with this ops value * @priv: Find an object with this priv value * * Both @ops and @priv must be non-NULL. * * Returns the object pointer or NULL if not found. The caller must * call media_request_object_put() once it finished using the object. * * Since this function needs to walk the list of objects it takes * the @req->lock spin lock to make this safe. */ struct media_request_object * media_request_object_find(struct media_request *req, const struct media_request_object_ops *ops, void *priv); /** * media_request_object_init - Initialise a media request object * * @obj: The object * * Initialise a media request object. The object will be released using the * release callback of the ops once it has no references (this function * initialises references to one). */ void media_request_object_init(struct media_request_object *obj); /** * media_request_object_bind - Bind a media request object to a request * * @req: The media request * @ops: The object ops for this object * @priv: A driver-specific priv pointer associated with this object * @is_buffer: Set to true if the object a buffer object. * @obj: The object * * Bind this object to the request and set the ops and priv values of * the object so it can be found later with media_request_object_find(). * * Every bound object must be unbound or completed by the kernel at some * point in time, otherwise the request will never complete. When the * request is released all completed objects will be unbound by the * request core code. * * Buffer objects will be added to the end of the request's object * list, non-buffer objects will be added to the front of the list. * This ensures that all buffer objects are at the end of the list * and that all non-buffer objects that they depend on are processed * first. */ int media_request_object_bind(struct media_request *req, const struct media_request_object_ops *ops, void *priv, bool is_buffer, struct media_request_object *obj); /** * media_request_object_unbind - Unbind a media request object * * @obj: The object * * Unbind the media request object from the request. */ void media_request_object_unbind(struct media_request_object *obj); /** * media_request_object_complete - Mark the media request object as complete * * @obj: The object * * Mark the media request object as complete. Only bound objects can * be completed. */ void media_request_object_complete(struct media_request_object *obj); #else static inline int __must_check media_request_lock_for_access(struct media_request *req) { return -EINVAL; } static inline void media_request_unlock_for_access(struct media_request *req) { } static inline int __must_check media_request_lock_for_update(struct media_request *req) { return -EINVAL; } static inline void media_request_unlock_for_update(struct media_request *req) { } static inline void media_request_object_get(struct media_request_object *obj) { } static inline void media_request_object_put(struct media_request_object *obj) { } static inline struct media_request_object * media_request_object_find(struct media_request *req, const struct media_request_object_ops *ops, void *priv) { return NULL; } static inline void media_request_object_init(struct media_request_object *obj) { obj->ops = NULL; obj->req = NULL; } static inline int media_request_object_bind(struct media_request *req, const struct media_request_object_ops *ops, void *priv, bool is_buffer, struct media_request_object *obj) { return 0; } static inline void media_request_object_unbind(struct media_request_object *obj) { } static inline void media_request_object_complete(struct media_request_object *obj) { } #endif #endif |
3247 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | /* SPDX-License-Identifier: GPL-2.0 */ #undef TRACE_SYSTEM #define TRACE_SYSTEM printk #if !defined(_TRACE_PRINTK_H) || defined(TRACE_HEADER_MULTI_READ) #define _TRACE_PRINTK_H #include <linux/tracepoint.h> TRACE_EVENT(console, TP_PROTO(const char *text, size_t len), TP_ARGS(text, len), TP_STRUCT__entry( __dynamic_array(char, msg, len + 1) ), TP_fast_assign( /* * Each trace entry is printed in a new line. * If the msg finishes with '\n', cut it off * to avoid blank lines in the trace. */ if ((len > 0) && (text[len-1] == '\n')) len -= 1; memcpy(__get_str(msg), text, len); __get_str(msg)[len] = 0; ), TP_printk("%s", __get_str(msg)) ); #endif /* _TRACE_PRINTK_H */ /* This part must be outside protection */ #include <trace/define_trace.h> |
16 28 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _LINUX_NLS_H #define _LINUX_NLS_H #include <linux/init.h> /* Unicode has changed over the years. Unicode code points no longer * fit into 16 bits; as of Unicode 5 valid code points range from 0 * to 0x10ffff (17 planes, where each plane holds 65536 code points). * * The original decision to represent Unicode characters as 16-bit * wchar_t values is now outdated. But plane 0 still includes the * most commonly used characters, so we will retain it. The newer * 32-bit unicode_t type can be used when it is necessary to * represent the full Unicode character set. */ /* Plane-0 Unicode character */ typedef u16 wchar_t; #define MAX_WCHAR_T 0xffff /* Arbitrary Unicode character */ typedef u32 unicode_t; struct nls_table { const char *charset; const char *alias; int (*uni2char) (wchar_t uni, unsigned char *out, int boundlen); int (*char2uni) (const unsigned char *rawstring, int boundlen, wchar_t *uni); const unsigned char *charset2lower; const unsigned char *charset2upper; struct module *owner; struct nls_table *next; }; /* this value hold the maximum octet of charset */ #define NLS_MAX_CHARSET_SIZE 6 /* for UTF-8 */ /* Byte order for UTF-16 strings */ enum utf16_endian { UTF16_HOST_ENDIAN, UTF16_LITTLE_ENDIAN, UTF16_BIG_ENDIAN }; /* nls_base.c */ extern int __register_nls(struct nls_table *, struct module *); extern int unregister_nls(struct nls_table *); extern struct nls_table *load_nls(const char *charset); extern void unload_nls(struct nls_table *); extern struct nls_table *load_nls_default(void); #define register_nls(nls) __register_nls((nls), THIS_MODULE) extern int utf8_to_utf32(const u8 *s, int len, unicode_t *pu); extern int utf32_to_utf8(unicode_t u, u8 *s, int maxlen); extern int utf8s_to_utf16s(const u8 *s, int len, enum utf16_endian endian, wchar_t *pwcs, int maxlen); extern int utf16s_to_utf8s(const wchar_t *pwcs, int len, enum utf16_endian endian, u8 *s, int maxlen); static inline unsigned char nls_tolower(struct nls_table *t, unsigned char c) { unsigned char nc = t->charset2lower[c]; return nc ? nc : c; } static inline unsigned char nls_toupper(struct nls_table *t, unsigned char c) { unsigned char nc = t->charset2upper[c]; return nc ? nc : c; } static inline int nls_strnicmp(struct nls_table *t, const unsigned char *s1, const unsigned char *s2, int len) { while (len--) { if (nls_tolower(t, *s1++) != nls_tolower(t, *s2++)) return 1; } return 0; } /* * nls_nullsize - return length of null character for codepage * @codepage - codepage for which to return length of NULL terminator * * Since we can't guarantee that the null terminator will be a particular * length, we have to check against the codepage. If there's a problem * determining it, assume a single-byte NULL terminator. */ static inline int nls_nullsize(const struct nls_table *codepage) { int charlen; char tmp[NLS_MAX_CHARSET_SIZE]; charlen = codepage->uni2char(0, tmp, NLS_MAX_CHARSET_SIZE); return charlen > 0 ? charlen : 1; } #define MODULE_ALIAS_NLS(name) MODULE_ALIAS("nls_" __stringify(name)) #endif /* _LINUX_NLS_H */ |
1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 | // SPDX-License-Identifier: GPL-2.0-or-later /* * IPVS: Weighted Round-Robin Scheduling module * * Authors: Wensong Zhang <wensong@linuxvirtualserver.org> * * Changes: * Wensong Zhang : changed the ip_vs_wrr_schedule to return dest * Wensong Zhang : changed some comestics things for debugging * Wensong Zhang : changed for the d-linked destination list * Wensong Zhang : added the ip_vs_wrr_update_svc * Julian Anastasov : fixed the bug of returning destination * with weight 0 when all weights are zero */ #define KMSG_COMPONENT "IPVS" #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt #include <linux/module.h> #include <linux/kernel.h> #include <linux/slab.h> #include <linux/net.h> #include <linux/gcd.h> #include <net/ip_vs.h> /* The WRR algorithm depends on some caclulations: * - mw: maximum weight * - di: weight step, greatest common divisor from all weights * - cw: current required weight * As result, all weights are in the [di..mw] range with a step=di. * * First, we start with cw = mw and select dests with weight >= cw. * Then cw is reduced with di and all dests are checked again. * Last pass should be with cw = di. We have mw/di passes in total: * * pass 1: cw = max weight * pass 2: cw = max weight - di * pass 3: cw = max weight - 2 * di * ... * last pass: cw = di * * Weights are supposed to be >= di but we run in parallel with * weight changes, it is possible some dest weight to be reduced * below di, bad if it is the only available dest. * * So, we modify how mw is calculated, now it is reduced with (di - 1), * so that last cw is 1 to catch such dests with weight below di: * pass 1: cw = max weight - (di - 1) * pass 2: cw = max weight - di - (di - 1) * pass 3: cw = max weight - 2 * di - (di - 1) * ... * last pass: cw = 1 * */ /* * current destination pointer for weighted round-robin scheduling */ struct ip_vs_wrr_mark { struct ip_vs_dest *cl; /* current dest or head */ int cw; /* current weight */ int mw; /* maximum weight */ int di; /* decreasing interval */ struct rcu_head rcu_head; }; static int ip_vs_wrr_gcd_weight(struct ip_vs_service *svc) { struct ip_vs_dest *dest; int weight; int g = 0; list_for_each_entry(dest, &svc->destinations, n_list) { weight = atomic_read(&dest->weight); if (weight > 0) { if (g > 0) g = gcd(weight, g); else g = weight; } } return g ? g : 1; } /* * Get the maximum weight of the service destinations. */ static int ip_vs_wrr_max_weight(struct ip_vs_service *svc) { struct ip_vs_dest *dest; int new_weight, weight = 0; list_for_each_entry(dest, &svc->destinations, n_list) { new_weight = atomic_read(&dest->weight); if (new_weight > weight) weight = new_weight; } return weight; } static int ip_vs_wrr_init_svc(struct ip_vs_service *svc) { struct ip_vs_wrr_mark *mark; /* * Allocate the mark variable for WRR scheduling */ mark = kmalloc(sizeof(struct ip_vs_wrr_mark), GFP_KERNEL); if (mark == NULL) return -ENOMEM; mark->cl = list_entry(&svc->destinations, struct ip_vs_dest, n_list); mark->di = ip_vs_wrr_gcd_weight(svc); mark->mw = ip_vs_wrr_max_weight(svc) - (mark->di - 1); mark->cw = mark->mw; svc->sched_data = mark; return 0; } static void ip_vs_wrr_done_svc(struct ip_vs_service *svc) { struct ip_vs_wrr_mark *mark = svc->sched_data; /* * Release the mark variable */ kfree_rcu(mark, rcu_head); } static int ip_vs_wrr_dest_changed(struct ip_vs_service *svc, struct ip_vs_dest *dest) { struct ip_vs_wrr_mark *mark = svc->sched_data; spin_lock_bh(&svc->sched_lock); mark->cl = list_entry(&svc->destinations, struct ip_vs_dest, n_list); mark->di = ip_vs_wrr_gcd_weight(svc); mark->mw = ip_vs_wrr_max_weight(svc) - (mark->di - 1); if (mark->cw > mark->mw || !mark->cw) mark->cw = mark->mw; else if (mark->di > 1) mark->cw = (mark->cw / mark->di) * mark->di + 1; spin_unlock_bh(&svc->sched_lock); return 0; } /* * Weighted Round-Robin Scheduling */ static struct ip_vs_dest * ip_vs_wrr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb, struct ip_vs_iphdr *iph) { struct ip_vs_dest *dest, *last, *stop = NULL; struct ip_vs_wrr_mark *mark = svc->sched_data; bool last_pass = false, restarted = false; IP_VS_DBG(6, "%s(): Scheduling...\n", __func__); spin_lock_bh(&svc->sched_lock); dest = mark->cl; /* No available dests? */ if (mark->mw == 0) goto err_noavail; last = dest; /* Stop only after all dests were checked for weight >= 1 (last pass) */ while (1) { list_for_each_entry_continue_rcu(dest, &svc->destinations, n_list) { if (!(dest->flags & IP_VS_DEST_F_OVERLOAD) && atomic_read(&dest->weight) >= mark->cw) goto found; if (dest == stop) goto err_over; } mark->cw -= mark->di; if (mark->cw <= 0) { mark->cw = mark->mw; /* Stop if we tried last pass from first dest: * 1. last_pass: we started checks when cw > di but * then all dests were checked for w >= 1 * 2. last was head: the first and only traversal * was for weight >= 1, for all dests. */ if (last_pass || &last->n_list == &svc->destinations) goto err_over; restarted = true; } last_pass = mark->cw <= mark->di; if (last_pass && restarted && &last->n_list != &svc->destinations) { /* First traversal was for w >= 1 but only * for dests after 'last', now do the same * for all dests up to 'last'. */ stop = last; } } found: IP_VS_DBG_BUF(6, "WRR: server %s:%u " "activeconns %d refcnt %d weight %d\n", IP_VS_DBG_ADDR(dest->af, &dest->addr), ntohs(dest->port), atomic_read(&dest->activeconns), refcount_read(&dest->refcnt), atomic_read(&dest->weight)); mark->cl = dest; out: spin_unlock_bh(&svc->sched_lock); return dest; err_noavail: mark->cl = dest; dest = NULL; ip_vs_scheduler_err(svc, "no destination available"); goto out; err_over: mark->cl = dest; dest = NULL; ip_vs_scheduler_err(svc, "no destination available: " "all destinations are overloaded"); goto out; } static struct ip_vs_scheduler ip_vs_wrr_scheduler = { .name = "wrr", .refcnt = ATOMIC_INIT(0), .module = THIS_MODULE, .n_list = LIST_HEAD_INIT(ip_vs_wrr_scheduler.n_list), .init_service = ip_vs_wrr_init_svc, .done_service = ip_vs_wrr_done_svc, .add_dest = ip_vs_wrr_dest_changed, .del_dest = ip_vs_wrr_dest_changed, .upd_dest = ip_vs_wrr_dest_changed, .schedule = ip_vs_wrr_schedule, }; static int __init ip_vs_wrr_init(void) { return register_ip_vs_scheduler(&ip_vs_wrr_scheduler) ; } static void __exit ip_vs_wrr_cleanup(void) { unregister_ip_vs_scheduler(&ip_vs_wrr_scheduler); synchronize_rcu(); } module_init(ip_vs_wrr_init); module_exit(ip_vs_wrr_cleanup); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("ipvs weighted round-robin scheduler"); |
758 683 70 739 11 2 2 1 1 1 2 29 105 87 16 80 74 9 2 3 3 6 4 5 125 87 38 77 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Generic parts * Linux ethernet bridge * * Authors: * Lennert Buytenhek <buytenh@gnu.org> */ #include <linux/module.h> #include <linux/kernel.h> #include <linux/netdevice.h> #include <linux/etherdevice.h> #include <linux/init.h> #include <linux/llc.h> #include <net/llc.h> #include <net/stp.h> #include <net/switchdev.h> #include "br_private.h" /* * Handle changes in state of network devices enslaved to a bridge. * * Note: don't care about up/down if bridge itself is down, because * port state is checked when bridge is brought up. */ static int br_device_event(struct notifier_block *unused, unsigned long event, void *ptr) { struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr); struct netdev_notifier_pre_changeaddr_info *prechaddr_info; struct net_device *dev = netdev_notifier_info_to_dev(ptr); struct net_bridge_port *p; struct net_bridge *br; bool notified = false; bool changed_addr; int err; if (netif_is_bridge_master(dev)) { err = br_vlan_bridge_event(dev, event, ptr); if (err) return notifier_from_errno(err); if (event == NETDEV_REGISTER) { /* register of bridge completed, add sysfs entries */ err = br_sysfs_addbr(dev); if (err) return notifier_from_errno(err); return NOTIFY_DONE; } } /* not a port of a bridge */ p = br_port_get_rtnl(dev); if (!p) return NOTIFY_DONE; br = p->br; switch (event) { case NETDEV_CHANGEMTU: br_mtu_auto_adjust(br); break; case NETDEV_PRE_CHANGEADDR: if (br->dev->addr_assign_type == NET_ADDR_SET) break; prechaddr_info = ptr; err = dev_pre_changeaddr_notify(br->dev, prechaddr_info->dev_addr, extack); if (err) return notifier_from_errno(err); break; case NETDEV_CHANGEADDR: spin_lock_bh(&br->lock); br_fdb_changeaddr(p, dev->dev_addr); changed_addr = br_stp_recalculate_bridge_id(br); spin_unlock_bh(&br->lock); if (changed_addr) call_netdevice_notifiers(NETDEV_CHANGEADDR, br->dev); break; case NETDEV_CHANGE: br_port_carrier_check(p, ¬ified); break; case NETDEV_FEAT_CHANGE: netdev_update_features(br->dev); break; case NETDEV_DOWN: spin_lock_bh(&br->lock); if (br->dev->flags & IFF_UP) { br_stp_disable_port(p); notified = true; } spin_unlock_bh(&br->lock); break; case NETDEV_UP: if (netif_running(br->dev) && netif_oper_up(dev)) { spin_lock_bh(&br->lock); br_stp_enable_port(p); notified = true; spin_unlock_bh(&br->lock); } break; case NETDEV_UNREGISTER: br_del_if(br, dev); break; case NETDEV_CHANGENAME: err = br_sysfs_renameif(p); if (err) return notifier_from_errno(err); break; case NETDEV_PRE_TYPE_CHANGE: /* Forbid underlying device to change its type. */ return NOTIFY_BAD; case NETDEV_RESEND_IGMP: /* Propagate to master device */ call_netdevice_notifiers(event, br->dev); break; } if (event != NETDEV_UNREGISTER) br_vlan_port_event(p, event); /* Events that may cause spanning tree to refresh */ if (!notified && (event == NETDEV_CHANGEADDR || event == NETDEV_UP || event == NETDEV_CHANGE || event == NETDEV_DOWN)) br_ifinfo_notify(RTM_NEWLINK, NULL, p); return NOTIFY_DONE; } static struct notifier_block br_device_notifier = { .notifier_call = br_device_event }; /* called with RTNL or RCU */ static int br_switchdev_event(struct notifier_block *unused, unsigned long event, void *ptr) { struct net_device *dev = switchdev_notifier_info_to_dev(ptr); struct net_bridge_port *p; struct net_bridge *br; struct switchdev_notifier_fdb_info *fdb_info; int err = NOTIFY_DONE; p = br_port_get_rtnl_rcu(dev); if (!p) goto out; br = p->br; switch (event) { case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_add(br, p, fdb_info->addr, fdb_info->vid, fdb_info->locked, false); if (err) { err = notifier_from_errno(err); break; } br_fdb_offloaded_set(br, p, fdb_info->addr, fdb_info->vid, fdb_info->offloaded); break; case SWITCHDEV_FDB_DEL_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_del(br, p, fdb_info->addr, fdb_info->vid, false); if (err) err = notifier_from_errno(err); break; case SWITCHDEV_FDB_OFFLOADED: fdb_info = ptr; br_fdb_offloaded_set(br, p, fdb_info->addr, fdb_info->vid, fdb_info->offloaded); break; case SWITCHDEV_FDB_FLUSH_TO_BRIDGE: fdb_info = ptr; /* Don't delete static entries */ br_fdb_delete_by_port(br, p, fdb_info->vid, 0); break; } out: return err; } static struct notifier_block br_switchdev_notifier = { .notifier_call = br_switchdev_event, }; /* called under rtnl_mutex */ static int br_switchdev_blocking_event(struct notifier_block *nb, unsigned long event, void *ptr) { struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr); struct net_device *dev = switchdev_notifier_info_to_dev(ptr); struct switchdev_notifier_brport_info *brport_info; const struct switchdev_brport *b; struct net_bridge_port *p; int err = NOTIFY_DONE; p = br_port_get_rtnl(dev); if (!p) goto out; switch (event) { case SWITCHDEV_BRPORT_OFFLOADED: brport_info = ptr; b = &brport_info->brport; err = br_switchdev_port_offload(p, b->dev, b->ctx, b->atomic_nb, b->blocking_nb, b->tx_fwd_offload, extack); err = notifier_from_errno(err); break; case SWITCHDEV_BRPORT_UNOFFLOADED: brport_info = ptr; b = &brport_info->brport; br_switchdev_port_unoffload(p, b->ctx, b->atomic_nb, b->blocking_nb); break; case SWITCHDEV_BRPORT_REPLAY: brport_info = ptr; b = &brport_info->brport; err = br_switchdev_port_replay(p, b->dev, b->ctx, b->atomic_nb, b->blocking_nb, extack); err = notifier_from_errno(err); break; } out: return err; } static struct notifier_block br_switchdev_blocking_notifier = { .notifier_call = br_switchdev_blocking_event, }; /* br_boolopt_toggle - change user-controlled boolean option * * @br: bridge device * @opt: id of the option to change * @on: new option value * @extack: extack for error messages * * Changes the value of the respective boolean option to @on taking care of * any internal option value mapping and configuration. */ int br_boolopt_toggle(struct net_bridge *br, enum br_boolopt_id opt, bool on, struct netlink_ext_ack *extack) { int err = 0; switch (opt) { case BR_BOOLOPT_NO_LL_LEARN: br_opt_toggle(br, BROPT_NO_LL_LEARN, on); break; case BR_BOOLOPT_MCAST_VLAN_SNOOPING: err = br_multicast_toggle_vlan_snooping(br, on, extack); break; case BR_BOOLOPT_MST_ENABLE: err = br_mst_set_enabled(br, on, extack); break; default: /* shouldn't be called with unsupported options */ WARN_ON(1); break; } return err; } int br_boolopt_get(const struct net_bridge *br, enum br_boolopt_id opt) { switch (opt) { case BR_BOOLOPT_NO_LL_LEARN: return br_opt_get(br, BROPT_NO_LL_LEARN); case BR_BOOLOPT_MCAST_VLAN_SNOOPING: return br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED); case BR_BOOLOPT_MST_ENABLE: return br_opt_get(br, BROPT_MST_ENABLED); default: /* shouldn't be called with unsupported options */ WARN_ON(1); break; } return 0; } int br_boolopt_multi_toggle(struct net_bridge *br, struct br_boolopt_multi *bm, struct netlink_ext_ack *extack) { unsigned long bitmap = bm->optmask; int err = 0; int opt_id; for_each_set_bit(opt_id, &bitmap, BR_BOOLOPT_MAX) { bool on = !!(bm->optval & BIT(opt_id)); err = br_boolopt_toggle(br, opt_id, on, extack); if (err) { br_debug(br, "boolopt multi-toggle error: option: %d current: %d new: %d error: %d\n", opt_id, br_boolopt_get(br, opt_id), on, err); break; } } return err; } void br_boolopt_multi_get(const struct net_bridge *br, struct br_boolopt_multi *bm) { u32 optval = 0; int opt_id; for (opt_id = 0; opt_id < BR_BOOLOPT_MAX; opt_id++) optval |= (br_boolopt_get(br, opt_id) << opt_id); bm->optval = optval; bm->optmask = GENMASK((BR_BOOLOPT_MAX - 1), 0); } /* private bridge options, controlled by the kernel */ void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on) { bool cur = !!br_opt_get(br, opt); br_debug(br, "toggle option: %d state: %d -> %d\n", opt, cur, on); if (cur == on) return; if (on) set_bit(opt, &br->options); else clear_bit(opt, &br->options); } static void __net_exit br_net_exit_batch_rtnl(struct list_head *net_list, struct list_head *dev_to_kill) { struct net_device *dev; struct net *net; ASSERT_RTNL(); list_for_each_entry(net, net_list, exit_list) for_each_netdev(net, dev) if (netif_is_bridge_master(dev)) br_dev_delete(dev, dev_to_kill); } static struct pernet_operations br_net_ops = { .exit_batch_rtnl = br_net_exit_batch_rtnl, }; static const struct stp_proto br_stp_proto = { .rcv = br_stp_rcv, }; static int __init br_init(void) { int err; BUILD_BUG_ON(sizeof(struct br_input_skb_cb) > sizeof_field(struct sk_buff, cb)); err = stp_proto_register(&br_stp_proto); if (err < 0) { pr_err("bridge: can't register sap for STP\n"); return err; } err = br_fdb_init(); if (err) goto err_out; err = register_pernet_subsys(&br_net_ops); if (err) goto err_out1; err = br_nf_core_init(); if (err) goto err_out2; err = register_netdevice_notifier(&br_device_notifier); if (err) goto err_out3; err = register_switchdev_notifier(&br_switchdev_notifier); if (err) goto err_out4; err = register_switchdev_blocking_notifier(&br_switchdev_blocking_notifier); if (err) goto err_out5; err = br_netlink_init(); if (err) goto err_out6; brioctl_set(br_ioctl_stub); #if IS_ENABLED(CONFIG_ATM_LANE) br_fdb_test_addr_hook = br_fdb_test_addr; #endif #if IS_MODULE(CONFIG_BRIDGE_NETFILTER) pr_info("bridge: filtering via arp/ip/ip6tables is no longer available " "by default. Update your scripts to load br_netfilter if you " "need this.\n"); #endif return 0; err_out6: unregister_switchdev_blocking_notifier(&br_switchdev_blocking_notifier); err_out5: unregister_switchdev_notifier(&br_switchdev_notifier); err_out4: unregister_netdevice_notifier(&br_device_notifier); err_out3: br_nf_core_fini(); err_out2: unregister_pernet_subsys(&br_net_ops); err_out1: br_fdb_fini(); err_out: stp_proto_unregister(&br_stp_proto); return err; } static void __exit br_deinit(void) { stp_proto_unregister(&br_stp_proto); br_netlink_fini(); unregister_switchdev_blocking_notifier(&br_switchdev_blocking_notifier); unregister_switchdev_notifier(&br_switchdev_notifier); unregister_netdevice_notifier(&br_device_notifier); brioctl_set(NULL); unregister_pernet_subsys(&br_net_ops); rcu_barrier(); /* Wait for completion of call_rcu()'s */ br_nf_core_fini(); #if IS_ENABLED(CONFIG_ATM_LANE) br_fdb_test_addr_hook = NULL; #endif br_fdb_fini(); } module_init(br_init) module_exit(br_deinit) MODULE_LICENSE("GPL"); MODULE_VERSION(BR_VERSION); MODULE_ALIAS_RTNL_LINK("bridge"); MODULE_DESCRIPTION("Ethernet bridge driver"); |
8 40 196 189 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 | /* SPDX-License-Identifier: GPL-2.0-or-later */ /* SCTP kernel Implementation * (C) Copyright IBM Corp. 2001, 2004 * Copyright (C) 1999-2001 Cisco, Motorola * * This file is part of the SCTP kernel implementation * * These are the definitions needed for the command object. * * Please send any bug reports or fixes you make to the * email address(es): * lksctp developers <linux-sctp@vger.kernel.org> * * Written or modified by: * La Monte H.P. Yarroll <piggy@acm.org> * Karl Knutson <karl@athena.chicago.il.us> * Ardelle Fan <ardelle.fan@intel.com> * Sridhar Samudrala <sri@us.ibm.com> */ #ifndef __net_sctp_command_h__ #define __net_sctp_command_h__ #include <net/sctp/constants.h> #include <net/sctp/structs.h> enum sctp_verb { SCTP_CMD_NOP = 0, /* Do nothing. */ SCTP_CMD_NEW_ASOC, /* Register a new association. */ SCTP_CMD_DELETE_TCB, /* Delete the current association. */ SCTP_CMD_NEW_STATE, /* Enter a new state. */ SCTP_CMD_REPORT_TSN, /* Record the arrival of a TSN. */ SCTP_CMD_GEN_SACK, /* Send a Selective ACK (maybe). */ SCTP_CMD_PROCESS_SACK, /* Process an inbound SACK. */ SCTP_CMD_GEN_INIT_ACK, /* Generate an INIT ACK chunk. */ SCTP_CMD_PEER_INIT, /* Process a INIT from the peer. */ SCTP_CMD_GEN_COOKIE_ECHO, /* Generate a COOKIE ECHO chunk. */ SCTP_CMD_CHUNK_ULP, /* Send a chunk to the sockets layer. */ SCTP_CMD_EVENT_ULP, /* Send a notification to the sockets layer. */ SCTP_CMD_REPLY, /* Send a chunk to our peer. */ SCTP_CMD_SEND_PKT, /* Send a full packet to our peer. */ SCTP_CMD_RETRAN, /* Mark a transport for retransmission. */ SCTP_CMD_ECN_CE, /* Do delayed CE processing. */ SCTP_CMD_ECN_ECNE, /* Do delayed ECNE processing. */ SCTP_CMD_ECN_CWR, /* Do delayed CWR processing. */ SCTP_CMD_TIMER_START, /* Start a timer. */ SCTP_CMD_TIMER_START_ONCE, /* Start a timer once */ SCTP_CMD_TIMER_RESTART, /* Restart a timer. */ SCTP_CMD_TIMER_STOP, /* Stop a timer. */ SCTP_CMD_INIT_CHOOSE_TRANSPORT, /* Choose transport for an INIT. */ SCTP_CMD_INIT_COUNTER_RESET, /* Reset init counter. */ SCTP_CMD_INIT_COUNTER_INC, /* Increment init counter. */ SCTP_CMD_INIT_RESTART, /* High level, do init timer work. */ SCTP_CMD_COOKIEECHO_RESTART, /* High level, do cookie-echo timer work. */ SCTP_CMD_INIT_FAILED, /* High level, do init failure work. */ SCTP_CMD_REPORT_DUP, /* Report a duplicate TSN. */ SCTP_CMD_STRIKE, /* Mark a strike against a transport. */ SCTP_CMD_HB_TIMERS_START, /* Start the heartbeat timers. */ SCTP_CMD_HB_TIMER_UPDATE, /* Update a heartbeat timers. */ SCTP_CMD_HB_TIMERS_STOP, /* Stop the heartbeat timers. */ SCTP_CMD_PROBE_TIMER_UPDATE, /* Update a probe timer. */ SCTP_CMD_TRANSPORT_HB_SENT, /* Reset the status of a transport. */ SCTP_CMD_TRANSPORT_IDLE, /* Do manipulations on idle transport */ SCTP_CMD_TRANSPORT_ON, /* Mark the transport as active. */ SCTP_CMD_REPORT_ERROR, /* Pass this error back out of the sm. */ SCTP_CMD_REPORT_BAD_TAG, /* Verification tags didn't match. */ SCTP_CMD_PROCESS_CTSN, /* Sideeffect from shutdown. */ SCTP_CMD_ASSOC_FAILED, /* Handle association failure. */ SCTP_CMD_DISCARD_PACKET, /* Discard the whole packet. */ SCTP_CMD_GEN_SHUTDOWN, /* Generate a SHUTDOWN chunk. */ SCTP_CMD_PURGE_OUTQUEUE, /* Purge all data waiting to be sent. */ SCTP_CMD_SETUP_T2, /* Hi-level, setup T2-shutdown parms. */ SCTP_CMD_RTO_PENDING, /* Set transport's rto_pending. */ SCTP_CMD_PART_DELIVER, /* Partial data delivery considerations. */ SCTP_CMD_RENEGE, /* Renege data on an association. */ SCTP_CMD_SETUP_T4, /* ADDIP, setup T4 RTO timer parms. */ SCTP_CMD_PROCESS_OPERR, /* Process an ERROR chunk. */ SCTP_CMD_REPORT_FWDTSN, /* Report new cumulative TSN Ack. */ SCTP_CMD_PROCESS_FWDTSN, /* Skips were reported, so process further. */ SCTP_CMD_CLEAR_INIT_TAG, /* Clears association peer's inittag. */ SCTP_CMD_DEL_NON_PRIMARY, /* Removes non-primary peer transports. */ SCTP_CMD_T3_RTX_TIMERS_STOP, /* Stops T3-rtx pending timers */ SCTP_CMD_FORCE_PRIM_RETRAN, /* Forces retrans. over primary path. */ SCTP_CMD_SET_SK_ERR, /* Set sk_err */ SCTP_CMD_ASSOC_CHANGE, /* generate and send assoc_change event */ SCTP_CMD_ADAPTATION_IND, /* generate and send adaptation event */ SCTP_CMD_PEER_NO_AUTH, /* generate and send authentication event */ SCTP_CMD_ASSOC_SHKEY, /* generate the association shared keys */ SCTP_CMD_T1_RETRAN, /* Mark for retransmission after T1 timeout */ SCTP_CMD_UPDATE_INITTAG, /* Update peer inittag */ SCTP_CMD_SEND_MSG, /* Send the whole use message */ SCTP_CMD_PURGE_ASCONF_QUEUE, /* Purge all asconf queues.*/ SCTP_CMD_SET_ASOC, /* Restore association context */ SCTP_CMD_LAST }; /* How many commands can you put in an struct sctp_cmd_seq? * This is a rather arbitrary number, ideally derived from a careful * analysis of the state functions, but in reality just taken from * thin air in the hopes othat we don't trigger a kernel panic. */ #define SCTP_MAX_NUM_COMMANDS 20 union sctp_arg { void *zero_all; /* Set to NULL to clear the entire union */ __s32 i32; __u32 u32; __be32 be32; __u16 u16; __u8 u8; int error; __be16 err; enum sctp_state state; enum sctp_event_timeout to; struct sctp_chunk *chunk; struct sctp_association *asoc; struct sctp_transport *transport; struct sctp_bind_addr *bp; struct sctp_init_chunk *init; struct sctp_ulpevent *ulpevent; struct sctp_packet *packet; struct sctp_sackhdr *sackh; struct sctp_datamsg *msg; }; /* We are simulating ML type constructors here. * * SCTP_ARG_CONSTRUCTOR(NAME, TYPE, ELT) builds a function called * SCTP_NAME() which takes an argument of type TYPE and returns an * union sctp_arg. It does this by inserting the sole argument into * the ELT union element of a local union sctp_arg. * * E.g., SCTP_ARG_CONSTRUCTOR(I32, __s32, i32) builds SCTP_I32(arg), * which takes an __s32 and returns a union sctp_arg containing the * __s32. So, after foo = SCTP_I32(arg), foo.i32 == arg. */ #define SCTP_ARG_CONSTRUCTOR(name, type, elt) \ static inline union sctp_arg \ SCTP_## name (type arg) \ { union sctp_arg retval;\ retval.zero_all = NULL;\ retval.elt = arg;\ return retval;\ } SCTP_ARG_CONSTRUCTOR(I32, __s32, i32) SCTP_ARG_CONSTRUCTOR(U32, __u32, u32) SCTP_ARG_CONSTRUCTOR(BE32, __be32, be32) SCTP_ARG_CONSTRUCTOR(U16, __u16, u16) SCTP_ARG_CONSTRUCTOR(U8, __u8, u8) SCTP_ARG_CONSTRUCTOR(ERROR, int, error) SCTP_ARG_CONSTRUCTOR(PERR, __be16, err) /* protocol error */ SCTP_ARG_CONSTRUCTOR(STATE, enum sctp_state, state) SCTP_ARG_CONSTRUCTOR(TO, enum sctp_event_timeout, to) SCTP_ARG_CONSTRUCTOR(CHUNK, struct sctp_chunk *, chunk) SCTP_ARG_CONSTRUCTOR(ASOC, struct sctp_association *, asoc) SCTP_ARG_CONSTRUCTOR(TRANSPORT, struct sctp_transport *, transport) SCTP_ARG_CONSTRUCTOR(BA, struct sctp_bind_addr *, bp) SCTP_ARG_CONSTRUCTOR(PEER_INIT, struct sctp_init_chunk *, init) SCTP_ARG_CONSTRUCTOR(ULPEVENT, struct sctp_ulpevent *, ulpevent) SCTP_ARG_CONSTRUCTOR(PACKET, struct sctp_packet *, packet) SCTP_ARG_CONSTRUCTOR(SACKH, struct sctp_sackhdr *, sackh) SCTP_ARG_CONSTRUCTOR(DATAMSG, struct sctp_datamsg *, msg) static inline union sctp_arg SCTP_FORCE(void) { return SCTP_I32(1); } static inline union sctp_arg SCTP_NOFORCE(void) { return SCTP_I32(0); } static inline union sctp_arg SCTP_NULL(void) { union sctp_arg retval; retval.zero_all = NULL; return retval; } struct sctp_cmd { union sctp_arg obj; enum sctp_verb verb; }; struct sctp_cmd_seq { struct sctp_cmd cmds[SCTP_MAX_NUM_COMMANDS]; struct sctp_cmd *last_used_slot; struct sctp_cmd *next_cmd; }; /* Initialize a block of memory as a command sequence. * Return 0 if the initialization fails. */ static inline int sctp_init_cmd_seq(struct sctp_cmd_seq *seq) { /* cmds[] is filled backwards to simplify the overflow BUG() check */ seq->last_used_slot = seq->cmds + SCTP_MAX_NUM_COMMANDS; seq->next_cmd = seq->last_used_slot; return 1; /* We always succeed. */ } /* Add a command to an struct sctp_cmd_seq. * * Use the SCTP_* constructors defined by SCTP_ARG_CONSTRUCTOR() above * to wrap data which goes in the obj argument. */ static inline void sctp_add_cmd_sf(struct sctp_cmd_seq *seq, enum sctp_verb verb, union sctp_arg obj) { struct sctp_cmd *cmd = seq->last_used_slot - 1; BUG_ON(cmd < seq->cmds); cmd->verb = verb; cmd->obj = obj; seq->last_used_slot = cmd; } /* Return the next command structure in an sctp_cmd_seq. * Return NULL at the end of the sequence. */ static inline struct sctp_cmd *sctp_next_cmd(struct sctp_cmd_seq *seq) { if (seq->next_cmd <= seq->last_used_slot) return NULL; return --seq->next_cmd; } #endif /* __net_sctp_command_h__ */ |
49 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _LINUX_MBCACHE_H #define _LINUX_MBCACHE_H #include <linux/hash.h> #include <linux/list_bl.h> #include <linux/list.h> #include <linux/atomic.h> #include <linux/fs.h> struct mb_cache; /* Cache entry flags */ enum { MBE_REFERENCED_B = 0, MBE_REUSABLE_B }; struct mb_cache_entry { /* List of entries in cache - protected by cache->c_list_lock */ struct list_head e_list; /* * Hash table list - protected by hash chain bitlock. The entry is * guaranteed to be hashed while e_refcnt > 0. */ struct hlist_bl_node e_hash_list; /* * Entry refcount. Once it reaches zero, entry is unhashed and freed. * While refcount > 0, the entry is guaranteed to stay in the hash and * e.g. mb_cache_entry_try_delete() will fail. */ atomic_t e_refcnt; /* Key in hash - stable during lifetime of the entry */ u32 e_key; unsigned long e_flags; /* User provided value - stable during lifetime of the entry */ u64 e_value; }; struct mb_cache *mb_cache_create(int bucket_bits); void mb_cache_destroy(struct mb_cache *cache); int mb_cache_entry_create(struct mb_cache *cache, gfp_t mask, u32 key, u64 value, bool reusable); void __mb_cache_entry_free(struct mb_cache *cache, struct mb_cache_entry *entry); void mb_cache_entry_wait_unused(struct mb_cache_entry *entry); static inline void mb_cache_entry_put(struct mb_cache *cache, struct mb_cache_entry *entry) { unsigned int cnt = atomic_dec_return(&entry->e_refcnt); if (cnt > 0) { if (cnt <= 2) wake_up_var(&entry->e_refcnt); return; } __mb_cache_entry_free(cache, entry); } struct mb_cache_entry *mb_cache_entry_delete_or_get(struct mb_cache *cache, u32 key, u64 value); struct mb_cache_entry *mb_cache_entry_get(struct mb_cache *cache, u32 key, u64 value); struct mb_cache_entry *mb_cache_entry_find_first(struct mb_cache *cache, u32 key); struct mb_cache_entry *mb_cache_entry_find_next(struct mb_cache *cache, struct mb_cache_entry *entry); void mb_cache_entry_touch(struct mb_cache *cache, struct mb_cache_entry *entry); #endif /* _LINUX_MBCACHE_H */ |
560 569 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 | // SPDX-License-Identifier: GPL-2.0 /* * This file contains functions which manage high resolution tick * related events. * * Copyright(C) 2005-2006, Thomas Gleixner <tglx@linutronix.de> * Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar * Copyright(C) 2006-2007, Timesys Corp., Thomas Gleixner */ #include <linux/cpu.h> #include <linux/err.h> #include <linux/hrtimer.h> #include <linux/interrupt.h> #include <linux/percpu.h> #include <linux/profile.h> #include <linux/sched.h> #include "tick-internal.h" /** * tick_program_event - program the CPU local timer device for the next event */ int tick_program_event(ktime_t expires, int force) { struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev); if (unlikely(expires == KTIME_MAX)) { /* * We don't need the clock event device any more, stop it. */ clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT_STOPPED); dev->next_event = KTIME_MAX; return 0; } if (unlikely(clockevent_state_oneshot_stopped(dev))) { /* * We need the clock event again, configure it in ONESHOT mode * before using it. */ clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT); } return clockevents_program_event(dev, expires, force); } /** * tick_resume_oneshot - resume oneshot mode */ void tick_resume_oneshot(void) { struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev); clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT); clockevents_program_event(dev, ktime_get(), true); } /** * tick_setup_oneshot - setup the event device for oneshot mode (hres or nohz) */ void tick_setup_oneshot(struct clock_event_device *newdev, void (*handler)(struct clock_event_device *), ktime_t next_event) { newdev->event_handler = handler; clockevents_switch_state(newdev, CLOCK_EVT_STATE_ONESHOT); clockevents_program_event(newdev, next_event, true); } /** * tick_switch_to_oneshot - switch to oneshot mode */ int tick_switch_to_oneshot(void (*handler)(struct clock_event_device *)) { struct tick_device *td = this_cpu_ptr(&tick_cpu_device); struct clock_event_device *dev = td->evtdev; if (!dev || !(dev->features & CLOCK_EVT_FEAT_ONESHOT) || !tick_device_is_functional(dev)) { pr_info("Clockevents: could not switch to one-shot mode:"); if (!dev) { pr_cont(" no tick device\n"); } else { if (!tick_device_is_functional(dev)) pr_cont(" %s is not functional.\n", dev->name); else pr_cont(" %s does not support one-shot mode.\n", dev->name); } return -EINVAL; } td->mode = TICKDEV_MODE_ONESHOT; dev->event_handler = handler; clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT); tick_broadcast_switch_to_oneshot(); return 0; } /** * tick_oneshot_mode_active - check whether the system is in oneshot mode * * returns 1 when either nohz or highres are enabled. otherwise 0. */ int tick_oneshot_mode_active(void) { unsigned long flags; int ret; local_irq_save(flags); ret = __this_cpu_read(tick_cpu_device.mode) == TICKDEV_MODE_ONESHOT; local_irq_restore(flags); return ret; } #ifdef CONFIG_HIGH_RES_TIMERS /** * tick_init_highres - switch to high resolution mode * * Called with interrupts disabled. */ int tick_init_highres(void) { return tick_switch_to_oneshot(hrtimer_interrupt); } #endif |
47 46 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 | // SPDX-License-Identifier: GPL-2.0-or-later /* mpihelp-lshift.c - MPI helper functions * Copyright (C) 1994, 1996, 1998, 2001 Free Software Foundation, Inc. * * This file is part of GnuPG. * * Note: This code is heavily based on the GNU MP Library. * Actually it's the same code with only minor changes in the * way the data is stored; this is to support the abstraction * of an optional secure memory allocation which may be used * to avoid revealing of sensitive data due to paging etc. * The GNU MP Library itself is published under the LGPL; * however I decided to publish this code under the plain GPL. */ #include "mpi-internal.h" /* Shift U (pointed to by UP and USIZE digits long) CNT bits to the left * and store the USIZE least significant digits of the result at WP. * Return the bits shifted out from the most significant digit. * * Argument constraints: * 1. 0 < CNT < BITS_PER_MP_LIMB * 2. If the result is to be written over the input, WP must be >= UP. */ mpi_limb_t mpihelp_lshift(mpi_ptr_t wp, mpi_ptr_t up, mpi_size_t usize, unsigned int cnt) { mpi_limb_t high_limb, low_limb; unsigned sh_1, sh_2; mpi_size_t i; mpi_limb_t retval; sh_1 = cnt; wp += 1; sh_2 = BITS_PER_MPI_LIMB - sh_1; i = usize - 1; low_limb = up[i]; retval = low_limb >> sh_2; high_limb = low_limb; while (--i >= 0) { low_limb = up[i]; wp[i] = (high_limb << sh_1) | (low_limb >> sh_2); high_limb = low_limb; } wp[i] = high_limb << sh_1; return retval; } |
31 11 107 30 22 54 14 50 11 7 39 11 13 9 7 4 12 13 11 19 11 19 20 20 10 38 38 34 16 16 16 18 20 20 3 20 37 3 39 40 37 13 29 27 14 23 4 4 17 12 20 3 21 13 3 8 13 6 7 45 1 47 17 2 44 26 1 27 16 3 17 15 4 14 17 7 15 37 34 4 33 16 16 4 17 19 1 18 18 1 8 8 8 8 8 8 7 8 8 8 8 51 92 21 55 54 56 44 53 42 12 55 62 60 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 | /* * net/tipc/name_table.c: TIPC name table code * * Copyright (c) 2000-2006, 2014-2018, Ericsson AB * Copyright (c) 2004-2008, 2010-2014, Wind River Systems * Copyright (c) 2020-2021, Red Hat Inc * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the names of the copyright holders nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * Alternatively, this software may be distributed under the terms of the * GNU General Public License ("GPL") version 2 as published by the Free * Software Foundation. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ #include <net/sock.h> #include <linux/list_sort.h> #include <linux/rbtree_augmented.h> #include "core.h" #include "netlink.h" #include "name_table.h" #include "name_distr.h" #include "subscr.h" #include "bcast.h" #include "addr.h" #include "node.h" #include "group.h" /** * struct service_range - container for all bindings of a service range * @lower: service range lower bound * @upper: service range upper bound * @tree_node: member of service range RB tree * @max: largest 'upper' in this node subtree * @local_publ: list of identical publications made from this node * Used by closest_first lookup and multicast lookup algorithm * @all_publ: all publications identical to this one, whatever node and scope * Used by round-robin lookup algorithm */ struct service_range { u32 lower; u32 upper; struct rb_node tree_node; u32 max; struct list_head local_publ; struct list_head all_publ; }; /** * struct tipc_service - container for all published instances of a service type * @type: 32 bit 'type' value for service * @publ_cnt: increasing counter for publications in this service * @ranges: rb tree containing all service ranges for this service * @service_list: links to adjacent name ranges in hash chain * @subscriptions: list of subscriptions for this service type * @lock: spinlock controlling access to pertaining service ranges/publications * @rcu: RCU callback head used for deferred freeing */ struct tipc_service { u32 type; u32 publ_cnt; struct rb_root ranges; struct hlist_node service_list; struct list_head subscriptions; spinlock_t lock; /* Covers service range list */ struct rcu_head rcu; }; #define service_range_upper(sr) ((sr)->upper) RB_DECLARE_CALLBACKS_MAX(static, sr_callbacks, struct service_range, tree_node, u32, max, service_range_upper) #define service_range_entry(rbtree_node) \ (container_of(rbtree_node, struct service_range, tree_node)) #define service_range_overlap(sr, start, end) \ ((sr)->lower <= (end) && (sr)->upper >= (start)) /** * service_range_foreach_match - iterate over tipc service rbtree for each * range match * @sr: the service range pointer as a loop cursor * @sc: the pointer to tipc service which holds the service range rbtree * @start: beginning of the search range (end >= start) for matching * @end: end of the search range (end >= start) for matching */ #define service_range_foreach_match(sr, sc, start, end) \ for (sr = service_range_match_first((sc)->ranges.rb_node, \ start, \ end); \ sr; \ sr = service_range_match_next(&(sr)->tree_node, \ start, \ end)) /** * service_range_match_first - find first service range matching a range * @n: the root node of service range rbtree for searching * @start: beginning of the search range (end >= start) for matching * @end: end of the search range (end >= start) for matching * * Return: the leftmost service range node in the rbtree that overlaps the * specific range if any. Otherwise, returns NULL. */ static struct service_range *service_range_match_first(struct rb_node *n, u32 start, u32 end) { struct service_range *sr; struct rb_node *l, *r; /* Non overlaps in tree at all? */ if (!n || service_range_entry(n)->max < start) return NULL; while (n) { l = n->rb_left; if (l && service_range_entry(l)->max >= start) { /* A leftmost overlap range node must be one in the left * subtree. If not, it has lower > end, then nodes on * the right side cannot satisfy the condition either. */ n = l; continue; } /* No one in the left subtree can match, return if this node is * an overlap i.e. leftmost. */ sr = service_range_entry(n); if (service_range_overlap(sr, start, end)) return sr; /* Ok, try to lookup on the right side */ r = n->rb_right; if (sr->lower <= end && r && service_range_entry(r)->max >= start) { n = r; continue; } break; } return NULL; } /** * service_range_match_next - find next service range matching a range * @n: a node in service range rbtree from which the searching starts * @start: beginning of the search range (end >= start) for matching * @end: end of the search range (end >= start) for matching * * Return: the next service range node to the given node in the rbtree that * overlaps the specific range if any. Otherwise, returns NULL. */ static struct service_range *service_range_match_next(struct rb_node *n, u32 start, u32 end) { struct service_range *sr; struct rb_node *p, *r; while (n) { r = n->rb_right; if (r && service_range_entry(r)->max >= start) /* A next overlap range node must be one in the right * subtree. If not, it has lower > end, then any next * successor (- an ancestor) of this node cannot * satisfy the condition either. */ return service_range_match_first(r, start, end); /* No one in the right subtree can match, go up to find an * ancestor of this node which is parent of a left-hand child. */ while ((p = rb_parent(n)) && n == p->rb_right) n = p; if (!p) break; /* Return if this ancestor is an overlap */ sr = service_range_entry(p); if (service_range_overlap(sr, start, end)) return sr; /* Ok, try to lookup more from this ancestor */ if (sr->lower <= end) { n = p; continue; } break; } return NULL; } static int hash(int x) { return x & (TIPC_NAMETBL_SIZE - 1); } /** * tipc_publ_create - create a publication structure * @ua: the service range the user is binding to * @sk: the address of the socket that is bound * @key: publication key */ static struct publication *tipc_publ_create(struct tipc_uaddr *ua, struct tipc_socket_addr *sk, u32 key) { struct publication *p = kzalloc(sizeof(*p), GFP_ATOMIC); if (!p) return NULL; p->sr = ua->sr; p->sk = *sk; p->scope = ua->scope; p->key = key; INIT_LIST_HEAD(&p->binding_sock); INIT_LIST_HEAD(&p->binding_node); INIT_LIST_HEAD(&p->local_publ); INIT_LIST_HEAD(&p->all_publ); INIT_LIST_HEAD(&p->list); return p; } /** * tipc_service_create - create a service structure for the specified 'type' * @net: network namespace * @ua: address representing the service to be bound * * Allocates a single range structure and sets it to all 0's. */ static struct tipc_service *tipc_service_create(struct net *net, struct tipc_uaddr *ua) { struct name_table *nt = tipc_name_table(net); struct tipc_service *service; struct hlist_head *hd; service = kzalloc(sizeof(*service), GFP_ATOMIC); if (!service) { pr_warn("Service creation failed, no memory\n"); return NULL; } spin_lock_init(&service->lock); service->type = ua->sr.type; service->ranges = RB_ROOT; INIT_HLIST_NODE(&service->service_list); INIT_LIST_HEAD(&service->subscriptions); hd = &nt->services[hash(ua->sr.type)]; hlist_add_head_rcu(&service->service_list, hd); return service; } /* tipc_service_find_range - find service range matching publication parameters */ static struct service_range *tipc_service_find_range(struct tipc_service *sc, struct tipc_uaddr *ua) { struct service_range *sr; service_range_foreach_match(sr, sc, ua->sr.lower, ua->sr.upper) { /* Look for exact match */ if (sr->lower == ua->sr.lower && sr->upper == ua->sr.upper) return sr; } return NULL; } static struct service_range *tipc_service_create_range(struct tipc_service *sc, struct publication *p) { struct rb_node **n, *parent = NULL; struct service_range *sr; u32 lower = p->sr.lower; u32 upper = p->sr.upper; n = &sc->ranges.rb_node; while (*n) { parent = *n; sr = service_range_entry(parent); if (lower == sr->lower && upper == sr->upper) return sr; if (sr->max < upper) sr->max = upper; if (lower <= sr->lower) n = &parent->rb_left; else n = &parent->rb_right; } sr = kzalloc(sizeof(*sr), GFP_ATOMIC); if (!sr) return NULL; sr->lower = lower; sr->upper = upper; sr->max = upper; INIT_LIST_HEAD(&sr->local_publ); INIT_LIST_HEAD(&sr->all_publ); rb_link_node(&sr->tree_node, parent, n); rb_insert_augmented(&sr->tree_node, &sc->ranges, &sr_callbacks); return sr; } static bool tipc_service_insert_publ(struct net *net, struct tipc_service *sc, struct publication *p) { struct tipc_subscription *sub, *tmp; struct service_range *sr; struct publication *_p; u32 node = p->sk.node; bool first = false; bool res = false; u32 key = p->key; spin_lock_bh(&sc->lock); sr = tipc_service_create_range(sc, p); if (!sr) goto exit; first = list_empty(&sr->all_publ); /* Return if the publication already exists */ list_for_each_entry(_p, &sr->all_publ, all_publ) { if (_p->key == key && (!_p->sk.node || _p->sk.node == node)) { pr_debug("Failed to bind duplicate %u,%u,%u/%u:%u/%u\n", p->sr.type, p->sr.lower, p->sr.upper, node, p->sk.ref, key); goto exit; } } if (in_own_node(net, p->sk.node)) list_add(&p->local_publ, &sr->local_publ); list_add(&p->all_publ, &sr->all_publ); p->id = sc->publ_cnt++; /* Any subscriptions waiting for notification? */ list_for_each_entry_safe(sub, tmp, &sc->subscriptions, service_list) { tipc_sub_report_overlap(sub, p, TIPC_PUBLISHED, first); } res = true; exit: if (!res) pr_warn("Failed to bind to %u,%u,%u\n", p->sr.type, p->sr.lower, p->sr.upper); spin_unlock_bh(&sc->lock); return res; } /** * tipc_service_remove_publ - remove a publication from a service * @r: service_range to remove publication from * @sk: address publishing socket * @key: target publication key */ static struct publication *tipc_service_remove_publ(struct service_range *r, struct tipc_socket_addr *sk, u32 key) { struct publication *p; u32 node = sk->node; list_for_each_entry(p, &r->all_publ, all_publ) { if (p->key != key || (node && node != p->sk.node)) continue; list_del(&p->all_publ); list_del(&p->local_publ); return p; } return NULL; } /* * Code reused: time_after32() for the same purpose */ #define publication_after(pa, pb) time_after32((pa)->id, (pb)->id) static int tipc_publ_sort(void *priv, const struct list_head *a, const struct list_head *b) { struct publication *pa, *pb; pa = container_of(a, struct publication, list); pb = container_of(b, struct publication, list); return publication_after(pa, pb); } /** * tipc_service_subscribe - attach a subscription, and optionally * issue the prescribed number of events if there is any service * range overlapping with the requested range * @service: the tipc_service to attach the @sub to * @sub: the subscription to attach */ static void tipc_service_subscribe(struct tipc_service *service, struct tipc_subscription *sub) { struct publication *p, *first, *tmp; struct list_head publ_list; struct service_range *sr; u32 filter, lower, upper; filter = sub->s.filter; lower = sub->s.seq.lower; upper = sub->s.seq.upper; tipc_sub_get(sub); list_add(&sub->service_list, &service->subscriptions); if (filter & TIPC_SUB_NO_STATUS) return; INIT_LIST_HEAD(&publ_list); service_range_foreach_match(sr, service, lower, upper) { first = NULL; list_for_each_entry(p, &sr->all_publ, all_publ) { if (filter & TIPC_SUB_PORTS) list_add_tail(&p->list, &publ_list); else if (!first || publication_after(first, p)) /* Pick this range's *first* publication */ first = p; } if (first) list_add_tail(&first->list, &publ_list); } /* Sort the publications before reporting */ list_sort(NULL, &publ_list, tipc_publ_sort); list_for_each_entry_safe(p, tmp, &publ_list, list) { tipc_sub_report_overlap(sub, p, TIPC_PUBLISHED, true); list_del_init(&p->list); } } static struct tipc_service *tipc_service_find(struct net *net, struct tipc_uaddr *ua) { struct name_table *nt = tipc_name_table(net); struct hlist_head *service_head; struct tipc_service *service; service_head = &nt->services[hash(ua->sr.type)]; hlist_for_each_entry_rcu(service, service_head, service_list) { if (service->type == ua->sr.type) return service; } return NULL; }; struct publication *tipc_nametbl_insert_publ(struct net *net, struct tipc_uaddr *ua, struct tipc_socket_addr *sk, u32 key) { struct tipc_service *sc; struct publication *p; p = tipc_publ_create(ua, sk, key); if (!p) return NULL; sc = tipc_service_find(net, ua); if (!sc) sc = tipc_service_create(net, ua); if (sc && tipc_service_insert_publ(net, sc, p)) return p; kfree(p); return NULL; } struct publication *tipc_nametbl_remove_publ(struct net *net, struct tipc_uaddr *ua, struct tipc_socket_addr *sk, u32 key) { struct tipc_subscription *sub, *tmp; struct publication *p = NULL; struct service_range *sr; struct tipc_service *sc; bool last; sc = tipc_service_find(net, ua); if (!sc) goto exit; spin_lock_bh(&sc->lock); sr = tipc_service_find_range(sc, ua); if (!sr) goto unlock; p = tipc_service_remove_publ(sr, sk, key); if (!p) goto unlock; /* Notify any waiting subscriptions */ last = list_empty(&sr->all_publ); list_for_each_entry_safe(sub, tmp, &sc->subscriptions, service_list) { tipc_sub_report_overlap(sub, p, TIPC_WITHDRAWN, last); } /* Remove service range item if this was its last publication */ if (list_empty(&sr->all_publ)) { rb_erase_augmented(&sr->tree_node, &sc->ranges, &sr_callbacks); kfree(sr); } /* Delete service item if no more publications and subscriptions */ if (RB_EMPTY_ROOT(&sc->ranges) && list_empty(&sc->subscriptions)) { hlist_del_init_rcu(&sc->service_list); kfree_rcu(sc, rcu); } unlock: spin_unlock_bh(&sc->lock); exit: if (!p) { pr_err("Failed to remove unknown binding: %u,%u,%u/%u:%u/%u\n", ua->sr.type, ua->sr.lower, ua->sr.upper, sk->node, sk->ref, key); } return p; } /** * tipc_nametbl_lookup_anycast - perform service instance to socket translation * @net: network namespace * @ua: service address to look up * @sk: address to socket we want to find * * On entry, a non-zero 'sk->node' indicates the node where we want lookup to be * performed, which may not be this one. * * On exit: * * - If lookup is deferred to another node, leave 'sk->node' unchanged and * return 'true'. * - If lookup is successful, set the 'sk->node' and 'sk->ref' (== portid) which * represent the bound socket and return 'true'. * - If lookup fails, return 'false' * * Note that for legacy users (node configured with Z.C.N address format) the * 'closest-first' lookup algorithm must be maintained, i.e., if sk.node is 0 * we must look in the local binding list first */ bool tipc_nametbl_lookup_anycast(struct net *net, struct tipc_uaddr *ua, struct tipc_socket_addr *sk) { struct tipc_net *tn = tipc_net(net); bool legacy = tn->legacy_addr_format; u32 self = tipc_own_addr(net); u32 inst = ua->sa.instance; struct service_range *r; struct tipc_service *sc; struct publication *p; struct list_head *l; bool res = false; if (!tipc_in_scope(legacy, sk->node, self)) return true; rcu_read_lock(); sc = tipc_service_find(net, ua); if (unlikely(!sc)) goto exit; spin_lock_bh(&sc->lock); service_range_foreach_match(r, sc, inst, inst) { /* Select lookup algo: local, closest-first or round-robin */ if (sk->node == self) { l = &r->local_publ; if (list_empty(l)) continue; p = list_first_entry(l, struct publication, local_publ); list_move_tail(&p->local_publ, &r->local_publ); } else if (legacy && !sk->node && !list_empty(&r->local_publ)) { l = &r->local_publ; p = list_first_entry(l, struct publication, local_publ); list_move_tail(&p->local_publ, &r->local_publ); } else { l = &r->all_publ; p = list_first_entry(l, struct publication, all_publ); list_move_tail(&p->all_publ, &r->all_publ); } *sk = p->sk; res = true; /* Todo: as for legacy, pick the first matching range only, a * "true" round-robin will be performed as needed. */ break; } spin_unlock_bh(&sc->lock); exit: rcu_read_unlock(); return res; } /* tipc_nametbl_lookup_group(): lookup destinaton(s) in a communication group * Returns a list of one (== group anycast) or more (== group multicast) * destination socket/node pairs matching the given address. * The requester may or may not want to exclude himself from the list. */ bool tipc_nametbl_lookup_group(struct net *net, struct tipc_uaddr *ua, struct list_head *dsts, int *dstcnt, u32 exclude, bool mcast) { u32 self = tipc_own_addr(net); u32 inst = ua->sa.instance; struct service_range *sr; struct tipc_service *sc; struct publication *p; *dstcnt = 0; rcu_read_lock(); sc = tipc_service_find(net, ua); if (unlikely(!sc)) goto exit; spin_lock_bh(&sc->lock); /* Todo: a full search i.e. service_range_foreach_match() instead? */ sr = service_range_match_first(sc->ranges.rb_node, inst, inst); if (!sr) goto no_match; list_for_each_entry(p, &sr->all_publ, all_publ) { if (p->scope != ua->scope) continue; if (p->sk.ref == exclude && p->sk.node == self) continue; tipc_dest_push(dsts, p->sk.node, p->sk.ref); (*dstcnt)++; if (mcast) continue; list_move_tail(&p->all_publ, &sr->all_publ); break; } no_match: spin_unlock_bh(&sc->lock); exit: rcu_read_unlock(); return !list_empty(dsts); } /* tipc_nametbl_lookup_mcast_sockets(): look up node local destinaton sockets * matching the given address * Used on nodes which have received a multicast/broadcast message * Returns a list of local sockets */ void tipc_nametbl_lookup_mcast_sockets(struct net *net, struct tipc_uaddr *ua, struct list_head *dports) { struct service_range *sr; struct tipc_service *sc; struct publication *p; u8 scope = ua->scope; rcu_read_lock(); sc = tipc_service_find(net, ua); if (!sc) goto exit; spin_lock_bh(&sc->lock); service_range_foreach_match(sr, sc, ua->sr.lower, ua->sr.upper) { list_for_each_entry(p, &sr->local_publ, local_publ) { if (scope == p->scope || scope == TIPC_ANY_SCOPE) tipc_dest_push(dports, 0, p->sk.ref); } } spin_unlock_bh(&sc->lock); exit: rcu_read_unlock(); } /* tipc_nametbl_lookup_mcast_nodes(): look up all destination nodes matching * the given address. Used in sending node. * Used on nodes which are sending out a multicast/broadcast message * Returns a list of nodes, including own node if applicable */ void tipc_nametbl_lookup_mcast_nodes(struct net *net, struct tipc_uaddr *ua, struct tipc_nlist *nodes) { struct service_range *sr; struct tipc_service *sc; struct publication *p; rcu_read_lock(); sc = tipc_service_find(net, ua); if (!sc) goto exit; spin_lock_bh(&sc->lock); service_range_foreach_match(sr, sc, ua->sr.lower, ua->sr.upper) { list_for_each_entry(p, &sr->all_publ, all_publ) { tipc_nlist_add(nodes, p->sk.node); } } spin_unlock_bh(&sc->lock); exit: rcu_read_unlock(); } /* tipc_nametbl_build_group - build list of communication group members */ void tipc_nametbl_build_group(struct net *net, struct tipc_group *grp, struct tipc_uaddr *ua) { struct service_range *sr; struct tipc_service *sc; struct publication *p; struct rb_node *n; rcu_read_lock(); sc = tipc_service_find(net, ua); if (!sc) goto exit; spin_lock_bh(&sc->lock); for (n = rb_first(&sc->ranges); n; n = rb_next(n)) { sr = container_of(n, struct service_range, tree_node); list_for_each_entry(p, &sr->all_publ, all_publ) { if (p->scope != ua->scope) continue; tipc_group_add_member(grp, p->sk.node, p->sk.ref, p->sr.lower); } } spin_unlock_bh(&sc->lock); exit: rcu_read_unlock(); } /* tipc_nametbl_publish - add service binding to name table */ struct publication *tipc_nametbl_publish(struct net *net, struct tipc_uaddr *ua, struct tipc_socket_addr *sk, u32 key) { struct name_table *nt = tipc_name_table(net); struct tipc_net *tn = tipc_net(net); struct publication *p = NULL; struct sk_buff *skb = NULL; u32 rc_dests; spin_lock_bh(&tn->nametbl_lock); if (nt->local_publ_count >= TIPC_MAX_PUBL) { pr_warn("Bind failed, max limit %u reached\n", TIPC_MAX_PUBL); goto exit; } p = tipc_nametbl_insert_publ(net, ua, sk, key); if (p) { nt->local_publ_count++; skb = tipc_named_publish(net, p); } rc_dests = nt->rc_dests; exit: spin_unlock_bh(&tn->nametbl_lock); if (skb) tipc_node_broadcast(net, skb, rc_dests); return p; } /** * tipc_nametbl_withdraw - withdraw a service binding * @net: network namespace * @ua: service address/range being unbound * @sk: address of the socket being unbound from * @key: target publication key */ void tipc_nametbl_withdraw(struct net *net, struct tipc_uaddr *ua, struct tipc_socket_addr *sk, u32 key) { struct name_table *nt = tipc_name_table(net); struct tipc_net *tn = tipc_net(net); struct sk_buff *skb = NULL; struct publication *p; u32 rc_dests; spin_lock_bh(&tn->nametbl_lock); p = tipc_nametbl_remove_publ(net, ua, sk, key); if (p) { nt->local_publ_count--; skb = tipc_named_withdraw(net, p); list_del_init(&p->binding_sock); kfree_rcu(p, rcu); } rc_dests = nt->rc_dests; spin_unlock_bh(&tn->nametbl_lock); if (skb) tipc_node_broadcast(net, skb, rc_dests); } /** * tipc_nametbl_subscribe - add a subscription object to the name table * @sub: subscription to add */ bool tipc_nametbl_subscribe(struct tipc_subscription *sub) { struct tipc_net *tn = tipc_net(sub->net); u32 type = sub->s.seq.type; struct tipc_service *sc; struct tipc_uaddr ua; bool res = true; tipc_uaddr(&ua, TIPC_SERVICE_RANGE, TIPC_NODE_SCOPE, type, sub->s.seq.lower, sub->s.seq.upper); spin_lock_bh(&tn->nametbl_lock); sc = tipc_service_find(sub->net, &ua); if (!sc) sc = tipc_service_create(sub->net, &ua); if (sc) { spin_lock_bh(&sc->lock); tipc_service_subscribe(sc, sub); spin_unlock_bh(&sc->lock); } else { pr_warn("Failed to subscribe for {%u,%u,%u}\n", type, sub->s.seq.lower, sub->s.seq.upper); res = false; } spin_unlock_bh(&tn->nametbl_lock); return res; } /** * tipc_nametbl_unsubscribe - remove a subscription object from name table * @sub: subscription to remove */ void tipc_nametbl_unsubscribe(struct tipc_subscription *sub) { struct tipc_net *tn = tipc_net(sub->net); struct tipc_service *sc; struct tipc_uaddr ua; tipc_uaddr(&ua, TIPC_SERVICE_RANGE, TIPC_NODE_SCOPE, sub->s.seq.type, sub->s.seq.lower, sub->s.seq.upper); spin_lock_bh(&tn->nametbl_lock); sc = tipc_service_find(sub->net, &ua); if (!sc) goto exit; spin_lock_bh(&sc->lock); list_del_init(&sub->service_list); tipc_sub_put(sub); /* Delete service item if no more publications and subscriptions */ if (RB_EMPTY_ROOT(&sc->ranges) && list_empty(&sc->subscriptions)) { hlist_del_init_rcu(&sc->service_list); kfree_rcu(sc, rcu); } spin_unlock_bh(&sc->lock); exit: spin_unlock_bh(&tn->nametbl_lock); } int tipc_nametbl_init(struct net *net) { struct tipc_net *tn = tipc_net(net); struct name_table *nt; int i; nt = kzalloc(sizeof(*nt), GFP_KERNEL); if (!nt) return -ENOMEM; for (i = 0; i < TIPC_NAMETBL_SIZE; i++) INIT_HLIST_HEAD(&nt->services[i]); INIT_LIST_HEAD(&nt->node_scope); INIT_LIST_HEAD(&nt->cluster_scope); rwlock_init(&nt->cluster_scope_lock); tn->nametbl = nt; spin_lock_init(&tn->nametbl_lock); return 0; } /** * tipc_service_delete - purge all publications for a service and delete it * @net: the associated network namespace * @sc: tipc_service to delete */ static void tipc_service_delete(struct net *net, struct tipc_service *sc) { struct service_range *sr, *tmpr; struct publication *p, *tmp; spin_lock_bh(&sc->lock); rbtree_postorder_for_each_entry_safe(sr, tmpr, &sc->ranges, tree_node) { list_for_each_entry_safe(p, tmp, &sr->all_publ, all_publ) { tipc_service_remove_publ(sr, &p->sk, p->key); kfree_rcu(p, rcu); } rb_erase_augmented(&sr->tree_node, &sc->ranges, &sr_callbacks); kfree(sr); } hlist_del_init_rcu(&sc->service_list); spin_unlock_bh(&sc->lock); kfree_rcu(sc, rcu); } void tipc_nametbl_stop(struct net *net) { struct name_table *nt = tipc_name_table(net); struct tipc_net *tn = tipc_net(net); struct hlist_head *service_head; struct tipc_service *service; u32 i; /* Verify name table is empty and purge any lingering * publications, then release the name table */ spin_lock_bh(&tn->nametbl_lock); for (i = 0; i < TIPC_NAMETBL_SIZE; i++) { if (hlist_empty(&nt->services[i])) continue; service_head = &nt->services[i]; hlist_for_each_entry_rcu(service, service_head, service_list) { tipc_service_delete(net, service); } } spin_unlock_bh(&tn->nametbl_lock); synchronize_net(); kfree(nt); } static int __tipc_nl_add_nametable_publ(struct tipc_nl_msg *msg, struct tipc_service *service, struct service_range *sr, u32 *last_key) { struct publication *p; struct nlattr *attrs; struct nlattr *b; void *hdr; if (*last_key) { list_for_each_entry(p, &sr->all_publ, all_publ) if (p->key == *last_key) break; if (list_entry_is_head(p, &sr->all_publ, all_publ)) return -EPIPE; } else { p = list_first_entry(&sr->all_publ, struct publication, all_publ); } list_for_each_entry_from(p, &sr->all_publ, all_publ) { *last_key = p->key; hdr = genlmsg_put(msg->skb, msg->portid, msg->seq, &tipc_genl_family, NLM_F_MULTI, TIPC_NL_NAME_TABLE_GET); if (!hdr) return -EMSGSIZE; attrs = nla_nest_start_noflag(msg->skb, TIPC_NLA_NAME_TABLE); if (!attrs) goto msg_full; b = nla_nest_start_noflag(msg->skb, TIPC_NLA_NAME_TABLE_PUBL); if (!b) goto attr_msg_full; if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_TYPE, service->type)) goto publ_msg_full; if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_LOWER, sr->lower)) goto publ_msg_full; if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_UPPER, sr->upper)) goto publ_msg_full; if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_SCOPE, p->scope)) goto publ_msg_full; if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_NODE, p->sk.node)) goto publ_msg_full; if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_REF, p->sk.ref)) goto publ_msg_full; if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_KEY, p->key)) goto publ_msg_full; nla_nest_end(msg->skb, b); nla_nest_end(msg->skb, attrs); genlmsg_end(msg->skb, hdr); } *last_key = 0; return 0; publ_msg_full: nla_nest_cancel(msg->skb, b); attr_msg_full: nla_nest_cancel(msg->skb, attrs); msg_full: genlmsg_cancel(msg->skb, hdr); return -EMSGSIZE; } static int __tipc_nl_service_range_list(struct tipc_nl_msg *msg, struct tipc_service *sc, u32 *last_lower, u32 *last_key) { struct service_range *sr; struct rb_node *n; int err; for (n = rb_first(&sc->ranges); n; n = rb_next(n)) { sr = container_of(n, struct service_range, tree_node); if (sr->lower < *last_lower) continue; err = __tipc_nl_add_nametable_publ(msg, sc, sr, last_key); if (err) { *last_lower = sr->lower; return err; } } *last_lower = 0; return 0; } static int tipc_nl_service_list(struct net *net, struct tipc_nl_msg *msg, u32 *last_type, u32 *last_lower, u32 *last_key) { struct tipc_net *tn = tipc_net(net); struct tipc_service *service = NULL; struct hlist_head *head; struct tipc_uaddr ua; int err; int i; if (*last_type) i = hash(*last_type); else i = 0; for (; i < TIPC_NAMETBL_SIZE; i++) { head = &tn->nametbl->services[i]; if (*last_type || (!i && *last_key && (*last_lower == *last_key))) { tipc_uaddr(&ua, TIPC_SERVICE_RANGE, TIPC_NODE_SCOPE, *last_type, *last_lower, *last_lower); service = tipc_service_find(net, &ua); if (!service) return -EPIPE; } else { hlist_for_each_entry_rcu(service, head, service_list) break; if (!service) continue; } hlist_for_each_entry_from_rcu(service, service_list) { spin_lock_bh(&service->lock); err = __tipc_nl_service_range_list(msg, service, last_lower, last_key); if (err) { *last_type = service->type; spin_unlock_bh(&service->lock); return err; } spin_unlock_bh(&service->lock); } *last_type = 0; } return 0; } int tipc_nl_name_table_dump(struct sk_buff *skb, struct netlink_callback *cb) { struct net *net = sock_net(skb->sk); u32 last_type = cb->args[0]; u32 last_lower = cb->args[1]; u32 last_key = cb->args[2]; int done = cb->args[3]; struct tipc_nl_msg msg; int err; if (done) return 0; msg.skb = skb; msg.portid = NETLINK_CB(cb->skb).portid; msg.seq = cb->nlh->nlmsg_seq; rcu_read_lock(); err = tipc_nl_service_list(net, &msg, &last_type, &last_lower, &last_key); if (!err) { done = 1; } else if (err != -EMSGSIZE) { /* We never set seq or call nl_dump_check_consistent() this * means that setting prev_seq here will cause the consistence * check to fail in the netlink callback handler. Resulting in * the NLMSG_DONE message having the NLM_F_DUMP_INTR flag set if * we got an error. */ cb->prev_seq = 1; } rcu_read_unlock(); cb->args[0] = last_type; cb->args[1] = last_lower; cb->args[2] = last_key; cb->args[3] = done; return skb->len; } struct tipc_dest *tipc_dest_find(struct list_head *l, u32 node, u32 port) { struct tipc_dest *dst; list_for_each_entry(dst, l, list) { if (dst->node == node && dst->port == port) return dst; } return NULL; } bool tipc_dest_push(struct list_head *l, u32 node, u32 port) { struct tipc_dest *dst; if (tipc_dest_find(l, node, port)) return false; dst = kmalloc(sizeof(*dst), GFP_ATOMIC); if (unlikely(!dst)) return false; dst->node = node; dst->port = port; list_add(&dst->list, l); return true; } bool tipc_dest_pop(struct list_head *l, u32 *node, u32 *port) { struct tipc_dest *dst; if (list_empty(l)) return false; dst = list_first_entry(l, typeof(*dst), list); if (port) *port = dst->port; if (node) *node = dst->node; list_del(&dst->list); kfree(dst); return true; } bool tipc_dest_del(struct list_head *l, u32 node, u32 port) { struct tipc_dest *dst; dst = tipc_dest_find(l, node, port); if (!dst) return false; list_del(&dst->list); kfree(dst); return true; } void tipc_dest_list_purge(struct list_head *l) { struct tipc_dest *dst, *tmp; list_for_each_entry_safe(dst, tmp, l, list) { list_del(&dst->list); kfree(dst); } } |
240 239 238 234 1 1 4 1 3 2 18 17 16 8 1 1 46 47 1 17 16 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 | // SPDX-License-Identifier: GPL-2.0-only /* * fs/crypto/hooks.c * * Encryption hooks for higher-level filesystem operations. */ #include "fscrypt_private.h" /** * fscrypt_file_open() - prepare to open a possibly-encrypted regular file * @inode: the inode being opened * @filp: the struct file being set up * * Currently, an encrypted regular file can only be opened if its encryption key * is available; access to the raw encrypted contents is not supported. * Therefore, we first set up the inode's encryption key (if not already done) * and return an error if it's unavailable. * * We also verify that if the parent directory (from the path via which the file * is being opened) is encrypted, then the inode being opened uses the same * encryption policy. This is needed as part of the enforcement that all files * in an encrypted directory tree use the same encryption policy, as a * protection against certain types of offline attacks. Note that this check is * needed even when opening an *unencrypted* file, since it's forbidden to have * an unencrypted file in an encrypted directory. * * Return: 0 on success, -ENOKEY if the key is missing, or another -errno code */ int fscrypt_file_open(struct inode *inode, struct file *filp) { int err; struct dentry *dir; err = fscrypt_require_key(inode); if (err) return err; dir = dget_parent(file_dentry(filp)); if (IS_ENCRYPTED(d_inode(dir)) && !fscrypt_has_permitted_context(d_inode(dir), inode)) { fscrypt_warn(inode, "Inconsistent encryption context (parent directory: %lu)", d_inode(dir)->i_ino); err = -EPERM; } dput(dir); return err; } EXPORT_SYMBOL_GPL(fscrypt_file_open); int __fscrypt_prepare_link(struct inode *inode, struct inode *dir, struct dentry *dentry) { if (fscrypt_is_nokey_name(dentry)) return -ENOKEY; /* * We don't need to separately check that the directory inode's key is * available, as it's implied by the dentry not being a no-key name. */ if (!fscrypt_has_permitted_context(dir, inode)) return -EXDEV; return 0; } EXPORT_SYMBOL_GPL(__fscrypt_prepare_link); int __fscrypt_prepare_rename(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry, unsigned int flags) { if (fscrypt_is_nokey_name(old_dentry) || fscrypt_is_nokey_name(new_dentry)) return -ENOKEY; /* * We don't need to separately check that the directory inodes' keys are * available, as it's implied by the dentries not being no-key names. */ if (old_dir != new_dir) { if (IS_ENCRYPTED(new_dir) && !fscrypt_has_permitted_context(new_dir, d_inode(old_dentry))) return -EXDEV; if ((flags & RENAME_EXCHANGE) && IS_ENCRYPTED(old_dir) && !fscrypt_has_permitted_context(old_dir, d_inode(new_dentry))) return -EXDEV; } return 0; } EXPORT_SYMBOL_GPL(__fscrypt_prepare_rename); int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry, struct fscrypt_name *fname) { int err = fscrypt_setup_filename(dir, &dentry->d_name, 1, fname); if (err && err != -ENOENT) return err; fscrypt_prepare_dentry(dentry, fname->is_nokey_name); return err; } EXPORT_SYMBOL_GPL(__fscrypt_prepare_lookup); /** * fscrypt_prepare_lookup_partial() - prepare lookup without filename setup * @dir: the encrypted directory being searched * @dentry: the dentry being looked up in @dir * * This function should be used by the ->lookup and ->atomic_open methods of * filesystems that handle filename encryption and no-key name encoding * themselves and thus can't use fscrypt_prepare_lookup(). Like * fscrypt_prepare_lookup(), this will try to set up the directory's encryption * key and will set DCACHE_NOKEY_NAME on the dentry if the key is unavailable. * However, this function doesn't set up a struct fscrypt_name for the filename. * * Return: 0 on success; -errno on error. Note that the encryption key being * unavailable is not considered an error. It is also not an error if * the encryption policy is unsupported by this kernel; that is treated * like the key being unavailable, so that files can still be deleted. */ int fscrypt_prepare_lookup_partial(struct inode *dir, struct dentry *dentry) { int err = fscrypt_get_encryption_info(dir, true); bool is_nokey_name = (!err && !fscrypt_has_encryption_key(dir)); fscrypt_prepare_dentry(dentry, is_nokey_name); return err; } EXPORT_SYMBOL_GPL(fscrypt_prepare_lookup_partial); int __fscrypt_prepare_readdir(struct inode *dir) { return fscrypt_get_encryption_info(dir, true); } EXPORT_SYMBOL_GPL(__fscrypt_prepare_readdir); int __fscrypt_prepare_setattr(struct dentry *dentry, struct iattr *attr) { if (attr->ia_valid & ATTR_SIZE) return fscrypt_require_key(d_inode(dentry)); return 0; } EXPORT_SYMBOL_GPL(__fscrypt_prepare_setattr); /** * fscrypt_prepare_setflags() - prepare to change flags with FS_IOC_SETFLAGS * @inode: the inode on which flags are being changed * @oldflags: the old flags * @flags: the new flags * * The caller should be holding i_rwsem for write. * * Return: 0 on success; -errno if the flags change isn't allowed or if * another error occurs. */ int fscrypt_prepare_setflags(struct inode *inode, unsigned int oldflags, unsigned int flags) { struct fscrypt_inode_info *ci; struct fscrypt_master_key *mk; int err; /* * When the CASEFOLD flag is set on an encrypted directory, we must * derive the secret key needed for the dirhash. This is only possible * if the directory uses a v2 encryption policy. */ if (IS_ENCRYPTED(inode) && (flags & ~oldflags & FS_CASEFOLD_FL)) { err = fscrypt_require_key(inode); if (err) return err; ci = inode->i_crypt_info; if (ci->ci_policy.version != FSCRYPT_POLICY_V2) return -EINVAL; mk = ci->ci_master_key; down_read(&mk->mk_sem); if (mk->mk_present) err = fscrypt_derive_dirhash_key(ci, mk); else err = -ENOKEY; up_read(&mk->mk_sem); return err; } return 0; } /** * fscrypt_prepare_symlink() - prepare to create a possibly-encrypted symlink * @dir: directory in which the symlink is being created * @target: plaintext symlink target * @len: length of @target excluding null terminator * @max_len: space the filesystem has available to store the symlink target * @disk_link: (out) the on-disk symlink target being prepared * * This function computes the size the symlink target will require on-disk, * stores it in @disk_link->len, and validates it against @max_len. An * encrypted symlink may be longer than the original. * * Additionally, @disk_link->name is set to @target if the symlink will be * unencrypted, but left NULL if the symlink will be encrypted. For encrypted * symlinks, the filesystem must call fscrypt_encrypt_symlink() to create the * on-disk target later. (The reason for the two-step process is that some * filesystems need to know the size of the symlink target before creating the * inode, e.g. to determine whether it will be a "fast" or "slow" symlink.) * * Return: 0 on success, -ENAMETOOLONG if the symlink target is too long, * -ENOKEY if the encryption key is missing, or another -errno code if a problem * occurred while setting up the encryption key. */ int fscrypt_prepare_symlink(struct inode *dir, const char *target, unsigned int len, unsigned int max_len, struct fscrypt_str *disk_link) { const union fscrypt_policy *policy; /* * To calculate the size of the encrypted symlink target we need to know * the amount of NUL padding, which is determined by the flags set in * the encryption policy which will be inherited from the directory. */ policy = fscrypt_policy_to_inherit(dir); if (policy == NULL) { /* Not encrypted */ disk_link->name = (unsigned char *)target; disk_link->len = len + 1; if (disk_link->len > max_len) return -ENAMETOOLONG; return 0; } if (IS_ERR(policy)) return PTR_ERR(policy); /* * Calculate the size of the encrypted symlink and verify it won't * exceed max_len. Note that for historical reasons, encrypted symlink * targets are prefixed with the ciphertext length, despite this * actually being redundant with i_size. This decreases by 2 bytes the * longest symlink target we can accept. * * We could recover 1 byte by not counting a null terminator, but * counting it (even though it is meaningless for ciphertext) is simpler * for now since filesystems will assume it is there and subtract it. */ if (!__fscrypt_fname_encrypted_size(policy, len, max_len - sizeof(struct fscrypt_symlink_data) - 1, &disk_link->len)) return -ENAMETOOLONG; disk_link->len += sizeof(struct fscrypt_symlink_data) + 1; disk_link->name = NULL; return 0; } EXPORT_SYMBOL_GPL(fscrypt_prepare_symlink); int __fscrypt_encrypt_symlink(struct inode *inode, const char *target, unsigned int len, struct fscrypt_str *disk_link) { int err; struct qstr iname = QSTR_INIT(target, len); struct fscrypt_symlink_data *sd; unsigned int ciphertext_len; /* * fscrypt_prepare_new_inode() should have already set up the new * symlink inode's encryption key. We don't wait until now to do it, * since we may be in a filesystem transaction now. */ if (WARN_ON_ONCE(!fscrypt_has_encryption_key(inode))) return -ENOKEY; if (disk_link->name) { /* filesystem-provided buffer */ sd = (struct fscrypt_symlink_data *)disk_link->name; } else { sd = kmalloc(disk_link->len, GFP_NOFS); if (!sd) return -ENOMEM; } ciphertext_len = disk_link->len - sizeof(*sd) - 1; sd->len = cpu_to_le16(ciphertext_len); err = fscrypt_fname_encrypt(inode, &iname, sd->encrypted_path, ciphertext_len); if (err) goto err_free_sd; /* * Null-terminating the ciphertext doesn't make sense, but we still * count the null terminator in the length, so we might as well * initialize it just in case the filesystem writes it out. */ sd->encrypted_path[ciphertext_len] = '\0'; /* Cache the plaintext symlink target for later use by get_link() */ err = -ENOMEM; inode->i_link = kmemdup(target, len + 1, GFP_NOFS); if (!inode->i_link) goto err_free_sd; if (!disk_link->name) disk_link->name = (unsigned char *)sd; return 0; err_free_sd: if (!disk_link->name) kfree(sd); return err; } EXPORT_SYMBOL_GPL(__fscrypt_encrypt_symlink); /** * fscrypt_get_symlink() - get the target of an encrypted symlink * @inode: the symlink inode * @caddr: the on-disk contents of the symlink * @max_size: size of @caddr buffer * @done: if successful, will be set up to free the returned target if needed * * If the symlink's encryption key is available, we decrypt its target. * Otherwise, we encode its target for presentation. * * This may sleep, so the filesystem must have dropped out of RCU mode already. * * Return: the presentable symlink target or an ERR_PTR() */ const char *fscrypt_get_symlink(struct inode *inode, const void *caddr, unsigned int max_size, struct delayed_call *done) { const struct fscrypt_symlink_data *sd; struct fscrypt_str cstr, pstr; bool has_key; int err; /* This is for encrypted symlinks only */ if (WARN_ON_ONCE(!IS_ENCRYPTED(inode))) return ERR_PTR(-EINVAL); /* If the decrypted target is already cached, just return it. */ pstr.name = READ_ONCE(inode->i_link); if (pstr.name) return pstr.name; /* * Try to set up the symlink's encryption key, but we can continue * regardless of whether the key is available or not. */ err = fscrypt_get_encryption_info(inode, false); if (err) return ERR_PTR(err); has_key = fscrypt_has_encryption_key(inode); /* * For historical reasons, encrypted symlink targets are prefixed with * the ciphertext length, even though this is redundant with i_size. */ if (max_size < sizeof(*sd) + 1) return ERR_PTR(-EUCLEAN); sd = caddr; cstr.name = (unsigned char *)sd->encrypted_path; cstr.len = le16_to_cpu(sd->len); if (cstr.len == 0) return ERR_PTR(-EUCLEAN); if (cstr.len + sizeof(*sd) > max_size) return ERR_PTR(-EUCLEAN); err = fscrypt_fname_alloc_buffer(cstr.len, &pstr); if (err) return ERR_PTR(err); err = fscrypt_fname_disk_to_usr(inode, 0, 0, &cstr, &pstr); if (err) goto err_kfree; err = -EUCLEAN; if (pstr.name[0] == '\0') goto err_kfree; pstr.name[pstr.len] = '\0'; /* * Cache decrypted symlink targets in i_link for later use. Don't cache * symlink targets encoded without the key, since those become outdated * once the key is added. This pairs with the READ_ONCE() above and in * the VFS path lookup code. */ if (!has_key || cmpxchg_release(&inode->i_link, NULL, pstr.name) != NULL) set_delayed_call(done, kfree_link, pstr.name); return pstr.name; err_kfree: kfree(pstr.name); return ERR_PTR(err); } EXPORT_SYMBOL_GPL(fscrypt_get_symlink); /** * fscrypt_symlink_getattr() - set the correct st_size for encrypted symlinks * @path: the path for the encrypted symlink being queried * @stat: the struct being filled with the symlink's attributes * * Override st_size of encrypted symlinks to be the length of the decrypted * symlink target (or the no-key encoded symlink target, if the key is * unavailable) rather than the length of the encrypted symlink target. This is * necessary for st_size to match the symlink target that userspace actually * sees. POSIX requires this, and some userspace programs depend on it. * * This requires reading the symlink target from disk if needed, setting up the * inode's encryption key if possible, and then decrypting or encoding the * symlink target. This makes lstat() more heavyweight than is normally the * case. However, decrypted symlink targets will be cached in ->i_link, so * usually the symlink won't have to be read and decrypted again later if/when * it is actually followed, readlink() is called, or lstat() is called again. * * Return: 0 on success, -errno on failure */ int fscrypt_symlink_getattr(const struct path *path, struct kstat *stat) { struct dentry *dentry = path->dentry; struct inode *inode = d_inode(dentry); const char *link; DEFINE_DELAYED_CALL(done); /* * To get the symlink target that userspace will see (whether it's the * decrypted target or the no-key encoded target), we can just get it in * the same way the VFS does during path resolution and readlink(). */ link = READ_ONCE(inode->i_link); if (!link) { link = inode->i_op->get_link(dentry, inode, &done); if (IS_ERR(link)) return PTR_ERR(link); } stat->size = strlen(link); do_delayed_call(&done); return 0; } EXPORT_SYMBOL_GPL(fscrypt_symlink_getattr); |
3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | /* file-mmu.c: ramfs MMU-based file operations * * Resizable simple ram filesystem for Linux. * * Copyright (C) 2000 Linus Torvalds. * 2000 Transmeta Corp. * * Usage limits added by David Gibson, Linuxcare Australia. * This file is released under the GPL. */ /* * NOTE! This filesystem is probably most useful * not as a real filesystem, but as an example of * how virtual filesystems can be written. * * It doesn't get much simpler than this. Consider * that this file implements the full semantics of * a POSIX-compliant read-write filesystem. * * Note in particular how the filesystem does not * need to implement any data structures of its own * to keep track of the virtual data: using the VFS * caches is sufficient. */ #include <linux/fs.h> #include <linux/mm.h> #include <linux/ramfs.h> #include <linux/sched.h> #include "internal.h" static unsigned long ramfs_mmu_get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); } const struct file_operations ramfs_file_operations = { .read_iter = generic_file_read_iter, .write_iter = generic_file_write_iter, .mmap = generic_file_mmap, .fsync = noop_fsync, .splice_read = filemap_splice_read, .splice_write = iter_file_splice_write, .llseek = generic_file_llseek, .get_unmapped_area = ramfs_mmu_get_unmapped_area, }; const struct inode_operations ramfs_file_inode_operations = { .setattr = simple_setattr, .getattr = simple_getattr, }; |
1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 | // SPDX-License-Identifier: GPL-2.0+ /* * Nano River Technologies viperboard GPIO lib driver * * (C) 2012 by Lemonage GmbH * Author: Lars Poeschel <poeschel@lemonage.de> * All rights reserved. */ #include <linux/kernel.h> #include <linux/errno.h> #include <linux/module.h> #include <linux/slab.h> #include <linux/types.h> #include <linux/mutex.h> #include <linux/platform_device.h> #include <linux/usb.h> #include <linux/gpio/driver.h> #include <linux/mfd/viperboard.h> #define VPRBRD_GPIOA_CLK_1MHZ 0 #define VPRBRD_GPIOA_CLK_100KHZ 1 #define VPRBRD_GPIOA_CLK_10KHZ 2 #define VPRBRD_GPIOA_CLK_1KHZ 3 #define VPRBRD_GPIOA_CLK_100HZ 4 #define VPRBRD_GPIOA_CLK_10HZ 5 #define VPRBRD_GPIOA_FREQ_DEFAULT 1000 #define VPRBRD_GPIOA_CMD_CONT 0x00 #define VPRBRD_GPIOA_CMD_PULSE 0x01 #define VPRBRD_GPIOA_CMD_PWM 0x02 #define VPRBRD_GPIOA_CMD_SETOUT 0x03 #define VPRBRD_GPIOA_CMD_SETIN 0x04 #define VPRBRD_GPIOA_CMD_SETINT 0x05 #define VPRBRD_GPIOA_CMD_GETIN 0x06 #define VPRBRD_GPIOB_CMD_SETDIR 0x00 #define VPRBRD_GPIOB_CMD_SETVAL 0x01 struct vprbrd_gpioa_msg { u8 cmd; u8 clk; u8 offset; u8 t1; u8 t2; u8 invert; u8 pwmlevel; u8 outval; u8 risefall; u8 answer; u8 __fill; } __packed; struct vprbrd_gpiob_msg { u8 cmd; u16 val; u16 mask; } __packed; struct vprbrd_gpio { struct gpio_chip gpioa; /* gpio a related things */ u32 gpioa_out; u32 gpioa_val; struct gpio_chip gpiob; /* gpio b related things */ u32 gpiob_out; u32 gpiob_val; struct vprbrd *vb; }; /* gpioa sampling clock module parameter */ static unsigned char gpioa_clk; static unsigned int gpioa_freq = VPRBRD_GPIOA_FREQ_DEFAULT; module_param(gpioa_freq, uint, 0); MODULE_PARM_DESC(gpioa_freq, "gpio-a sampling freq in Hz (default is 1000Hz) valid values: 10, 100, 1000, 10000, 100000, 1000000"); /* ----- begin of gipo a chip -------------------------------------------- */ static int vprbrd_gpioa_get(struct gpio_chip *chip, unsigned int offset) { int ret, answer, error = 0; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; struct vprbrd_gpioa_msg *gamsg = (struct vprbrd_gpioa_msg *)vb->buf; /* if io is set to output, just return the saved value */ if (gpio->gpioa_out & (1 << offset)) return !!(gpio->gpioa_val & (1 << offset)); mutex_lock(&vb->lock); gamsg->cmd = VPRBRD_GPIOA_CMD_GETIN; gamsg->clk = 0x00; gamsg->offset = offset; gamsg->t1 = 0x00; gamsg->t2 = 0x00; gamsg->invert = 0x00; gamsg->pwmlevel = 0x00; gamsg->outval = 0x00; gamsg->risefall = 0x00; gamsg->answer = 0x00; gamsg->__fill = 0x00; ret = usb_control_msg(vb->usb_dev, usb_sndctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOA, VPRBRD_USB_TYPE_OUT, 0x0000, 0x0000, gamsg, sizeof(struct vprbrd_gpioa_msg), VPRBRD_USB_TIMEOUT_MS); if (ret != sizeof(struct vprbrd_gpioa_msg)) error = -EREMOTEIO; ret = usb_control_msg(vb->usb_dev, usb_rcvctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOA, VPRBRD_USB_TYPE_IN, 0x0000, 0x0000, gamsg, sizeof(struct vprbrd_gpioa_msg), VPRBRD_USB_TIMEOUT_MS); answer = gamsg->answer & 0x01; mutex_unlock(&vb->lock); if (ret != sizeof(struct vprbrd_gpioa_msg)) error = -EREMOTEIO; if (error) return error; return answer; } static void vprbrd_gpioa_set(struct gpio_chip *chip, unsigned int offset, int value) { int ret; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; struct vprbrd_gpioa_msg *gamsg = (struct vprbrd_gpioa_msg *)vb->buf; if (gpio->gpioa_out & (1 << offset)) { if (value) gpio->gpioa_val |= (1 << offset); else gpio->gpioa_val &= ~(1 << offset); mutex_lock(&vb->lock); gamsg->cmd = VPRBRD_GPIOA_CMD_SETOUT; gamsg->clk = 0x00; gamsg->offset = offset; gamsg->t1 = 0x00; gamsg->t2 = 0x00; gamsg->invert = 0x00; gamsg->pwmlevel = 0x00; gamsg->outval = value; gamsg->risefall = 0x00; gamsg->answer = 0x00; gamsg->__fill = 0x00; ret = usb_control_msg(vb->usb_dev, usb_sndctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOA, VPRBRD_USB_TYPE_OUT, 0x0000, 0x0000, gamsg, sizeof(struct vprbrd_gpioa_msg), VPRBRD_USB_TIMEOUT_MS); mutex_unlock(&vb->lock); if (ret != sizeof(struct vprbrd_gpioa_msg)) dev_err(chip->parent, "usb error setting pin value\n"); } } static int vprbrd_gpioa_direction_input(struct gpio_chip *chip, unsigned int offset) { int ret; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; struct vprbrd_gpioa_msg *gamsg = (struct vprbrd_gpioa_msg *)vb->buf; gpio->gpioa_out &= ~(1 << offset); mutex_lock(&vb->lock); gamsg->cmd = VPRBRD_GPIOA_CMD_SETIN; gamsg->clk = gpioa_clk; gamsg->offset = offset; gamsg->t1 = 0x00; gamsg->t2 = 0x00; gamsg->invert = 0x00; gamsg->pwmlevel = 0x00; gamsg->outval = 0x00; gamsg->risefall = 0x00; gamsg->answer = 0x00; gamsg->__fill = 0x00; ret = usb_control_msg(vb->usb_dev, usb_sndctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOA, VPRBRD_USB_TYPE_OUT, 0x0000, 0x0000, gamsg, sizeof(struct vprbrd_gpioa_msg), VPRBRD_USB_TIMEOUT_MS); mutex_unlock(&vb->lock); if (ret != sizeof(struct vprbrd_gpioa_msg)) return -EREMOTEIO; return 0; } static int vprbrd_gpioa_direction_output(struct gpio_chip *chip, unsigned int offset, int value) { int ret; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; struct vprbrd_gpioa_msg *gamsg = (struct vprbrd_gpioa_msg *)vb->buf; gpio->gpioa_out |= (1 << offset); if (value) gpio->gpioa_val |= (1 << offset); else gpio->gpioa_val &= ~(1 << offset); mutex_lock(&vb->lock); gamsg->cmd = VPRBRD_GPIOA_CMD_SETOUT; gamsg->clk = 0x00; gamsg->offset = offset; gamsg->t1 = 0x00; gamsg->t2 = 0x00; gamsg->invert = 0x00; gamsg->pwmlevel = 0x00; gamsg->outval = value; gamsg->risefall = 0x00; gamsg->answer = 0x00; gamsg->__fill = 0x00; ret = usb_control_msg(vb->usb_dev, usb_sndctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOA, VPRBRD_USB_TYPE_OUT, 0x0000, 0x0000, gamsg, sizeof(struct vprbrd_gpioa_msg), VPRBRD_USB_TIMEOUT_MS); mutex_unlock(&vb->lock); if (ret != sizeof(struct vprbrd_gpioa_msg)) return -EREMOTEIO; return 0; } /* ----- end of gpio a chip ---------------------------------------------- */ /* ----- begin of gipo b chip -------------------------------------------- */ static int vprbrd_gpiob_setdir(struct vprbrd *vb, unsigned int offset, unsigned int dir) { struct vprbrd_gpiob_msg *gbmsg = (struct vprbrd_gpiob_msg *)vb->buf; int ret; gbmsg->cmd = VPRBRD_GPIOB_CMD_SETDIR; gbmsg->val = cpu_to_be16(dir << offset); gbmsg->mask = cpu_to_be16(0x0001 << offset); ret = usb_control_msg(vb->usb_dev, usb_sndctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOB, VPRBRD_USB_TYPE_OUT, 0x0000, 0x0000, gbmsg, sizeof(struct vprbrd_gpiob_msg), VPRBRD_USB_TIMEOUT_MS); if (ret != sizeof(struct vprbrd_gpiob_msg)) return -EREMOTEIO; return 0; } static int vprbrd_gpiob_get(struct gpio_chip *chip, unsigned int offset) { int ret; u16 val; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; struct vprbrd_gpiob_msg *gbmsg = (struct vprbrd_gpiob_msg *)vb->buf; /* if io is set to output, just return the saved value */ if (gpio->gpiob_out & (1 << offset)) return gpio->gpiob_val & (1 << offset); mutex_lock(&vb->lock); ret = usb_control_msg(vb->usb_dev, usb_rcvctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOB, VPRBRD_USB_TYPE_IN, 0x0000, 0x0000, gbmsg, sizeof(struct vprbrd_gpiob_msg), VPRBRD_USB_TIMEOUT_MS); val = gbmsg->val; mutex_unlock(&vb->lock); if (ret != sizeof(struct vprbrd_gpiob_msg)) return ret; /* cache the read values */ gpio->gpiob_val = be16_to_cpu(val); return (gpio->gpiob_val >> offset) & 0x1; } static void vprbrd_gpiob_set(struct gpio_chip *chip, unsigned int offset, int value) { int ret; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; struct vprbrd_gpiob_msg *gbmsg = (struct vprbrd_gpiob_msg *)vb->buf; if (gpio->gpiob_out & (1 << offset)) { if (value) gpio->gpiob_val |= (1 << offset); else gpio->gpiob_val &= ~(1 << offset); mutex_lock(&vb->lock); gbmsg->cmd = VPRBRD_GPIOB_CMD_SETVAL; gbmsg->val = cpu_to_be16(value << offset); gbmsg->mask = cpu_to_be16(0x0001 << offset); ret = usb_control_msg(vb->usb_dev, usb_sndctrlpipe(vb->usb_dev, 0), VPRBRD_USB_REQUEST_GPIOB, VPRBRD_USB_TYPE_OUT, 0x0000, 0x0000, gbmsg, sizeof(struct vprbrd_gpiob_msg), VPRBRD_USB_TIMEOUT_MS); mutex_unlock(&vb->lock); if (ret != sizeof(struct vprbrd_gpiob_msg)) dev_err(chip->parent, "usb error setting pin value\n"); } } static int vprbrd_gpiob_direction_input(struct gpio_chip *chip, unsigned int offset) { int ret; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; gpio->gpiob_out &= ~(1 << offset); mutex_lock(&vb->lock); ret = vprbrd_gpiob_setdir(vb, offset, 0); mutex_unlock(&vb->lock); if (ret) dev_err(chip->parent, "usb error setting pin to input\n"); return ret; } static int vprbrd_gpiob_direction_output(struct gpio_chip *chip, unsigned int offset, int value) { int ret; struct vprbrd_gpio *gpio = gpiochip_get_data(chip); struct vprbrd *vb = gpio->vb; gpio->gpiob_out |= (1 << offset); mutex_lock(&vb->lock); ret = vprbrd_gpiob_setdir(vb, offset, 1); if (ret) dev_err(chip->parent, "usb error setting pin to output\n"); mutex_unlock(&vb->lock); vprbrd_gpiob_set(chip, offset, value); return ret; } /* ----- end of gpio b chip ---------------------------------------------- */ static int vprbrd_gpio_probe(struct platform_device *pdev) { struct vprbrd *vb = dev_get_drvdata(pdev->dev.parent); struct vprbrd_gpio *vb_gpio; int ret; vb_gpio = devm_kzalloc(&pdev->dev, sizeof(*vb_gpio), GFP_KERNEL); if (vb_gpio == NULL) return -ENOMEM; vb_gpio->vb = vb; /* registering gpio a */ vb_gpio->gpioa.label = "viperboard gpio a"; vb_gpio->gpioa.parent = &pdev->dev; vb_gpio->gpioa.owner = THIS_MODULE; vb_gpio->gpioa.base = -1; vb_gpio->gpioa.ngpio = 16; vb_gpio->gpioa.can_sleep = true; vb_gpio->gpioa.set = vprbrd_gpioa_set; vb_gpio->gpioa.get = vprbrd_gpioa_get; vb_gpio->gpioa.direction_input = vprbrd_gpioa_direction_input; vb_gpio->gpioa.direction_output = vprbrd_gpioa_direction_output; ret = devm_gpiochip_add_data(&pdev->dev, &vb_gpio->gpioa, vb_gpio); if (ret < 0) return ret; /* registering gpio b */ vb_gpio->gpiob.label = "viperboard gpio b"; vb_gpio->gpiob.parent = &pdev->dev; vb_gpio->gpiob.owner = THIS_MODULE; vb_gpio->gpiob.base = -1; vb_gpio->gpiob.ngpio = 16; vb_gpio->gpiob.can_sleep = true; vb_gpio->gpiob.set = vprbrd_gpiob_set; vb_gpio->gpiob.get = vprbrd_gpiob_get; vb_gpio->gpiob.direction_input = vprbrd_gpiob_direction_input; vb_gpio->gpiob.direction_output = vprbrd_gpiob_direction_output; return devm_gpiochip_add_data(&pdev->dev, &vb_gpio->gpiob, vb_gpio); } static struct platform_driver vprbrd_gpio_driver = { .driver.name = "viperboard-gpio", .probe = vprbrd_gpio_probe, }; static int __init vprbrd_gpio_init(void) { switch (gpioa_freq) { case 1000000: gpioa_clk = VPRBRD_GPIOA_CLK_1MHZ; break; case 100000: gpioa_clk = VPRBRD_GPIOA_CLK_100KHZ; break; case 10000: gpioa_clk = VPRBRD_GPIOA_CLK_10KHZ; break; case 1000: gpioa_clk = VPRBRD_GPIOA_CLK_1KHZ; break; case 100: gpioa_clk = VPRBRD_GPIOA_CLK_100HZ; break; case 10: gpioa_clk = VPRBRD_GPIOA_CLK_10HZ; break; default: pr_warn("invalid gpioa_freq (%d)\n", gpioa_freq); gpioa_clk = VPRBRD_GPIOA_CLK_1KHZ; } return platform_driver_register(&vprbrd_gpio_driver); } subsys_initcall(vprbrd_gpio_init); static void __exit vprbrd_gpio_exit(void) { platform_driver_unregister(&vprbrd_gpio_driver); } module_exit(vprbrd_gpio_exit); MODULE_AUTHOR("Lars Poeschel <poeschel@lemonage.de>"); MODULE_DESCRIPTION("GPIO driver for Nano River Techs Viperboard"); MODULE_LICENSE("GPL"); MODULE_ALIAS("platform:viperboard-gpio"); |
22 146 14 151 128 12 1 125 129 11 86 2 4 90 123 94 27 1 123 110 69 3 58 1 3 25 61 39 34 117 27 93 137 165 67 66 2 157 123 2 6 131 185 1 240 101 17 23 135 124 4 123 109 189 4 157 57 131 89 83 3 163 39 16 177 43 17 189 137 15 54 49 5 95 93 16 5 234 57 159 165 8 123 33 310 8 308 300 250 250 112 126 3 7 18 30 16 1 17 1 17 39 34 1 1 3 60 93 20 5 12 3 3 3 9 5 8 11 7 261 24 18 89 201 4 210 208 12 59 166 199 100 167 5 16 128 34 20 75 14 5 5 12 11 1 9 2 9 3 9 3 10 2 9 3 11 1 11 1 11 1 10 2 10 2 1 1 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 | // SPDX-License-Identifier: GPL-2.0-only /* (C) 1999-2001 Paul `Rusty' Russell * (C) 2002-2004 Netfilter Core Team <coreteam@netfilter.org> * (C) 2002-2013 Jozsef Kadlecsik <kadlec@netfilter.org> * (C) 2006-2012 Patrick McHardy <kaber@trash.net> */ #include <linux/types.h> #include <linux/timer.h> #include <linux/module.h> #include <linux/in.h> #include <linux/tcp.h> #include <linux/spinlock.h> #include <linux/skbuff.h> #include <linux/ipv6.h> #include <net/ip6_checksum.h> #include <asm/unaligned.h> #include <net/tcp.h> #include <linux/netfilter.h> #include <linux/netfilter_ipv4.h> #include <linux/netfilter_ipv6.h> #include <net/netfilter/nf_conntrack.h> #include <net/netfilter/nf_conntrack_l4proto.h> #include <net/netfilter/nf_conntrack_ecache.h> #include <net/netfilter/nf_conntrack_seqadj.h> #include <net/netfilter/nf_conntrack_synproxy.h> #include <net/netfilter/nf_conntrack_timeout.h> #include <net/netfilter/nf_log.h> #include <net/netfilter/ipv4/nf_conntrack_ipv4.h> #include <net/netfilter/ipv6/nf_conntrack_ipv6.h> /* FIXME: Examine ipfilter's timeouts and conntrack transitions more closely. They're more complex. --RR */ static const char *const tcp_conntrack_names[] = { "NONE", "SYN_SENT", "SYN_RECV", "ESTABLISHED", "FIN_WAIT", "CLOSE_WAIT", "LAST_ACK", "TIME_WAIT", "CLOSE", "SYN_SENT2", }; enum nf_ct_tcp_action { NFCT_TCP_IGNORE, NFCT_TCP_INVALID, NFCT_TCP_ACCEPT, }; #define SECS * HZ #define MINS * 60 SECS #define HOURS * 60 MINS #define DAYS * 24 HOURS static const unsigned int tcp_timeouts[TCP_CONNTRACK_TIMEOUT_MAX] = { [TCP_CONNTRACK_SYN_SENT] = 2 MINS, [TCP_CONNTRACK_SYN_RECV] = 60 SECS, [TCP_CONNTRACK_ESTABLISHED] = 5 DAYS, [TCP_CONNTRACK_FIN_WAIT] = 2 MINS, [TCP_CONNTRACK_CLOSE_WAIT] = 60 SECS, [TCP_CONNTRACK_LAST_ACK] = 30 SECS, [TCP_CONNTRACK_TIME_WAIT] = 2 MINS, [TCP_CONNTRACK_CLOSE] = 10 SECS, [TCP_CONNTRACK_SYN_SENT2] = 2 MINS, /* RFC1122 says the R2 limit should be at least 100 seconds. Linux uses 15 packets as limit, which corresponds to ~13-30min depending on RTO. */ [TCP_CONNTRACK_RETRANS] = 5 MINS, [TCP_CONNTRACK_UNACK] = 5 MINS, }; #define sNO TCP_CONNTRACK_NONE #define sSS TCP_CONNTRACK_SYN_SENT #define sSR TCP_CONNTRACK_SYN_RECV #define sES TCP_CONNTRACK_ESTABLISHED #define sFW TCP_CONNTRACK_FIN_WAIT #define sCW TCP_CONNTRACK_CLOSE_WAIT #define sLA TCP_CONNTRACK_LAST_ACK #define sTW TCP_CONNTRACK_TIME_WAIT #define sCL TCP_CONNTRACK_CLOSE #define sS2 TCP_CONNTRACK_SYN_SENT2 #define sIV TCP_CONNTRACK_MAX #define sIG TCP_CONNTRACK_IGNORE /* What TCP flags are set from RST/SYN/FIN/ACK. */ enum tcp_bit_set { TCP_SYN_SET, TCP_SYNACK_SET, TCP_FIN_SET, TCP_ACK_SET, TCP_RST_SET, TCP_NONE_SET, }; /* * The TCP state transition table needs a few words... * * We are the man in the middle. All the packets go through us * but might get lost in transit to the destination. * It is assumed that the destinations can't receive segments * we haven't seen. * * The checked segment is in window, but our windows are *not* * equivalent with the ones of the sender/receiver. We always * try to guess the state of the current sender. * * The meaning of the states are: * * NONE: initial state * SYN_SENT: SYN-only packet seen * SYN_SENT2: SYN-only packet seen from reply dir, simultaneous open * SYN_RECV: SYN-ACK packet seen * ESTABLISHED: ACK packet seen * FIN_WAIT: FIN packet seen * CLOSE_WAIT: ACK seen (after FIN) * LAST_ACK: FIN seen (after FIN) * TIME_WAIT: last ACK seen * CLOSE: closed connection (RST) * * Packets marked as IGNORED (sIG): * if they may be either invalid or valid * and the receiver may send back a connection * closing RST or a SYN/ACK. * * Packets marked as INVALID (sIV): * if we regard them as truly invalid packets */ static const u8 tcp_conntracks[2][6][TCP_CONNTRACK_MAX] = { { /* ORIGINAL */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*syn*/ { sSS, sSS, sIG, sIG, sIG, sIG, sIG, sSS, sSS, sS2 }, /* * sNO -> sSS Initialize a new connection * sSS -> sSS Retransmitted SYN * sS2 -> sS2 Late retransmitted SYN * sSR -> sIG * sES -> sIG Error: SYNs in window outside the SYN_SENT state * are errors. Receiver will reply with RST * and close the connection. * Or we are not in sync and hold a dead connection. * sFW -> sIG * sCW -> sIG * sLA -> sIG * sTW -> sSS Reopened connection (RFC 1122). * sCL -> sSS */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*synack*/ { sIV, sIV, sSR, sIV, sIV, sIV, sIV, sIV, sIV, sSR }, /* * sNO -> sIV Too late and no reason to do anything * sSS -> sIV Client can't send SYN and then SYN/ACK * sS2 -> sSR SYN/ACK sent to SYN2 in simultaneous open * sSR -> sSR Late retransmitted SYN/ACK in simultaneous open * sES -> sIV Invalid SYN/ACK packets sent by the client * sFW -> sIV * sCW -> sIV * sLA -> sIV * sTW -> sIV * sCL -> sIV */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*fin*/ { sIV, sIV, sFW, sFW, sLA, sLA, sLA, sTW, sCL, sIV }, /* * sNO -> sIV Too late and no reason to do anything... * sSS -> sIV Client migth not send FIN in this state: * we enforce waiting for a SYN/ACK reply first. * sS2 -> sIV * sSR -> sFW Close started. * sES -> sFW * sFW -> sLA FIN seen in both directions, waiting for * the last ACK. * Migth be a retransmitted FIN as well... * sCW -> sLA * sLA -> sLA Retransmitted FIN. Remain in the same state. * sTW -> sTW * sCL -> sCL */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*ack*/ { sES, sIV, sES, sES, sCW, sCW, sTW, sTW, sCL, sIV }, /* * sNO -> sES Assumed. * sSS -> sIV ACK is invalid: we haven't seen a SYN/ACK yet. * sS2 -> sIV * sSR -> sES Established state is reached. * sES -> sES :-) * sFW -> sCW Normal close request answered by ACK. * sCW -> sCW * sLA -> sTW Last ACK detected (RFC5961 challenged) * sTW -> sTW Retransmitted last ACK. Remain in the same state. * sCL -> sCL */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*rst*/ { sIV, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL }, /*none*/ { sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV } }, { /* REPLY */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*syn*/ { sIV, sS2, sIV, sIV, sIV, sIV, sIV, sSS, sIV, sS2 }, /* * sNO -> sIV Never reached. * sSS -> sS2 Simultaneous open * sS2 -> sS2 Retransmitted simultaneous SYN * sSR -> sIV Invalid SYN packets sent by the server * sES -> sIV * sFW -> sIV * sCW -> sIV * sLA -> sIV * sTW -> sSS Reopened connection, but server may have switched role * sCL -> sIV */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*synack*/ { sIV, sSR, sIG, sIG, sIG, sIG, sIG, sIG, sIG, sSR }, /* * sSS -> sSR Standard open. * sS2 -> sSR Simultaneous open * sSR -> sIG Retransmitted SYN/ACK, ignore it. * sES -> sIG Late retransmitted SYN/ACK? * sFW -> sIG Might be SYN/ACK answering ignored SYN * sCW -> sIG * sLA -> sIG * sTW -> sIG * sCL -> sIG */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*fin*/ { sIV, sIV, sFW, sFW, sLA, sLA, sLA, sTW, sCL, sIV }, /* * sSS -> sIV Server might not send FIN in this state. * sS2 -> sIV * sSR -> sFW Close started. * sES -> sFW * sFW -> sLA FIN seen in both directions. * sCW -> sLA * sLA -> sLA Retransmitted FIN. * sTW -> sTW * sCL -> sCL */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*ack*/ { sIV, sIG, sSR, sES, sCW, sCW, sTW, sTW, sCL, sIG }, /* * sSS -> sIG Might be a half-open connection. * sS2 -> sIG * sSR -> sSR Might answer late resent SYN. * sES -> sES :-) * sFW -> sCW Normal close request answered by ACK. * sCW -> sCW * sLA -> sTW Last ACK detected (RFC5961 challenged) * sTW -> sTW Retransmitted last ACK. * sCL -> sCL */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ /*rst*/ { sIV, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL }, /*none*/ { sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV, sIV } } }; #ifdef CONFIG_NF_CONNTRACK_PROCFS /* Print out the private part of the conntrack. */ static void tcp_print_conntrack(struct seq_file *s, struct nf_conn *ct) { if (test_bit(IPS_OFFLOAD_BIT, &ct->status)) return; seq_printf(s, "%s ", tcp_conntrack_names[ct->proto.tcp.state]); } #endif static unsigned int get_conntrack_index(const struct tcphdr *tcph) { if (tcph->rst) return TCP_RST_SET; else if (tcph->syn) return (tcph->ack ? TCP_SYNACK_SET : TCP_SYN_SET); else if (tcph->fin) return TCP_FIN_SET; else if (tcph->ack) return TCP_ACK_SET; else return TCP_NONE_SET; } /* TCP connection tracking based on 'Real Stateful TCP Packet Filtering in IP Filter' by Guido van Rooij. http://www.sane.nl/events/sane2000/papers.html http://www.darkart.com/mirrors/www.obfuscation.org/ipf/ The boundaries and the conditions are changed according to RFC793: the packet must intersect the window (i.e. segments may be after the right or before the left edge) and thus receivers may ACK segments after the right edge of the window. td_maxend = max(sack + max(win,1)) seen in reply packets td_maxwin = max(max(win, 1)) + (sack - ack) seen in sent packets td_maxwin += seq + len - sender.td_maxend if seq + len > sender.td_maxend td_end = max(seq + len) seen in sent packets I. Upper bound for valid data: seq <= sender.td_maxend II. Lower bound for valid data: seq + len >= sender.td_end - receiver.td_maxwin III. Upper bound for valid (s)ack: sack <= receiver.td_end IV. Lower bound for valid (s)ack: sack >= receiver.td_end - MAXACKWINDOW where sack is the highest right edge of sack block found in the packet or ack in the case of packet without SACK option. The upper bound limit for a valid (s)ack is not ignored - we doesn't have to deal with fragments. */ static inline __u32 segment_seq_plus_len(__u32 seq, size_t len, unsigned int dataoff, const struct tcphdr *tcph) { /* XXX Should I use payload length field in IP/IPv6 header ? * - YK */ return (seq + len - dataoff - tcph->doff*4 + (tcph->syn ? 1 : 0) + (tcph->fin ? 1 : 0)); } /* Fixme: what about big packets? */ #define MAXACKWINCONST 66000 #define MAXACKWINDOW(sender) \ ((sender)->td_maxwin > MAXACKWINCONST ? (sender)->td_maxwin \ : MAXACKWINCONST) /* * Simplified tcp_parse_options routine from tcp_input.c */ static void tcp_options(const struct sk_buff *skb, unsigned int dataoff, const struct tcphdr *tcph, struct ip_ct_tcp_state *state) { unsigned char buff[(15 * 4) - sizeof(struct tcphdr)]; const unsigned char *ptr; int length = (tcph->doff*4) - sizeof(struct tcphdr); if (!length) return; ptr = skb_header_pointer(skb, dataoff + sizeof(struct tcphdr), length, buff); if (!ptr) return; state->td_scale = 0; state->flags &= IP_CT_TCP_FLAG_BE_LIBERAL; while (length > 0) { int opcode=*ptr++; int opsize; switch (opcode) { case TCPOPT_EOL: return; case TCPOPT_NOP: /* Ref: RFC 793 section 3.1 */ length--; continue; default: if (length < 2) return; opsize=*ptr++; if (opsize < 2) /* "silly options" */ return; if (opsize > length) return; /* don't parse partial options */ if (opcode == TCPOPT_SACK_PERM && opsize == TCPOLEN_SACK_PERM) state->flags |= IP_CT_TCP_FLAG_SACK_PERM; else if (opcode == TCPOPT_WINDOW && opsize == TCPOLEN_WINDOW) { state->td_scale = *(u_int8_t *)ptr; if (state->td_scale > TCP_MAX_WSCALE) state->td_scale = TCP_MAX_WSCALE; state->flags |= IP_CT_TCP_FLAG_WINDOW_SCALE; } ptr += opsize - 2; length -= opsize; } } } static void tcp_sack(const struct sk_buff *skb, unsigned int dataoff, const struct tcphdr *tcph, __u32 *sack) { unsigned char buff[(15 * 4) - sizeof(struct tcphdr)]; const unsigned char *ptr; int length = (tcph->doff*4) - sizeof(struct tcphdr); __u32 tmp; if (!length) return; ptr = skb_header_pointer(skb, dataoff + sizeof(struct tcphdr), length, buff); if (!ptr) return; /* Fast path for timestamp-only option */ if (length == TCPOLEN_TSTAMP_ALIGNED && *(__be32 *)ptr == htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | (TCPOPT_TIMESTAMP << 8) | TCPOLEN_TIMESTAMP)) return; while (length > 0) { int opcode = *ptr++; int opsize, i; switch (opcode) { case TCPOPT_EOL: return; case TCPOPT_NOP: /* Ref: RFC 793 section 3.1 */ length--; continue; default: if (length < 2) return; opsize = *ptr++; if (opsize < 2) /* "silly options" */ return; if (opsize > length) return; /* don't parse partial options */ if (opcode == TCPOPT_SACK && opsize >= (TCPOLEN_SACK_BASE + TCPOLEN_SACK_PERBLOCK) && !((opsize - TCPOLEN_SACK_BASE) % TCPOLEN_SACK_PERBLOCK)) { for (i = 0; i < (opsize - TCPOLEN_SACK_BASE); i += TCPOLEN_SACK_PERBLOCK) { tmp = get_unaligned_be32((__be32 *)(ptr+i)+1); if (after(tmp, *sack)) *sack = tmp; } return; } ptr += opsize - 2; length -= opsize; } } } static void tcp_init_sender(struct ip_ct_tcp_state *sender, struct ip_ct_tcp_state *receiver, const struct sk_buff *skb, unsigned int dataoff, const struct tcphdr *tcph, u32 end, u32 win, enum ip_conntrack_dir dir) { /* SYN-ACK in reply to a SYN * or SYN from reply direction in simultaneous open. */ sender->td_end = sender->td_maxend = end; sender->td_maxwin = (win == 0 ? 1 : win); tcp_options(skb, dataoff, tcph, sender); /* RFC 1323: * Both sides must send the Window Scale option * to enable window scaling in either direction. */ if (dir == IP_CT_DIR_REPLY && !(sender->flags & IP_CT_TCP_FLAG_WINDOW_SCALE && receiver->flags & IP_CT_TCP_FLAG_WINDOW_SCALE)) { sender->td_scale = 0; receiver->td_scale = 0; } } __printf(6, 7) static enum nf_ct_tcp_action nf_tcp_log_invalid(const struct sk_buff *skb, const struct nf_conn *ct, const struct nf_hook_state *state, const struct ip_ct_tcp_state *sender, enum nf_ct_tcp_action ret, const char *fmt, ...) { const struct nf_tcp_net *tn = nf_tcp_pernet(nf_ct_net(ct)); struct va_format vaf; va_list args; bool be_liberal; be_liberal = sender->flags & IP_CT_TCP_FLAG_BE_LIBERAL || tn->tcp_be_liberal; if (be_liberal) return NFCT_TCP_ACCEPT; va_start(args, fmt); vaf.fmt = fmt; vaf.va = &args; nf_ct_l4proto_log_invalid(skb, ct, state, "%pV", &vaf); va_end(args); return ret; } static enum nf_ct_tcp_action tcp_in_window(struct nf_conn *ct, enum ip_conntrack_dir dir, unsigned int index, const struct sk_buff *skb, unsigned int dataoff, const struct tcphdr *tcph, const struct nf_hook_state *hook_state) { struct ip_ct_tcp *state = &ct->proto.tcp; struct ip_ct_tcp_state *sender = &state->seen[dir]; struct ip_ct_tcp_state *receiver = &state->seen[!dir]; __u32 seq, ack, sack, end, win, swin; bool in_recv_win, seq_ok; s32 receiver_offset; u16 win_raw; /* * Get the required data from the packet. */ seq = ntohl(tcph->seq); ack = sack = ntohl(tcph->ack_seq); win_raw = ntohs(tcph->window); win = win_raw; end = segment_seq_plus_len(seq, skb->len, dataoff, tcph); if (receiver->flags & IP_CT_TCP_FLAG_SACK_PERM) tcp_sack(skb, dataoff, tcph, &sack); /* Take into account NAT sequence number mangling */ receiver_offset = nf_ct_seq_offset(ct, !dir, ack - 1); ack -= receiver_offset; sack -= receiver_offset; if (sender->td_maxwin == 0) { /* * Initialize sender data. */ if (tcph->syn) { tcp_init_sender(sender, receiver, skb, dataoff, tcph, end, win, dir); if (!tcph->ack) /* Simultaneous open */ return NFCT_TCP_ACCEPT; } else { /* * We are in the middle of a connection, * its history is lost for us. * Let's try to use the data from the packet. */ sender->td_end = end; swin = win << sender->td_scale; sender->td_maxwin = (swin == 0 ? 1 : swin); sender->td_maxend = end + sender->td_maxwin; if (receiver->td_maxwin == 0) { /* We haven't seen traffic in the other * direction yet but we have to tweak window * tracking to pass III and IV until that * happens. */ receiver->td_end = receiver->td_maxend = sack; } else if (sack == receiver->td_end + 1) { /* Likely a reply to a keepalive. * Needed for III. */ receiver->td_end++; } } } else if (tcph->syn && after(end, sender->td_end) && (state->state == TCP_CONNTRACK_SYN_SENT || state->state == TCP_CONNTRACK_SYN_RECV)) { /* * RFC 793: "if a TCP is reinitialized ... then it need * not wait at all; it must only be sure to use sequence * numbers larger than those recently used." * * Re-init state for this direction, just like for the first * syn(-ack) reply, it might differ in seq, ack or tcp options. */ tcp_init_sender(sender, receiver, skb, dataoff, tcph, end, win, dir); if (dir == IP_CT_DIR_REPLY && !tcph->ack) return NFCT_TCP_ACCEPT; } if (!(tcph->ack)) { /* * If there is no ACK, just pretend it was set and OK. */ ack = sack = receiver->td_end; } else if (((tcp_flag_word(tcph) & (TCP_FLAG_ACK|TCP_FLAG_RST)) == (TCP_FLAG_ACK|TCP_FLAG_RST)) && (ack == 0)) { /* * Broken TCP stacks, that set ACK in RST packets as well * with zero ack value. */ ack = sack = receiver->td_end; } if (tcph->rst && seq == 0 && state->state == TCP_CONNTRACK_SYN_SENT) /* * RST sent answering SYN. */ seq = end = sender->td_end; seq_ok = before(seq, sender->td_maxend + 1); if (!seq_ok) { u32 overshot = end - sender->td_maxend + 1; bool ack_ok; ack_ok = after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1); in_recv_win = receiver->td_maxwin && after(end, sender->td_end - receiver->td_maxwin - 1); if (in_recv_win && ack_ok && overshot <= receiver->td_maxwin && before(sack, receiver->td_end + 1)) { /* Work around TCPs that send more bytes than allowed by * the receive window. * * If the (marked as invalid) packet is allowed to pass by * the ruleset and the peer acks this data, then its possible * all future packets will trigger 'ACK is over upper bound' check. * * Thus if only the sequence check fails then do update td_end so * possible ACK for this data can update internal state. */ sender->td_end = end; sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_IGNORE, "%u bytes more than expected", overshot); } return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_INVALID, "SEQ is over upper bound %u (over the window of the receiver)", sender->td_maxend + 1); } if (!before(sack, receiver->td_end + 1)) return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_INVALID, "ACK is over upper bound %u (ACKed data not seen yet)", receiver->td_end + 1); /* Is the ending sequence in the receive window (if available)? */ in_recv_win = !receiver->td_maxwin || after(end, sender->td_end - receiver->td_maxwin - 1); if (!in_recv_win) return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_IGNORE, "SEQ is under lower bound %u (already ACKed data retransmitted)", sender->td_end - receiver->td_maxwin - 1); if (!after(sack, receiver->td_end - MAXACKWINDOW(sender) - 1)) return nf_tcp_log_invalid(skb, ct, hook_state, sender, NFCT_TCP_IGNORE, "ignored ACK under lower bound %u (possible overly delayed)", receiver->td_end - MAXACKWINDOW(sender) - 1); /* Take into account window scaling (RFC 1323). */ if (!tcph->syn) win <<= sender->td_scale; /* Update sender data. */ swin = win + (sack - ack); if (sender->td_maxwin < swin) sender->td_maxwin = swin; if (after(end, sender->td_end)) { sender->td_end = end; sender->flags |= IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; } if (tcph->ack) { if (!(sender->flags & IP_CT_TCP_FLAG_MAXACK_SET)) { sender->td_maxack = ack; sender->flags |= IP_CT_TCP_FLAG_MAXACK_SET; } else if (after(ack, sender->td_maxack)) { sender->td_maxack = ack; } } /* Update receiver data. */ if (receiver->td_maxwin != 0 && after(end, sender->td_maxend)) receiver->td_maxwin += end - sender->td_maxend; if (after(sack + win, receiver->td_maxend - 1)) { receiver->td_maxend = sack + win; if (win == 0) receiver->td_maxend++; } if (ack == receiver->td_end) receiver->flags &= ~IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED; /* Check retransmissions. */ if (index == TCP_ACK_SET) { if (state->last_dir == dir && state->last_seq == seq && state->last_ack == ack && state->last_end == end && state->last_win == win_raw) { state->retrans++; } else { state->last_dir = dir; state->last_seq = seq; state->last_ack = ack; state->last_end = end; state->last_win = win_raw; state->retrans = 0; } } return NFCT_TCP_ACCEPT; } static void __cold nf_tcp_handle_invalid(struct nf_conn *ct, enum ip_conntrack_dir dir, int index, const struct sk_buff *skb, const struct nf_hook_state *hook_state) { const unsigned int *timeouts; const struct nf_tcp_net *tn; unsigned int timeout; u32 expires; if (!test_bit(IPS_ASSURED_BIT, &ct->status) || test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status)) return; /* We don't want to have connections hanging around in ESTABLISHED * state for long time 'just because' conntrack deemed a FIN/RST * out-of-window. * * Shrink the timeout just like when there is unacked data. * This speeds up eviction of 'dead' connections where the * connection and conntracks internal state are out of sync. */ switch (index) { case TCP_RST_SET: case TCP_FIN_SET: break; default: return; } if (ct->proto.tcp.last_dir != dir && (ct->proto.tcp.last_index == TCP_FIN_SET || ct->proto.tcp.last_index == TCP_RST_SET)) { expires = nf_ct_expires(ct); if (expires < 120 * HZ) return; tn = nf_tcp_pernet(nf_ct_net(ct)); timeouts = nf_ct_timeout_lookup(ct); if (!timeouts) timeouts = tn->timeouts; timeout = READ_ONCE(timeouts[TCP_CONNTRACK_UNACK]); if (expires > timeout) { nf_ct_l4proto_log_invalid(skb, ct, hook_state, "packet (index %d, dir %d) response for index %d lower timeout to %u", index, dir, ct->proto.tcp.last_index, timeout); WRITE_ONCE(ct->timeout, timeout + nfct_time_stamp); } } else { ct->proto.tcp.last_index = index; ct->proto.tcp.last_dir = dir; } } /* table of valid flag combinations - PUSH, ECE and CWR are always valid */ static const u8 tcp_valid_flags[(TCPHDR_FIN|TCPHDR_SYN|TCPHDR_RST|TCPHDR_ACK| TCPHDR_URG) + 1] = { [TCPHDR_SYN] = 1, [TCPHDR_SYN|TCPHDR_URG] = 1, [TCPHDR_SYN|TCPHDR_ACK] = 1, [TCPHDR_RST] = 1, [TCPHDR_RST|TCPHDR_ACK] = 1, [TCPHDR_FIN|TCPHDR_ACK] = 1, [TCPHDR_FIN|TCPHDR_ACK|TCPHDR_URG] = 1, [TCPHDR_ACK] = 1, [TCPHDR_ACK|TCPHDR_URG] = 1, }; static void tcp_error_log(const struct sk_buff *skb, const struct nf_hook_state *state, const char *msg) { nf_l4proto_log_invalid(skb, state, IPPROTO_TCP, "%s", msg); } /* Protect conntrack agaist broken packets. Code taken from ipt_unclean.c. */ static bool tcp_error(const struct tcphdr *th, struct sk_buff *skb, unsigned int dataoff, const struct nf_hook_state *state) { unsigned int tcplen = skb->len - dataoff; u8 tcpflags; /* Not whole TCP header or malformed packet */ if (th->doff*4 < sizeof(struct tcphdr) || tcplen < th->doff*4) { tcp_error_log(skb, state, "truncated packet"); return true; } /* Checksum invalid? Ignore. * We skip checking packets on the outgoing path * because the checksum is assumed to be correct. */ /* FIXME: Source route IP option packets --RR */ if (state->net->ct.sysctl_checksum && state->hook == NF_INET_PRE_ROUTING && nf_checksum(skb, state->hook, dataoff, IPPROTO_TCP, state->pf)) { tcp_error_log(skb, state, "bad checksum"); return true; } /* Check TCP flags. */ tcpflags = (tcp_flag_byte(th) & ~(TCPHDR_ECE|TCPHDR_CWR|TCPHDR_PSH)); if (!tcp_valid_flags[tcpflags]) { tcp_error_log(skb, state, "invalid tcp flag combination"); return true; } return false; } static noinline bool tcp_new(struct nf_conn *ct, const struct sk_buff *skb, unsigned int dataoff, const struct tcphdr *th, const struct nf_hook_state *state) { enum tcp_conntrack new_state; struct net *net = nf_ct_net(ct); const struct nf_tcp_net *tn = nf_tcp_pernet(net); /* Don't need lock here: this conntrack not in circulation yet */ new_state = tcp_conntracks[0][get_conntrack_index(th)][TCP_CONNTRACK_NONE]; /* Invalid: delete conntrack */ if (new_state >= TCP_CONNTRACK_MAX) { tcp_error_log(skb, state, "invalid new"); return false; } if (new_state == TCP_CONNTRACK_SYN_SENT) { memset(&ct->proto.tcp, 0, sizeof(ct->proto.tcp)); /* SYN packet */ ct->proto.tcp.seen[0].td_end = segment_seq_plus_len(ntohl(th->seq), skb->len, dataoff, th); ct->proto.tcp.seen[0].td_maxwin = ntohs(th->window); if (ct->proto.tcp.seen[0].td_maxwin == 0) ct->proto.tcp.seen[0].td_maxwin = 1; ct->proto.tcp.seen[0].td_maxend = ct->proto.tcp.seen[0].td_end; tcp_options(skb, dataoff, th, &ct->proto.tcp.seen[0]); } else if (tn->tcp_loose == 0) { /* Don't try to pick up connections. */ return false; } else { memset(&ct->proto.tcp, 0, sizeof(ct->proto.tcp)); /* * We are in the middle of a connection, * its history is lost for us. * Let's try to use the data from the packet. */ ct->proto.tcp.seen[0].td_end = segment_seq_plus_len(ntohl(th->seq), skb->len, dataoff, th); ct->proto.tcp.seen[0].td_maxwin = ntohs(th->window); if (ct->proto.tcp.seen[0].td_maxwin == 0) ct->proto.tcp.seen[0].td_maxwin = 1; ct->proto.tcp.seen[0].td_maxend = ct->proto.tcp.seen[0].td_end + ct->proto.tcp.seen[0].td_maxwin; /* We assume SACK and liberal window checking to handle * window scaling */ ct->proto.tcp.seen[0].flags = ct->proto.tcp.seen[1].flags = IP_CT_TCP_FLAG_SACK_PERM | IP_CT_TCP_FLAG_BE_LIBERAL; } /* tcp_packet will set them */ ct->proto.tcp.last_index = TCP_NONE_SET; return true; } static bool tcp_can_early_drop(const struct nf_conn *ct) { switch (ct->proto.tcp.state) { case TCP_CONNTRACK_FIN_WAIT: case TCP_CONNTRACK_LAST_ACK: case TCP_CONNTRACK_TIME_WAIT: case TCP_CONNTRACK_CLOSE: case TCP_CONNTRACK_CLOSE_WAIT: return true; default: break; } return false; } void nf_conntrack_tcp_set_closing(struct nf_conn *ct) { enum tcp_conntrack old_state; const unsigned int *timeouts; u32 timeout; if (!nf_ct_is_confirmed(ct)) return; spin_lock_bh(&ct->lock); old_state = ct->proto.tcp.state; ct->proto.tcp.state = TCP_CONNTRACK_CLOSE; if (old_state == TCP_CONNTRACK_CLOSE || test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status)) { spin_unlock_bh(&ct->lock); return; } timeouts = nf_ct_timeout_lookup(ct); if (!timeouts) { const struct nf_tcp_net *tn; tn = nf_tcp_pernet(nf_ct_net(ct)); timeouts = tn->timeouts; } timeout = timeouts[TCP_CONNTRACK_CLOSE]; WRITE_ONCE(ct->timeout, timeout + nfct_time_stamp); spin_unlock_bh(&ct->lock); nf_conntrack_event_cache(IPCT_PROTOINFO, ct); } static void nf_ct_tcp_state_reset(struct ip_ct_tcp_state *state) { state->td_end = 0; state->td_maxend = 0; state->td_maxwin = 0; state->td_maxack = 0; state->td_scale = 0; state->flags &= IP_CT_TCP_FLAG_BE_LIBERAL; } /* Returns verdict for packet, or -1 for invalid. */ int nf_conntrack_tcp_packet(struct nf_conn *ct, struct sk_buff *skb, unsigned int dataoff, enum ip_conntrack_info ctinfo, const struct nf_hook_state *state) { struct net *net = nf_ct_net(ct); struct nf_tcp_net *tn = nf_tcp_pernet(net); enum tcp_conntrack new_state, old_state; unsigned int index, *timeouts; enum nf_ct_tcp_action res; enum ip_conntrack_dir dir; const struct tcphdr *th; struct tcphdr _tcph; unsigned long timeout; th = skb_header_pointer(skb, dataoff, sizeof(_tcph), &_tcph); if (th == NULL) return -NF_ACCEPT; if (tcp_error(th, skb, dataoff, state)) return -NF_ACCEPT; if (!nf_ct_is_confirmed(ct) && !tcp_new(ct, skb, dataoff, th, state)) return -NF_ACCEPT; spin_lock_bh(&ct->lock); old_state = ct->proto.tcp.state; dir = CTINFO2DIR(ctinfo); index = get_conntrack_index(th); new_state = tcp_conntracks[dir][index][old_state]; switch (new_state) { case TCP_CONNTRACK_SYN_SENT: if (old_state < TCP_CONNTRACK_TIME_WAIT) break; /* RFC 1122: "When a connection is closed actively, * it MUST linger in TIME-WAIT state for a time 2xMSL * (Maximum Segment Lifetime). However, it MAY accept * a new SYN from the remote TCP to reopen the connection * directly from TIME-WAIT state, if..." * We ignore the conditions because we are in the * TIME-WAIT state anyway. * * Handle aborted connections: we and the server * think there is an existing connection but the client * aborts it and starts a new one. */ if (((ct->proto.tcp.seen[dir].flags | ct->proto.tcp.seen[!dir].flags) & IP_CT_TCP_FLAG_CLOSE_INIT) || (ct->proto.tcp.last_dir == dir && ct->proto.tcp.last_index == TCP_RST_SET)) { /* Attempt to reopen a closed/aborted connection. * Delete this connection and look up again. */ spin_unlock_bh(&ct->lock); /* Only repeat if we can actually remove the timer. * Destruction may already be in progress in process * context and we must give it a chance to terminate. */ if (nf_ct_kill(ct)) return -NF_REPEAT; return NF_DROP; } fallthrough; case TCP_CONNTRACK_IGNORE: /* Ignored packets: * * Our connection entry may be out of sync, so ignore * packets which may signal the real connection between * the client and the server. * * a) SYN in ORIGINAL * b) SYN/ACK in REPLY * c) ACK in reply direction after initial SYN in original. * * If the ignored packet is invalid, the receiver will send * a RST we'll catch below. */ if (index == TCP_SYNACK_SET && ct->proto.tcp.last_index == TCP_SYN_SET && ct->proto.tcp.last_dir != dir && ntohl(th->ack_seq) == ct->proto.tcp.last_end) { /* b) This SYN/ACK acknowledges a SYN that we earlier * ignored as invalid. This means that the client and * the server are both in sync, while the firewall is * not. We get in sync from the previously annotated * values. */ old_state = TCP_CONNTRACK_SYN_SENT; new_state = TCP_CONNTRACK_SYN_RECV; ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_end = ct->proto.tcp.last_end; ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_maxend = ct->proto.tcp.last_end; ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_maxwin = ct->proto.tcp.last_win == 0 ? 1 : ct->proto.tcp.last_win; ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_scale = ct->proto.tcp.last_wscale; ct->proto.tcp.last_flags &= ~IP_CT_EXP_CHALLENGE_ACK; ct->proto.tcp.seen[ct->proto.tcp.last_dir].flags = ct->proto.tcp.last_flags; nf_ct_tcp_state_reset(&ct->proto.tcp.seen[dir]); break; } ct->proto.tcp.last_index = index; ct->proto.tcp.last_dir = dir; ct->proto.tcp.last_seq = ntohl(th->seq); ct->proto.tcp.last_end = segment_seq_plus_len(ntohl(th->seq), skb->len, dataoff, th); ct->proto.tcp.last_win = ntohs(th->window); /* a) This is a SYN in ORIGINAL. The client and the server * may be in sync but we are not. In that case, we annotate * the TCP options and let the packet go through. If it is a * valid SYN packet, the server will reply with a SYN/ACK, and * then we'll get in sync. Otherwise, the server potentially * responds with a challenge ACK if implementing RFC5961. */ if (index == TCP_SYN_SET && dir == IP_CT_DIR_ORIGINAL) { struct ip_ct_tcp_state seen = {}; ct->proto.tcp.last_flags = ct->proto.tcp.last_wscale = 0; tcp_options(skb, dataoff, th, &seen); if (seen.flags & IP_CT_TCP_FLAG_WINDOW_SCALE) { ct->proto.tcp.last_flags |= IP_CT_TCP_FLAG_WINDOW_SCALE; ct->proto.tcp.last_wscale = seen.td_scale; } if (seen.flags & IP_CT_TCP_FLAG_SACK_PERM) { ct->proto.tcp.last_flags |= IP_CT_TCP_FLAG_SACK_PERM; } /* Mark the potential for RFC5961 challenge ACK, * this pose a special problem for LAST_ACK state * as ACK is intrepretated as ACKing last FIN. */ if (old_state == TCP_CONNTRACK_LAST_ACK) ct->proto.tcp.last_flags |= IP_CT_EXP_CHALLENGE_ACK; } /* possible challenge ack reply to syn */ if (old_state == TCP_CONNTRACK_SYN_SENT && index == TCP_ACK_SET && dir == IP_CT_DIR_REPLY) ct->proto.tcp.last_ack = ntohl(th->ack_seq); spin_unlock_bh(&ct->lock); nf_ct_l4proto_log_invalid(skb, ct, state, "packet (index %d) in dir %d ignored, state %s", index, dir, tcp_conntrack_names[old_state]); return NF_ACCEPT; case TCP_CONNTRACK_MAX: /* Special case for SYN proxy: when the SYN to the server or * the SYN/ACK from the server is lost, the client may transmit * a keep-alive packet while in SYN_SENT state. This needs to * be associated with the original conntrack entry in order to * generate a new SYN with the correct sequence number. */ if (nfct_synproxy(ct) && old_state == TCP_CONNTRACK_SYN_SENT && index == TCP_ACK_SET && dir == IP_CT_DIR_ORIGINAL && ct->proto.tcp.last_dir == IP_CT_DIR_ORIGINAL && ct->proto.tcp.seen[dir].td_end - 1 == ntohl(th->seq)) { pr_debug("nf_ct_tcp: SYN proxy client keep alive\n"); spin_unlock_bh(&ct->lock); return NF_ACCEPT; } /* Invalid packet */ spin_unlock_bh(&ct->lock); nf_ct_l4proto_log_invalid(skb, ct, state, "packet (index %d) in dir %d invalid, state %s", index, dir, tcp_conntrack_names[old_state]); return -NF_ACCEPT; case TCP_CONNTRACK_TIME_WAIT: /* RFC5961 compliance cause stack to send "challenge-ACK" * e.g. in response to spurious SYNs. Conntrack MUST * not believe this ACK is acking last FIN. */ if (old_state == TCP_CONNTRACK_LAST_ACK && index == TCP_ACK_SET && ct->proto.tcp.last_dir != dir && ct->proto.tcp.last_index == TCP_SYN_SET && (ct->proto.tcp.last_flags & IP_CT_EXP_CHALLENGE_ACK)) { /* Detected RFC5961 challenge ACK */ ct->proto.tcp.last_flags &= ~IP_CT_EXP_CHALLENGE_ACK; spin_unlock_bh(&ct->lock); nf_ct_l4proto_log_invalid(skb, ct, state, "challenge-ack ignored"); return NF_ACCEPT; /* Don't change state */ } break; case TCP_CONNTRACK_SYN_SENT2: /* tcp_conntracks table is not smart enough to handle * simultaneous open. */ ct->proto.tcp.last_flags |= IP_CT_TCP_SIMULTANEOUS_OPEN; break; case TCP_CONNTRACK_SYN_RECV: if (dir == IP_CT_DIR_REPLY && index == TCP_ACK_SET && ct->proto.tcp.last_flags & IP_CT_TCP_SIMULTANEOUS_OPEN) new_state = TCP_CONNTRACK_ESTABLISHED; break; case TCP_CONNTRACK_CLOSE: if (index != TCP_RST_SET) break; /* If we are closing, tuple might have been re-used already. * last_index, last_ack, and all other ct fields used for * sequence/window validation are outdated in that case. * * As the conntrack can already be expired by GC under pressure, * just skip validation checks. */ if (tcp_can_early_drop(ct)) goto in_window; /* td_maxack might be outdated if we let a SYN through earlier */ if ((ct->proto.tcp.seen[!dir].flags & IP_CT_TCP_FLAG_MAXACK_SET) && ct->proto.tcp.last_index != TCP_SYN_SET) { u32 seq = ntohl(th->seq); /* If we are not in established state and SEQ=0 this is most * likely an answer to a SYN we let go through above (last_index * can be updated due to out-of-order ACKs). */ if (seq == 0 && !nf_conntrack_tcp_established(ct)) break; if (before(seq, ct->proto.tcp.seen[!dir].td_maxack) && !tn->tcp_ignore_invalid_rst) { /* Invalid RST */ spin_unlock_bh(&ct->lock); nf_ct_l4proto_log_invalid(skb, ct, state, "invalid rst"); return -NF_ACCEPT; } if (!nf_conntrack_tcp_established(ct) || seq == ct->proto.tcp.seen[!dir].td_maxack) break; /* Check if rst is part of train, such as * foo:80 > bar:4379: P, 235946583:235946602(19) ack 42 * foo:80 > bar:4379: R, 235946602:235946602(0) ack 42 */ if (ct->proto.tcp.last_index == TCP_ACK_SET && ct->proto.tcp.last_dir == dir && seq == ct->proto.tcp.last_end) break; /* ... RST sequence number doesn't match exactly, keep * established state to allow a possible challenge ACK. */ new_state = old_state; } if (((test_bit(IPS_SEEN_REPLY_BIT, &ct->status) && ct->proto.tcp.last_index == TCP_SYN_SET) || (!test_bit(IPS_ASSURED_BIT, &ct->status) && ct->proto.tcp.last_index == TCP_ACK_SET)) && ntohl(th->ack_seq) == ct->proto.tcp.last_end) { /* RST sent to invalid SYN or ACK we had let through * at a) and c) above: * * a) SYN was in window then * c) we hold a half-open connection. * * Delete our connection entry. * We skip window checking, because packet might ACK * segments we ignored. */ goto in_window; } /* Reset in response to a challenge-ack we let through earlier */ if (old_state == TCP_CONNTRACK_SYN_SENT && ct->proto.tcp.last_index == TCP_ACK_SET && ct->proto.tcp.last_dir == IP_CT_DIR_REPLY && ntohl(th->seq) == ct->proto.tcp.last_ack) goto in_window; break; default: /* Keep compilers happy. */ break; } res = tcp_in_window(ct, dir, index, skb, dataoff, th, state); switch (res) { case NFCT_TCP_IGNORE: spin_unlock_bh(&ct->lock); return NF_ACCEPT; case NFCT_TCP_INVALID: nf_tcp_handle_invalid(ct, dir, index, skb, state); spin_unlock_bh(&ct->lock); return -NF_ACCEPT; case NFCT_TCP_ACCEPT: break; } in_window: /* From now on we have got in-window packets */ ct->proto.tcp.last_index = index; ct->proto.tcp.last_dir = dir; ct->proto.tcp.state = new_state; if (old_state != new_state && new_state == TCP_CONNTRACK_FIN_WAIT) ct->proto.tcp.seen[dir].flags |= IP_CT_TCP_FLAG_CLOSE_INIT; timeouts = nf_ct_timeout_lookup(ct); if (!timeouts) timeouts = tn->timeouts; if (ct->proto.tcp.retrans >= tn->tcp_max_retrans && timeouts[new_state] > timeouts[TCP_CONNTRACK_RETRANS]) timeout = timeouts[TCP_CONNTRACK_RETRANS]; else if (unlikely(index == TCP_RST_SET)) timeout = timeouts[TCP_CONNTRACK_CLOSE]; else if ((ct->proto.tcp.seen[0].flags | ct->proto.tcp.seen[1].flags) & IP_CT_TCP_FLAG_DATA_UNACKNOWLEDGED && timeouts[new_state] > timeouts[TCP_CONNTRACK_UNACK]) timeout = timeouts[TCP_CONNTRACK_UNACK]; else if (ct->proto.tcp.last_win == 0 && timeouts[new_state] > timeouts[TCP_CONNTRACK_RETRANS]) timeout = timeouts[TCP_CONNTRACK_RETRANS]; else timeout = timeouts[new_state]; spin_unlock_bh(&ct->lock); if (new_state != old_state) nf_conntrack_event_cache(IPCT_PROTOINFO, ct); if (!test_bit(IPS_SEEN_REPLY_BIT, &ct->status)) { /* If only reply is a RST, we can consider ourselves not to have an established connection: this is a fairly common problem case, so we can delete the conntrack immediately. --RR */ if (th->rst) { nf_ct_kill_acct(ct, ctinfo, skb); return NF_ACCEPT; } if (index == TCP_SYN_SET && old_state == TCP_CONNTRACK_SYN_SENT) { /* do not renew timeout on SYN retransmit. * * Else port reuse by client or NAT middlebox can keep * entry alive indefinitely (including nat info). */ return NF_ACCEPT; } /* ESTABLISHED without SEEN_REPLY, i.e. mid-connection * pickup with loose=1. Avoid large ESTABLISHED timeout. */ if (new_state == TCP_CONNTRACK_ESTABLISHED && timeout > timeouts[TCP_CONNTRACK_UNACK]) timeout = timeouts[TCP_CONNTRACK_UNACK]; } else if (!test_bit(IPS_ASSURED_BIT, &ct->status) && (old_state == TCP_CONNTRACK_SYN_RECV || old_state == TCP_CONNTRACK_ESTABLISHED) && new_state == TCP_CONNTRACK_ESTABLISHED) { /* Set ASSURED if we see valid ack in ESTABLISHED after SYN_RECV or a valid answer for a picked up connection. */ set_bit(IPS_ASSURED_BIT, &ct->status); nf_conntrack_event_cache(IPCT_ASSURED, ct); } nf_ct_refresh_acct(ct, ctinfo, skb, timeout); return NF_ACCEPT; } #if IS_ENABLED(CONFIG_NF_CT_NETLINK) #include <linux/netfilter/nfnetlink.h> #include <linux/netfilter/nfnetlink_conntrack.h> static int tcp_to_nlattr(struct sk_buff *skb, struct nlattr *nla, struct nf_conn *ct, bool destroy) { struct nlattr *nest_parms; struct nf_ct_tcp_flags tmp = {}; spin_lock_bh(&ct->lock); nest_parms = nla_nest_start(skb, CTA_PROTOINFO_TCP); if (!nest_parms) goto nla_put_failure; if (nla_put_u8(skb, CTA_PROTOINFO_TCP_STATE, ct->proto.tcp.state)) goto nla_put_failure; if (destroy) goto skip_state; if (nla_put_u8(skb, CTA_PROTOINFO_TCP_WSCALE_ORIGINAL, ct->proto.tcp.seen[0].td_scale) || nla_put_u8(skb, CTA_PROTOINFO_TCP_WSCALE_REPLY, ct->proto.tcp.seen[1].td_scale)) goto nla_put_failure; tmp.flags = ct->proto.tcp.seen[0].flags; if (nla_put(skb, CTA_PROTOINFO_TCP_FLAGS_ORIGINAL, sizeof(struct nf_ct_tcp_flags), &tmp)) goto nla_put_failure; tmp.flags = ct->proto.tcp.seen[1].flags; if (nla_put(skb, CTA_PROTOINFO_TCP_FLAGS_REPLY, sizeof(struct nf_ct_tcp_flags), &tmp)) goto nla_put_failure; skip_state: spin_unlock_bh(&ct->lock); nla_nest_end(skb, nest_parms); return 0; nla_put_failure: spin_unlock_bh(&ct->lock); return -1; } static const struct nla_policy tcp_nla_policy[CTA_PROTOINFO_TCP_MAX+1] = { [CTA_PROTOINFO_TCP_STATE] = { .type = NLA_U8 }, [CTA_PROTOINFO_TCP_WSCALE_ORIGINAL] = { .type = NLA_U8 }, [CTA_PROTOINFO_TCP_WSCALE_REPLY] = { .type = NLA_U8 }, [CTA_PROTOINFO_TCP_FLAGS_ORIGINAL] = { .len = sizeof(struct nf_ct_tcp_flags) }, [CTA_PROTOINFO_TCP_FLAGS_REPLY] = { .len = sizeof(struct nf_ct_tcp_flags) }, }; #define TCP_NLATTR_SIZE ( \ NLA_ALIGN(NLA_HDRLEN + 1) + \ NLA_ALIGN(NLA_HDRLEN + 1) + \ NLA_ALIGN(NLA_HDRLEN + sizeof(struct nf_ct_tcp_flags)) + \ NLA_ALIGN(NLA_HDRLEN + sizeof(struct nf_ct_tcp_flags))) static int nlattr_to_tcp(struct nlattr *cda[], struct nf_conn *ct) { struct nlattr *pattr = cda[CTA_PROTOINFO_TCP]; struct nlattr *tb[CTA_PROTOINFO_TCP_MAX+1]; int err; /* updates could not contain anything about the private * protocol info, in that case skip the parsing */ if (!pattr) return 0; err = nla_parse_nested_deprecated(tb, CTA_PROTOINFO_TCP_MAX, pattr, tcp_nla_policy, NULL); if (err < 0) return err; if (tb[CTA_PROTOINFO_TCP_STATE] && nla_get_u8(tb[CTA_PROTOINFO_TCP_STATE]) >= TCP_CONNTRACK_MAX) return -EINVAL; spin_lock_bh(&ct->lock); if (tb[CTA_PROTOINFO_TCP_STATE]) ct->proto.tcp.state = nla_get_u8(tb[CTA_PROTOINFO_TCP_STATE]); if (tb[CTA_PROTOINFO_TCP_FLAGS_ORIGINAL]) { struct nf_ct_tcp_flags *attr = nla_data(tb[CTA_PROTOINFO_TCP_FLAGS_ORIGINAL]); ct->proto.tcp.seen[0].flags &= ~attr->mask; ct->proto.tcp.seen[0].flags |= attr->flags & attr->mask; } if (tb[CTA_PROTOINFO_TCP_FLAGS_REPLY]) { struct nf_ct_tcp_flags *attr = nla_data(tb[CTA_PROTOINFO_TCP_FLAGS_REPLY]); ct->proto.tcp.seen[1].flags &= ~attr->mask; ct->proto.tcp.seen[1].flags |= attr->flags & attr->mask; } if (tb[CTA_PROTOINFO_TCP_WSCALE_ORIGINAL] && tb[CTA_PROTOINFO_TCP_WSCALE_REPLY] && ct->proto.tcp.seen[0].flags & IP_CT_TCP_FLAG_WINDOW_SCALE && ct->proto.tcp.seen[1].flags & IP_CT_TCP_FLAG_WINDOW_SCALE) { ct->proto.tcp.seen[0].td_scale = nla_get_u8(tb[CTA_PROTOINFO_TCP_WSCALE_ORIGINAL]); ct->proto.tcp.seen[1].td_scale = nla_get_u8(tb[CTA_PROTOINFO_TCP_WSCALE_REPLY]); } spin_unlock_bh(&ct->lock); return 0; } static unsigned int tcp_nlattr_tuple_size(void) { static unsigned int size __read_mostly; if (!size) size = nla_policy_len(nf_ct_port_nla_policy, CTA_PROTO_MAX + 1); return size; } #endif #ifdef CONFIG_NF_CONNTRACK_TIMEOUT #include <linux/netfilter/nfnetlink.h> #include <linux/netfilter/nfnetlink_cttimeout.h> static int tcp_timeout_nlattr_to_obj(struct nlattr *tb[], struct net *net, void *data) { struct nf_tcp_net *tn = nf_tcp_pernet(net); unsigned int *timeouts = data; int i; if (!timeouts) timeouts = tn->timeouts; /* set default TCP timeouts. */ for (i=0; i<TCP_CONNTRACK_TIMEOUT_MAX; i++) timeouts[i] = tn->timeouts[i]; if (tb[CTA_TIMEOUT_TCP_SYN_SENT]) { timeouts[TCP_CONNTRACK_SYN_SENT] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_SYN_SENT]))*HZ; } if (tb[CTA_TIMEOUT_TCP_SYN_RECV]) { timeouts[TCP_CONNTRACK_SYN_RECV] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_SYN_RECV]))*HZ; } if (tb[CTA_TIMEOUT_TCP_ESTABLISHED]) { timeouts[TCP_CONNTRACK_ESTABLISHED] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_ESTABLISHED]))*HZ; } if (tb[CTA_TIMEOUT_TCP_FIN_WAIT]) { timeouts[TCP_CONNTRACK_FIN_WAIT] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_FIN_WAIT]))*HZ; } if (tb[CTA_TIMEOUT_TCP_CLOSE_WAIT]) { timeouts[TCP_CONNTRACK_CLOSE_WAIT] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_CLOSE_WAIT]))*HZ; } if (tb[CTA_TIMEOUT_TCP_LAST_ACK]) { timeouts[TCP_CONNTRACK_LAST_ACK] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_LAST_ACK]))*HZ; } if (tb[CTA_TIMEOUT_TCP_TIME_WAIT]) { timeouts[TCP_CONNTRACK_TIME_WAIT] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_TIME_WAIT]))*HZ; } if (tb[CTA_TIMEOUT_TCP_CLOSE]) { timeouts[TCP_CONNTRACK_CLOSE] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_CLOSE]))*HZ; } if (tb[CTA_TIMEOUT_TCP_SYN_SENT2]) { timeouts[TCP_CONNTRACK_SYN_SENT2] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_SYN_SENT2]))*HZ; } if (tb[CTA_TIMEOUT_TCP_RETRANS]) { timeouts[TCP_CONNTRACK_RETRANS] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_RETRANS]))*HZ; } if (tb[CTA_TIMEOUT_TCP_UNACK]) { timeouts[TCP_CONNTRACK_UNACK] = ntohl(nla_get_be32(tb[CTA_TIMEOUT_TCP_UNACK]))*HZ; } timeouts[CTA_TIMEOUT_TCP_UNSPEC] = timeouts[CTA_TIMEOUT_TCP_SYN_SENT]; return 0; } static int tcp_timeout_obj_to_nlattr(struct sk_buff *skb, const void *data) { const unsigned int *timeouts = data; if (nla_put_be32(skb, CTA_TIMEOUT_TCP_SYN_SENT, htonl(timeouts[TCP_CONNTRACK_SYN_SENT] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_SYN_RECV, htonl(timeouts[TCP_CONNTRACK_SYN_RECV] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_ESTABLISHED, htonl(timeouts[TCP_CONNTRACK_ESTABLISHED] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_FIN_WAIT, htonl(timeouts[TCP_CONNTRACK_FIN_WAIT] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_CLOSE_WAIT, htonl(timeouts[TCP_CONNTRACK_CLOSE_WAIT] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_LAST_ACK, htonl(timeouts[TCP_CONNTRACK_LAST_ACK] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_TIME_WAIT, htonl(timeouts[TCP_CONNTRACK_TIME_WAIT] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_CLOSE, htonl(timeouts[TCP_CONNTRACK_CLOSE] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_SYN_SENT2, htonl(timeouts[TCP_CONNTRACK_SYN_SENT2] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_RETRANS, htonl(timeouts[TCP_CONNTRACK_RETRANS] / HZ)) || nla_put_be32(skb, CTA_TIMEOUT_TCP_UNACK, htonl(timeouts[TCP_CONNTRACK_UNACK] / HZ))) goto nla_put_failure; return 0; nla_put_failure: return -ENOSPC; } static const struct nla_policy tcp_timeout_nla_policy[CTA_TIMEOUT_TCP_MAX+1] = { [CTA_TIMEOUT_TCP_SYN_SENT] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_SYN_RECV] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_ESTABLISHED] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_FIN_WAIT] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_CLOSE_WAIT] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_LAST_ACK] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_TIME_WAIT] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_CLOSE] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_SYN_SENT2] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_RETRANS] = { .type = NLA_U32 }, [CTA_TIMEOUT_TCP_UNACK] = { .type = NLA_U32 }, }; #endif /* CONFIG_NF_CONNTRACK_TIMEOUT */ void nf_conntrack_tcp_init_net(struct net *net) { struct nf_tcp_net *tn = nf_tcp_pernet(net); int i; for (i = 0; i < TCP_CONNTRACK_TIMEOUT_MAX; i++) tn->timeouts[i] = tcp_timeouts[i]; /* timeouts[0] is unused, make it same as SYN_SENT so * ->timeouts[0] contains 'new' timeout, like udp or icmp. */ tn->timeouts[0] = tcp_timeouts[TCP_CONNTRACK_SYN_SENT]; /* If it is set to zero, we disable picking up already established * connections. */ tn->tcp_loose = 1; /* "Be conservative in what you do, * be liberal in what you accept from others." * If it's non-zero, we mark only out of window RST segments as INVALID. */ tn->tcp_be_liberal = 0; /* If it's non-zero, we turn off RST sequence number check */ tn->tcp_ignore_invalid_rst = 0; /* Max number of the retransmitted packets without receiving an (acceptable) * ACK from the destination. If this number is reached, a shorter timer * will be started. */ tn->tcp_max_retrans = 3; #if IS_ENABLED(CONFIG_NF_FLOW_TABLE) tn->offload_timeout = 30 * HZ; #endif } const struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp = { .l4proto = IPPROTO_TCP, #ifdef CONFIG_NF_CONNTRACK_PROCFS .print_conntrack = tcp_print_conntrack, #endif .can_early_drop = tcp_can_early_drop, #if IS_ENABLED(CONFIG_NF_CT_NETLINK) .to_nlattr = tcp_to_nlattr, .from_nlattr = nlattr_to_tcp, .tuple_to_nlattr = nf_ct_port_tuple_to_nlattr, .nlattr_to_tuple = nf_ct_port_nlattr_to_tuple, .nlattr_tuple_size = tcp_nlattr_tuple_size, .nlattr_size = TCP_NLATTR_SIZE, .nla_policy = nf_ct_port_nla_policy, #endif #ifdef CONFIG_NF_CONNTRACK_TIMEOUT .ctnl_timeout = { .nlattr_to_obj = tcp_timeout_nlattr_to_obj, .obj_to_nlattr = tcp_timeout_obj_to_nlattr, .nlattr_max = CTA_TIMEOUT_TCP_MAX, .obj_size = sizeof(unsigned int) * TCP_CONNTRACK_TIMEOUT_MAX, .nla_policy = tcp_timeout_nla_policy, }, #endif /* CONFIG_NF_CONNTRACK_TIMEOUT */ }; |
1 1 1 19 21 21 20 19 19 1 1 21 1 12 9 19 1 6 6 1 1 1 1 20 21 1 1 20 21 20 21 19 1 1 1 1 1 1 1 1 7 1 1 6 5 6 5 1 6 37 38 40 40 1 1 5 5 2 5 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 | // SPDX-License-Identifier: GPL-2.0 /* * Basic worker thread pool for io_uring * * Copyright (C) 2019 Jens Axboe * */ #include <linux/kernel.h> #include <linux/init.h> #include <linux/errno.h> #include <linux/sched/signal.h> #include <linux/percpu.h> #include <linux/slab.h> #include <linux/rculist_nulls.h> #include <linux/cpu.h> #include <linux/task_work.h> #include <linux/audit.h> #include <linux/mmu_context.h> #include <uapi/linux/io_uring.h> #include "io-wq.h" #include "slist.h" #include "io_uring.h" #define WORKER_IDLE_TIMEOUT (5 * HZ) enum { IO_WORKER_F_UP = 1, /* up and active */ IO_WORKER_F_RUNNING = 2, /* account as running */ IO_WORKER_F_FREE = 4, /* worker on free list */ IO_WORKER_F_BOUND = 8, /* is doing bounded work */ }; enum { IO_WQ_BIT_EXIT = 0, /* wq exiting */ }; enum { IO_ACCT_STALLED_BIT = 0, /* stalled on hash */ }; /* * One for each thread in a wq pool */ struct io_worker { refcount_t ref; unsigned flags; struct hlist_nulls_node nulls_node; struct list_head all_list; struct task_struct *task; struct io_wq *wq; struct io_wq_work *cur_work; struct io_wq_work *next_work; raw_spinlock_t lock; struct completion ref_done; unsigned long create_state; struct callback_head create_work; int create_index; union { struct rcu_head rcu; struct work_struct work; }; }; #if BITS_PER_LONG == 64 #define IO_WQ_HASH_ORDER 6 #else #define IO_WQ_HASH_ORDER 5 #endif #define IO_WQ_NR_HASH_BUCKETS (1u << IO_WQ_HASH_ORDER) struct io_wq_acct { unsigned nr_workers; unsigned max_workers; int index; atomic_t nr_running; raw_spinlock_t lock; struct io_wq_work_list work_list; unsigned long flags; }; enum { IO_WQ_ACCT_BOUND, IO_WQ_ACCT_UNBOUND, IO_WQ_ACCT_NR, }; /* * Per io_wq state */ struct io_wq { unsigned long state; free_work_fn *free_work; io_wq_work_fn *do_work; struct io_wq_hash *hash; atomic_t worker_refs; struct completion worker_done; struct hlist_node cpuhp_node; struct task_struct *task; struct io_wq_acct acct[IO_WQ_ACCT_NR]; /* lock protects access to elements below */ raw_spinlock_t lock; struct hlist_nulls_head free_list; struct list_head all_list; struct wait_queue_entry wait; struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS]; cpumask_var_t cpu_mask; }; static enum cpuhp_state io_wq_online; struct io_cb_cancel_data { work_cancel_fn *fn; void *data; int nr_running; int nr_pending; bool cancel_all; }; static bool create_io_worker(struct io_wq *wq, int index); static void io_wq_dec_running(struct io_worker *worker); static bool io_acct_cancel_pending_work(struct io_wq *wq, struct io_wq_acct *acct, struct io_cb_cancel_data *match); static void create_worker_cb(struct callback_head *cb); static void io_wq_cancel_tw_create(struct io_wq *wq); static bool io_worker_get(struct io_worker *worker) { return refcount_inc_not_zero(&worker->ref); } static void io_worker_release(struct io_worker *worker) { if (refcount_dec_and_test(&worker->ref)) complete(&worker->ref_done); } static inline struct io_wq_acct *io_get_acct(struct io_wq *wq, bool bound) { return &wq->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND]; } static inline struct io_wq_acct *io_work_get_acct(struct io_wq *wq, struct io_wq_work *work) { return io_get_acct(wq, !(work->flags & IO_WQ_WORK_UNBOUND)); } static inline struct io_wq_acct *io_wq_get_acct(struct io_worker *worker) { return io_get_acct(worker->wq, worker->flags & IO_WORKER_F_BOUND); } static void io_worker_ref_put(struct io_wq *wq) { if (atomic_dec_and_test(&wq->worker_refs)) complete(&wq->worker_done); } bool io_wq_worker_stopped(void) { struct io_worker *worker = current->worker_private; if (WARN_ON_ONCE(!io_wq_current_is_worker())) return true; return test_bit(IO_WQ_BIT_EXIT, &worker->wq->state); } static void io_worker_cancel_cb(struct io_worker *worker) { struct io_wq_acct *acct = io_wq_get_acct(worker); struct io_wq *wq = worker->wq; atomic_dec(&acct->nr_running); raw_spin_lock(&wq->lock); acct->nr_workers--; raw_spin_unlock(&wq->lock); io_worker_ref_put(wq); clear_bit_unlock(0, &worker->create_state); io_worker_release(worker); } static bool io_task_worker_match(struct callback_head *cb, void *data) { struct io_worker *worker; if (cb->func != create_worker_cb) return false; worker = container_of(cb, struct io_worker, create_work); return worker == data; } static void io_worker_exit(struct io_worker *worker) { struct io_wq *wq = worker->wq; while (1) { struct callback_head *cb = task_work_cancel_match(wq->task, io_task_worker_match, worker); if (!cb) break; io_worker_cancel_cb(worker); } io_worker_release(worker); wait_for_completion(&worker->ref_done); raw_spin_lock(&wq->lock); if (worker->flags & IO_WORKER_F_FREE) hlist_nulls_del_rcu(&worker->nulls_node); list_del_rcu(&worker->all_list); raw_spin_unlock(&wq->lock); io_wq_dec_running(worker); /* * this worker is a goner, clear ->worker_private to avoid any * inc/dec running calls that could happen as part of exit from * touching 'worker'. */ current->worker_private = NULL; kfree_rcu(worker, rcu); io_worker_ref_put(wq); do_exit(0); } static inline bool __io_acct_run_queue(struct io_wq_acct *acct) { return !test_bit(IO_ACCT_STALLED_BIT, &acct->flags) && !wq_list_empty(&acct->work_list); } /* * If there's work to do, returns true with acct->lock acquired. If not, * returns false with no lock held. */ static inline bool io_acct_run_queue(struct io_wq_acct *acct) __acquires(&acct->lock) { raw_spin_lock(&acct->lock); if (__io_acct_run_queue(acct)) return true; raw_spin_unlock(&acct->lock); return false; } /* * Check head of free list for an available worker. If one isn't available, * caller must create one. */ static bool io_wq_activate_free_worker(struct io_wq *wq, struct io_wq_acct *acct) __must_hold(RCU) { struct hlist_nulls_node *n; struct io_worker *worker; /* * Iterate free_list and see if we can find an idle worker to * activate. If a given worker is on the free_list but in the process * of exiting, keep trying. */ hlist_nulls_for_each_entry_rcu(worker, n, &wq->free_list, nulls_node) { if (!io_worker_get(worker)) continue; if (io_wq_get_acct(worker) != acct) { io_worker_release(worker); continue; } /* * If the worker is already running, it's either already * starting work or finishing work. In either case, if it does * to go sleep, we'll kick off a new task for this work anyway. */ wake_up_process(worker->task); io_worker_release(worker); return true; } return false; } /* * We need a worker. If we find a free one, we're good. If not, and we're * below the max number of workers, create one. */ static bool io_wq_create_worker(struct io_wq *wq, struct io_wq_acct *acct) { /* * Most likely an attempt to queue unbounded work on an io_wq that * wasn't setup with any unbounded workers. */ if (unlikely(!acct->max_workers)) pr_warn_once("io-wq is not configured for unbound workers"); raw_spin_lock(&wq->lock); if (acct->nr_workers >= acct->max_workers) { raw_spin_unlock(&wq->lock); return true; } acct->nr_workers++; raw_spin_unlock(&wq->lock); atomic_inc(&acct->nr_running); atomic_inc(&wq->worker_refs); return create_io_worker(wq, acct->index); } static void io_wq_inc_running(struct io_worker *worker) { struct io_wq_acct *acct = io_wq_get_acct(worker); atomic_inc(&acct->nr_running); } static void create_worker_cb(struct callback_head *cb) { struct io_worker *worker; struct io_wq *wq; struct io_wq_acct *acct; bool do_create = false; worker = container_of(cb, struct io_worker, create_work); wq = worker->wq; acct = &wq->acct[worker->create_index]; raw_spin_lock(&wq->lock); if (acct->nr_workers < acct->max_workers) { acct->nr_workers++; do_create = true; } raw_spin_unlock(&wq->lock); if (do_create) { create_io_worker(wq, worker->create_index); } else { atomic_dec(&acct->nr_running); io_worker_ref_put(wq); } clear_bit_unlock(0, &worker->create_state); io_worker_release(worker); } static bool io_queue_worker_create(struct io_worker *worker, struct io_wq_acct *acct, task_work_func_t func) { struct io_wq *wq = worker->wq; /* raced with exit, just ignore create call */ if (test_bit(IO_WQ_BIT_EXIT, &wq->state)) goto fail; if (!io_worker_get(worker)) goto fail; /* * create_state manages ownership of create_work/index. We should * only need one entry per worker, as the worker going to sleep * will trigger the condition, and waking will clear it once it * runs the task_work. */ if (test_bit(0, &worker->create_state) || test_and_set_bit_lock(0, &worker->create_state)) goto fail_release; atomic_inc(&wq->worker_refs); init_task_work(&worker->create_work, func); worker->create_index = acct->index; if (!task_work_add(wq->task, &worker->create_work, TWA_SIGNAL)) { /* * EXIT may have been set after checking it above, check after * adding the task_work and remove any creation item if it is * now set. wq exit does that too, but we can have added this * work item after we canceled in io_wq_exit_workers(). */ if (test_bit(IO_WQ_BIT_EXIT, &wq->state)) io_wq_cancel_tw_create(wq); io_worker_ref_put(wq); return true; } io_worker_ref_put(wq); clear_bit_unlock(0, &worker->create_state); fail_release: io_worker_release(worker); fail: atomic_dec(&acct->nr_running); io_worker_ref_put(wq); return false; } static void io_wq_dec_running(struct io_worker *worker) { struct io_wq_acct *acct = io_wq_get_acct(worker); struct io_wq *wq = worker->wq; if (!(worker->flags & IO_WORKER_F_UP)) return; if (!atomic_dec_and_test(&acct->nr_running)) return; if (!io_acct_run_queue(acct)) return; raw_spin_unlock(&acct->lock); atomic_inc(&acct->nr_running); atomic_inc(&wq->worker_refs); io_queue_worker_create(worker, acct, create_worker_cb); } /* * Worker will start processing some work. Move it to the busy list, if * it's currently on the freelist */ static void __io_worker_busy(struct io_wq *wq, struct io_worker *worker) { if (worker->flags & IO_WORKER_F_FREE) { worker->flags &= ~IO_WORKER_F_FREE; raw_spin_lock(&wq->lock); hlist_nulls_del_init_rcu(&worker->nulls_node); raw_spin_unlock(&wq->lock); } } /* * No work, worker going to sleep. Move to freelist. */ static void __io_worker_idle(struct io_wq *wq, struct io_worker *worker) __must_hold(wq->lock) { if (!(worker->flags & IO_WORKER_F_FREE)) { worker->flags |= IO_WORKER_F_FREE; hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list); } } static inline unsigned int io_get_work_hash(struct io_wq_work *work) { return work->flags >> IO_WQ_HASH_SHIFT; } static bool io_wait_on_hash(struct io_wq *wq, unsigned int hash) { bool ret = false; spin_lock_irq(&wq->hash->wait.lock); if (list_empty(&wq->wait.entry)) { __add_wait_queue(&wq->hash->wait, &wq->wait); if (!test_bit(hash, &wq->hash->map)) { __set_current_state(TASK_RUNNING); list_del_init(&wq->wait.entry); ret = true; } } spin_unlock_irq(&wq->hash->wait.lock); return ret; } static struct io_wq_work *io_get_next_work(struct io_wq_acct *acct, struct io_worker *worker) __must_hold(acct->lock) { struct io_wq_work_node *node, *prev; struct io_wq_work *work, *tail; unsigned int stall_hash = -1U; struct io_wq *wq = worker->wq; wq_list_for_each(node, prev, &acct->work_list) { unsigned int hash; work = container_of(node, struct io_wq_work, list); /* not hashed, can run anytime */ if (!io_wq_is_hashed(work)) { wq_list_del(&acct->work_list, node, prev); return work; } hash = io_get_work_hash(work); /* all items with this hash lie in [work, tail] */ tail = wq->hash_tail[hash]; /* hashed, can run if not already running */ if (!test_and_set_bit(hash, &wq->hash->map)) { wq->hash_tail[hash] = NULL; wq_list_cut(&acct->work_list, &tail->list, prev); return work; } if (stall_hash == -1U) stall_hash = hash; /* fast forward to a next hash, for-each will fix up @prev */ node = &tail->list; } if (stall_hash != -1U) { bool unstalled; /* * Set this before dropping the lock to avoid racing with new * work being added and clearing the stalled bit. */ set_bit(IO_ACCT_STALLED_BIT, &acct->flags); raw_spin_unlock(&acct->lock); unstalled = io_wait_on_hash(wq, stall_hash); raw_spin_lock(&acct->lock); if (unstalled) { clear_bit(IO_ACCT_STALLED_BIT, &acct->flags); if (wq_has_sleeper(&wq->hash->wait)) wake_up(&wq->hash->wait); } } return NULL; } static void io_assign_current_work(struct io_worker *worker, struct io_wq_work *work) { if (work) { io_run_task_work(); cond_resched(); } raw_spin_lock(&worker->lock); worker->cur_work = work; worker->next_work = NULL; raw_spin_unlock(&worker->lock); } /* * Called with acct->lock held, drops it before returning */ static void io_worker_handle_work(struct io_wq_acct *acct, struct io_worker *worker) __releases(&acct->lock) { struct io_wq *wq = worker->wq; bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state); do { struct io_wq_work *work; /* * If we got some work, mark us as busy. If we didn't, but * the list isn't empty, it means we stalled on hashed work. * Mark us stalled so we don't keep looking for work when we * can't make progress, any work completion or insertion will * clear the stalled flag. */ work = io_get_next_work(acct, worker); raw_spin_unlock(&acct->lock); if (work) { __io_worker_busy(wq, worker); /* * Make sure cancelation can find this, even before * it becomes the active work. That avoids a window * where the work has been removed from our general * work list, but isn't yet discoverable as the * current work item for this worker. */ raw_spin_lock(&worker->lock); worker->next_work = work; raw_spin_unlock(&worker->lock); } else { break; } io_assign_current_work(worker, work); __set_current_state(TASK_RUNNING); /* handle a whole dependent link */ do { struct io_wq_work *next_hashed, *linked; unsigned int hash = io_get_work_hash(work); next_hashed = wq_next_work(work); if (unlikely(do_kill) && (work->flags & IO_WQ_WORK_UNBOUND)) work->flags |= IO_WQ_WORK_CANCEL; wq->do_work(work); io_assign_current_work(worker, NULL); linked = wq->free_work(work); work = next_hashed; if (!work && linked && !io_wq_is_hashed(linked)) { work = linked; linked = NULL; } io_assign_current_work(worker, work); if (linked) io_wq_enqueue(wq, linked); if (hash != -1U && !next_hashed) { /* serialize hash clear with wake_up() */ spin_lock_irq(&wq->hash->wait.lock); clear_bit(hash, &wq->hash->map); clear_bit(IO_ACCT_STALLED_BIT, &acct->flags); spin_unlock_irq(&wq->hash->wait.lock); if (wq_has_sleeper(&wq->hash->wait)) wake_up(&wq->hash->wait); } } while (work); if (!__io_acct_run_queue(acct)) break; raw_spin_lock(&acct->lock); } while (1); } static int io_wq_worker(void *data) { struct io_worker *worker = data; struct io_wq_acct *acct = io_wq_get_acct(worker); struct io_wq *wq = worker->wq; bool exit_mask = false, last_timeout = false; char buf[TASK_COMM_LEN]; worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING); snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid); set_task_comm(current, buf); while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) { long ret; set_current_state(TASK_INTERRUPTIBLE); /* * If we have work to do, io_acct_run_queue() returns with * the acct->lock held. If not, it will drop it. */ while (io_acct_run_queue(acct)) io_worker_handle_work(acct, worker); raw_spin_lock(&wq->lock); /* * Last sleep timed out. Exit if we're not the last worker, * or if someone modified our affinity. */ if (last_timeout && (exit_mask || acct->nr_workers > 1)) { acct->nr_workers--; raw_spin_unlock(&wq->lock); __set_current_state(TASK_RUNNING); break; } last_timeout = false; __io_worker_idle(wq, worker); raw_spin_unlock(&wq->lock); if (io_run_task_work()) continue; ret = schedule_timeout(WORKER_IDLE_TIMEOUT); if (signal_pending(current)) { struct ksignal ksig; if (!get_signal(&ksig)) continue; break; } if (!ret) { last_timeout = true; exit_mask = !cpumask_test_cpu(raw_smp_processor_id(), wq->cpu_mask); } } if (test_bit(IO_WQ_BIT_EXIT, &wq->state) && io_acct_run_queue(acct)) io_worker_handle_work(acct, worker); io_worker_exit(worker); return 0; } /* * Called when a worker is scheduled in. Mark us as currently running. */ void io_wq_worker_running(struct task_struct *tsk) { struct io_worker *worker = tsk->worker_private; if (!worker) return; if (!(worker->flags & IO_WORKER_F_UP)) return; if (worker->flags & IO_WORKER_F_RUNNING) return; worker->flags |= IO_WORKER_F_RUNNING; io_wq_inc_running(worker); } /* * Called when worker is going to sleep. If there are no workers currently * running and we have work pending, wake up a free one or create a new one. */ void io_wq_worker_sleeping(struct task_struct *tsk) { struct io_worker *worker = tsk->worker_private; if (!worker) return; if (!(worker->flags & IO_WORKER_F_UP)) return; if (!(worker->flags & IO_WORKER_F_RUNNING)) return; worker->flags &= ~IO_WORKER_F_RUNNING; io_wq_dec_running(worker); } static void io_init_new_worker(struct io_wq *wq, struct io_worker *worker, struct task_struct *tsk) { tsk->worker_private = worker; worker->task = tsk; set_cpus_allowed_ptr(tsk, wq->cpu_mask); raw_spin_lock(&wq->lock); hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list); list_add_tail_rcu(&worker->all_list, &wq->all_list); worker->flags |= IO_WORKER_F_FREE; raw_spin_unlock(&wq->lock); wake_up_new_task(tsk); } static bool io_wq_work_match_all(struct io_wq_work *work, void *data) { return true; } static inline bool io_should_retry_thread(long err) { /* * Prevent perpetual task_work retry, if the task (or its group) is * exiting. */ if (fatal_signal_pending(current)) return false; switch (err) { case -EAGAIN: case -ERESTARTSYS: case -ERESTARTNOINTR: case -ERESTARTNOHAND: return true; default: return false; } } static void create_worker_cont(struct callback_head *cb) { struct io_worker *worker; struct task_struct *tsk; struct io_wq *wq; worker = container_of(cb, struct io_worker, create_work); clear_bit_unlock(0, &worker->create_state); wq = worker->wq; tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE); if (!IS_ERR(tsk)) { io_init_new_worker(wq, worker, tsk); io_worker_release(worker); return; } else if (!io_should_retry_thread(PTR_ERR(tsk))) { struct io_wq_acct *acct = io_wq_get_acct(worker); atomic_dec(&acct->nr_running); raw_spin_lock(&wq->lock); acct->nr_workers--; if (!acct->nr_workers) { struct io_cb_cancel_data match = { .fn = io_wq_work_match_all, .cancel_all = true, }; raw_spin_unlock(&wq->lock); while (io_acct_cancel_pending_work(wq, acct, &match)) ; } else { raw_spin_unlock(&wq->lock); } io_worker_ref_put(wq); kfree(worker); return; } /* re-create attempts grab a new worker ref, drop the existing one */ io_worker_release(worker); schedule_work(&worker->work); } static void io_workqueue_create(struct work_struct *work) { struct io_worker *worker = container_of(work, struct io_worker, work); struct io_wq_acct *acct = io_wq_get_acct(worker); if (!io_queue_worker_create(worker, acct, create_worker_cont)) kfree(worker); } static bool create_io_worker(struct io_wq *wq, int index) { struct io_wq_acct *acct = &wq->acct[index]; struct io_worker *worker; struct task_struct *tsk; __set_current_state(TASK_RUNNING); worker = kzalloc(sizeof(*worker), GFP_KERNEL); if (!worker) { fail: atomic_dec(&acct->nr_running); raw_spin_lock(&wq->lock); acct->nr_workers--; raw_spin_unlock(&wq->lock); io_worker_ref_put(wq); return false; } refcount_set(&worker->ref, 1); worker->wq = wq; raw_spin_lock_init(&worker->lock); init_completion(&worker->ref_done); if (index == IO_WQ_ACCT_BOUND) worker->flags |= IO_WORKER_F_BOUND; tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE); if (!IS_ERR(tsk)) { io_init_new_worker(wq, worker, tsk); } else if (!io_should_retry_thread(PTR_ERR(tsk))) { kfree(worker); goto fail; } else { INIT_WORK(&worker->work, io_workqueue_create); schedule_work(&worker->work); } return true; } /* * Iterate the passed in list and call the specific function for each * worker that isn't exiting */ static bool io_wq_for_each_worker(struct io_wq *wq, bool (*func)(struct io_worker *, void *), void *data) { struct io_worker *worker; bool ret = false; list_for_each_entry_rcu(worker, &wq->all_list, all_list) { if (io_worker_get(worker)) { /* no task if node is/was offline */ if (worker->task) ret = func(worker, data); io_worker_release(worker); if (ret) break; } } return ret; } static bool io_wq_worker_wake(struct io_worker *worker, void *data) { __set_notify_signal(worker->task); wake_up_process(worker->task); return false; } static void io_run_cancel(struct io_wq_work *work, struct io_wq *wq) { do { work->flags |= IO_WQ_WORK_CANCEL; wq->do_work(work); work = wq->free_work(work); } while (work); } static void io_wq_insert_work(struct io_wq *wq, struct io_wq_work *work) { struct io_wq_acct *acct = io_work_get_acct(wq, work); unsigned int hash; struct io_wq_work *tail; if (!io_wq_is_hashed(work)) { append: wq_list_add_tail(&work->list, &acct->work_list); return; } hash = io_get_work_hash(work); tail = wq->hash_tail[hash]; wq->hash_tail[hash] = work; if (!tail) goto append; wq_list_add_after(&work->list, &tail->list, &acct->work_list); } static bool io_wq_work_match_item(struct io_wq_work *work, void *data) { return work == data; } void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work) { struct io_wq_acct *acct = io_work_get_acct(wq, work); struct io_cb_cancel_data match; unsigned work_flags = work->flags; bool do_create; /* * If io-wq is exiting for this task, or if the request has explicitly * been marked as one that should not get executed, cancel it here. */ if (test_bit(IO_WQ_BIT_EXIT, &wq->state) || (work->flags & IO_WQ_WORK_CANCEL)) { io_run_cancel(work, wq); return; } raw_spin_lock(&acct->lock); io_wq_insert_work(wq, work); clear_bit(IO_ACCT_STALLED_BIT, &acct->flags); raw_spin_unlock(&acct->lock); rcu_read_lock(); do_create = !io_wq_activate_free_worker(wq, acct); rcu_read_unlock(); if (do_create && ((work_flags & IO_WQ_WORK_CONCURRENT) || !atomic_read(&acct->nr_running))) { bool did_create; did_create = io_wq_create_worker(wq, acct); if (likely(did_create)) return; raw_spin_lock(&wq->lock); if (acct->nr_workers) { raw_spin_unlock(&wq->lock); return; } raw_spin_unlock(&wq->lock); /* fatal condition, failed to create the first worker */ match.fn = io_wq_work_match_item, match.data = work, match.cancel_all = false, io_acct_cancel_pending_work(wq, acct, &match); } } /* * Work items that hash to the same value will not be done in parallel. * Used to limit concurrent writes, generally hashed by inode. */ void io_wq_hash_work(struct io_wq_work *work, void *val) { unsigned int bit; bit = hash_ptr(val, IO_WQ_HASH_ORDER); work->flags |= (IO_WQ_WORK_HASHED | (bit << IO_WQ_HASH_SHIFT)); } static bool __io_wq_worker_cancel(struct io_worker *worker, struct io_cb_cancel_data *match, struct io_wq_work *work) { if (work && match->fn(work, match->data)) { work->flags |= IO_WQ_WORK_CANCEL; __set_notify_signal(worker->task); return true; } return false; } static bool io_wq_worker_cancel(struct io_worker *worker, void *data) { struct io_cb_cancel_data *match = data; /* * Hold the lock to avoid ->cur_work going out of scope, caller * may dereference the passed in work. */ raw_spin_lock(&worker->lock); if (__io_wq_worker_cancel(worker, match, worker->cur_work) || __io_wq_worker_cancel(worker, match, worker->next_work)) match->nr_running++; raw_spin_unlock(&worker->lock); return match->nr_running && !match->cancel_all; } static inline void io_wq_remove_pending(struct io_wq *wq, struct io_wq_work *work, struct io_wq_work_node *prev) { struct io_wq_acct *acct = io_work_get_acct(wq, work); unsigned int hash = io_get_work_hash(work); struct io_wq_work *prev_work = NULL; if (io_wq_is_hashed(work) && work == wq->hash_tail[hash]) { if (prev) prev_work = container_of(prev, struct io_wq_work, list); if (prev_work && io_get_work_hash(prev_work) == hash) wq->hash_tail[hash] = prev_work; else wq->hash_tail[hash] = NULL; } wq_list_del(&acct->work_list, &work->list, prev); } static bool io_acct_cancel_pending_work(struct io_wq *wq, struct io_wq_acct *acct, struct io_cb_cancel_data *match) { struct io_wq_work_node *node, *prev; struct io_wq_work *work; raw_spin_lock(&acct->lock); wq_list_for_each(node, prev, &acct->work_list) { work = container_of(node, struct io_wq_work, list); if (!match->fn(work, match->data)) continue; io_wq_remove_pending(wq, work, prev); raw_spin_unlock(&acct->lock); io_run_cancel(work, wq); match->nr_pending++; /* not safe to continue after unlock */ return true; } raw_spin_unlock(&acct->lock); return false; } static void io_wq_cancel_pending_work(struct io_wq *wq, struct io_cb_cancel_data *match) { int i; retry: for (i = 0; i < IO_WQ_ACCT_NR; i++) { struct io_wq_acct *acct = io_get_acct(wq, i == 0); if (io_acct_cancel_pending_work(wq, acct, match)) { if (match->cancel_all) goto retry; break; } } } static void io_wq_cancel_running_work(struct io_wq *wq, struct io_cb_cancel_data *match) { rcu_read_lock(); io_wq_for_each_worker(wq, io_wq_worker_cancel, match); rcu_read_unlock(); } enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel, void *data, bool cancel_all) { struct io_cb_cancel_data match = { .fn = cancel, .data = data, .cancel_all = cancel_all, }; /* * First check pending list, if we're lucky we can just remove it * from there. CANCEL_OK means that the work is returned as-new, * no completion will be posted for it. * * Then check if a free (going busy) or busy worker has the work * currently running. If we find it there, we'll return CANCEL_RUNNING * as an indication that we attempt to signal cancellation. The * completion will run normally in this case. * * Do both of these while holding the wq->lock, to ensure that * we'll find a work item regardless of state. */ io_wq_cancel_pending_work(wq, &match); if (match.nr_pending && !match.cancel_all) return IO_WQ_CANCEL_OK; raw_spin_lock(&wq->lock); io_wq_cancel_running_work(wq, &match); raw_spin_unlock(&wq->lock); if (match.nr_running && !match.cancel_all) return IO_WQ_CANCEL_RUNNING; if (match.nr_running) return IO_WQ_CANCEL_RUNNING; if (match.nr_pending) return IO_WQ_CANCEL_OK; return IO_WQ_CANCEL_NOTFOUND; } static int io_wq_hash_wake(struct wait_queue_entry *wait, unsigned mode, int sync, void *key) { struct io_wq *wq = container_of(wait, struct io_wq, wait); int i; list_del_init(&wait->entry); rcu_read_lock(); for (i = 0; i < IO_WQ_ACCT_NR; i++) { struct io_wq_acct *acct = &wq->acct[i]; if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags)) io_wq_activate_free_worker(wq, acct); } rcu_read_unlock(); return 1; } struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data) { int ret, i; struct io_wq *wq; if (WARN_ON_ONCE(!data->free_work || !data->do_work)) return ERR_PTR(-EINVAL); if (WARN_ON_ONCE(!bounded)) return ERR_PTR(-EINVAL); wq = kzalloc(sizeof(struct io_wq), GFP_KERNEL); if (!wq) return ERR_PTR(-ENOMEM); refcount_inc(&data->hash->refs); wq->hash = data->hash; wq->free_work = data->free_work; wq->do_work = data->do_work; ret = -ENOMEM; if (!alloc_cpumask_var(&wq->cpu_mask, GFP_KERNEL)) goto err; cpumask_copy(wq->cpu_mask, cpu_possible_mask); wq->acct[IO_WQ_ACCT_BOUND].max_workers = bounded; wq->acct[IO_WQ_ACCT_UNBOUND].max_workers = task_rlimit(current, RLIMIT_NPROC); INIT_LIST_HEAD(&wq->wait.entry); wq->wait.func = io_wq_hash_wake; for (i = 0; i < IO_WQ_ACCT_NR; i++) { struct io_wq_acct *acct = &wq->acct[i]; acct->index = i; atomic_set(&acct->nr_running, 0); INIT_WQ_LIST(&acct->work_list); raw_spin_lock_init(&acct->lock); } raw_spin_lock_init(&wq->lock); INIT_HLIST_NULLS_HEAD(&wq->free_list, 0); INIT_LIST_HEAD(&wq->all_list); wq->task = get_task_struct(data->task); atomic_set(&wq->worker_refs, 1); init_completion(&wq->worker_done); ret = cpuhp_state_add_instance_nocalls(io_wq_online, &wq->cpuhp_node); if (ret) goto err; return wq; err: io_wq_put_hash(data->hash); free_cpumask_var(wq->cpu_mask); kfree(wq); return ERR_PTR(ret); } static bool io_task_work_match(struct callback_head *cb, void *data) { struct io_worker *worker; if (cb->func != create_worker_cb && cb->func != create_worker_cont) return false; worker = container_of(cb, struct io_worker, create_work); return worker->wq == data; } void io_wq_exit_start(struct io_wq *wq) { set_bit(IO_WQ_BIT_EXIT, &wq->state); } static void io_wq_cancel_tw_create(struct io_wq *wq) { struct callback_head *cb; while ((cb = task_work_cancel_match(wq->task, io_task_work_match, wq)) != NULL) { struct io_worker *worker; worker = container_of(cb, struct io_worker, create_work); io_worker_cancel_cb(worker); /* * Only the worker continuation helper has worker allocated and * hence needs freeing. */ if (cb->func == create_worker_cont) kfree(worker); } } static void io_wq_exit_workers(struct io_wq *wq) { if (!wq->task) return; io_wq_cancel_tw_create(wq); rcu_read_lock(); io_wq_for_each_worker(wq, io_wq_worker_wake, NULL); rcu_read_unlock(); io_worker_ref_put(wq); wait_for_completion(&wq->worker_done); spin_lock_irq(&wq->hash->wait.lock); list_del_init(&wq->wait.entry); spin_unlock_irq(&wq->hash->wait.lock); put_task_struct(wq->task); wq->task = NULL; } static void io_wq_destroy(struct io_wq *wq) { struct io_cb_cancel_data match = { .fn = io_wq_work_match_all, .cancel_all = true, }; cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node); io_wq_cancel_pending_work(wq, &match); free_cpumask_var(wq->cpu_mask); io_wq_put_hash(wq->hash); kfree(wq); } void io_wq_put_and_exit(struct io_wq *wq) { WARN_ON_ONCE(!test_bit(IO_WQ_BIT_EXIT, &wq->state)); io_wq_exit_workers(wq); io_wq_destroy(wq); } struct online_data { unsigned int cpu; bool online; }; static bool io_wq_worker_affinity(struct io_worker *worker, void *data) { struct online_data *od = data; if (od->online) cpumask_set_cpu(od->cpu, worker->wq->cpu_mask); else cpumask_clear_cpu(od->cpu, worker->wq->cpu_mask); return false; } static int __io_wq_cpu_online(struct io_wq *wq, unsigned int cpu, bool online) { struct online_data od = { .cpu = cpu, .online = online }; rcu_read_lock(); io_wq_for_each_worker(wq, io_wq_worker_affinity, &od); rcu_read_unlock(); return 0; } static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node) { struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node); return __io_wq_cpu_online(wq, cpu, true); } static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node) { struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node); return __io_wq_cpu_online(wq, cpu, false); } int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask) { if (!tctx || !tctx->io_wq) return -EINVAL; rcu_read_lock(); if (mask) cpumask_copy(tctx->io_wq->cpu_mask, mask); else cpumask_copy(tctx->io_wq->cpu_mask, cpu_possible_mask); rcu_read_unlock(); return 0; } /* * Set max number of unbounded workers, returns old value. If new_count is 0, * then just return the old value. */ int io_wq_max_workers(struct io_wq *wq, int *new_count) { struct io_wq_acct *acct; int prev[IO_WQ_ACCT_NR]; int i; BUILD_BUG_ON((int) IO_WQ_ACCT_BOUND != (int) IO_WQ_BOUND); BUILD_BUG_ON((int) IO_WQ_ACCT_UNBOUND != (int) IO_WQ_UNBOUND); BUILD_BUG_ON((int) IO_WQ_ACCT_NR != 2); for (i = 0; i < IO_WQ_ACCT_NR; i++) { if (new_count[i] > task_rlimit(current, RLIMIT_NPROC)) new_count[i] = task_rlimit(current, RLIMIT_NPROC); } for (i = 0; i < IO_WQ_ACCT_NR; i++) prev[i] = 0; rcu_read_lock(); raw_spin_lock(&wq->lock); for (i = 0; i < IO_WQ_ACCT_NR; i++) { acct = &wq->acct[i]; prev[i] = max_t(int, acct->max_workers, prev[i]); if (new_count[i]) acct->max_workers = new_count[i]; } raw_spin_unlock(&wq->lock); rcu_read_unlock(); for (i = 0; i < IO_WQ_ACCT_NR; i++) new_count[i] = prev[i]; return 0; } static __init int io_wq_init(void) { int ret; ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "io-wq/online", io_wq_cpu_online, io_wq_cpu_offline); if (ret < 0) return ret; io_wq_online = ret; return 0; } subsys_initcall(io_wq_init); |
1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Force feedback support for PantherLord/GreenAsia based devices * * The devices are distributed under various names and the same USB device ID * can be used in both adapters and actual game controllers. * * 0810:0001 "Twin USB Joystick" * - tested with PantherLord USB/PS2 2in1 Adapter * - contains two reports, one for each port (HID_QUIRK_MULTI_INPUT) * * 0e8f:0003 "GreenAsia Inc. USB Joystick " * - tested with König Gaming gamepad * * 0e8f:0003 "GASIA USB Gamepad" * - another version of the König gamepad * * 0f30:0111 "Saitek Color Rumble Pad" * * Copyright (c) 2007, 2009 Anssi Hannula <anssi.hannula@gmail.com> */ /* */ /* #define DEBUG */ #define debug(format, arg...) pr_debug("hid-plff: " format "\n" , ## arg) #include <linux/input.h> #include <linux/slab.h> #include <linux/module.h> #include <linux/hid.h> #include "hid-ids.h" #ifdef CONFIG_PANTHERLORD_FF struct plff_device { struct hid_report *report; s32 maxval; s32 *strong; s32 *weak; }; static int hid_plff_play(struct input_dev *dev, void *data, struct ff_effect *effect) { struct hid_device *hid = input_get_drvdata(dev); struct plff_device *plff = data; int left, right; left = effect->u.rumble.strong_magnitude; right = effect->u.rumble.weak_magnitude; debug("called with 0x%04x 0x%04x", left, right); left = left * plff->maxval / 0xffff; right = right * plff->maxval / 0xffff; *plff->strong = left; *plff->weak = right; debug("running with 0x%02x 0x%02x", left, right); hid_hw_request(hid, plff->report, HID_REQ_SET_REPORT); return 0; } static int plff_init(struct hid_device *hid) { struct plff_device *plff; struct hid_report *report; struct hid_input *hidinput; struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; struct list_head *report_ptr = report_list; struct input_dev *dev; int error; s32 maxval; s32 *strong; s32 *weak; /* The device contains one output report per physical device, all containing 1 field, which contains 4 ff00.0002 usages and 4 16bit absolute values. The input reports also contain a field which contains 8 ff00.0001 usages and 8 boolean values. Their meaning is currently unknown. A version of the 0e8f:0003 exists that has all the values in separate fields and misses the extra input field, thus resembling Zeroplus (hid-zpff) devices. */ if (list_empty(report_list)) { hid_err(hid, "no output reports found\n"); return -ENODEV; } list_for_each_entry(hidinput, &hid->inputs, list) { report_ptr = report_ptr->next; if (report_ptr == report_list) { hid_err(hid, "required output report is missing\n"); return -ENODEV; } report = list_entry(report_ptr, struct hid_report, list); if (report->maxfield < 1) { hid_err(hid, "no fields in the report\n"); return -ENODEV; } maxval = 0x7f; if (report->field[0]->report_count >= 4) { report->field[0]->value[0] = 0x00; report->field[0]->value[1] = 0x00; strong = &report->field[0]->value[2]; weak = &report->field[0]->value[3]; debug("detected single-field device"); } else if (report->field[0]->maxusage == 1 && report->field[0]->usage[0].hid == (HID_UP_LED | 0x43) && report->maxfield >= 4 && report->field[0]->report_count >= 1 && report->field[1]->report_count >= 1 && report->field[2]->report_count >= 1 && report->field[3]->report_count >= 1) { report->field[0]->value[0] = 0x00; report->field[1]->value[0] = 0x00; strong = &report->field[2]->value[0]; weak = &report->field[3]->value[0]; if (hid->vendor == USB_VENDOR_ID_JESS2) maxval = 0xff; debug("detected 4-field device"); } else { hid_err(hid, "not enough fields or values\n"); return -ENODEV; } plff = kzalloc(sizeof(struct plff_device), GFP_KERNEL); if (!plff) return -ENOMEM; dev = hidinput->input; set_bit(FF_RUMBLE, dev->ffbit); error = input_ff_create_memless(dev, plff, hid_plff_play); if (error) { kfree(plff); return error; } plff->report = report; plff->strong = strong; plff->weak = weak; plff->maxval = maxval; *strong = 0x00; *weak = 0x00; hid_hw_request(hid, plff->report, HID_REQ_SET_REPORT); } hid_info(hid, "Force feedback for PantherLord/GreenAsia devices by Anssi Hannula <anssi.hannula@gmail.com>\n"); return 0; } #else static inline int plff_init(struct hid_device *hid) { return 0; } #endif static int pl_probe(struct hid_device *hdev, const struct hid_device_id *id) { int ret; if (id->driver_data) hdev->quirks |= HID_QUIRK_MULTI_INPUT; ret = hid_parse(hdev); if (ret) { hid_err(hdev, "parse failed\n"); goto err; } ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT & ~HID_CONNECT_FF); if (ret) { hid_err(hdev, "hw start failed\n"); goto err; } plff_init(hdev); return 0; err: return ret; } static const struct hid_device_id pl_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_GAMERON, USB_DEVICE_ID_GAMERON_DUAL_PSX_ADAPTOR), .driver_data = 1 }, /* Twin USB Joystick */ { HID_USB_DEVICE(USB_VENDOR_ID_GAMERON, USB_DEVICE_ID_GAMERON_DUAL_PCS_ADAPTOR), .driver_data = 1 }, /* Twin USB Joystick */ { HID_USB_DEVICE(USB_VENDOR_ID_GREENASIA, 0x0003), }, { HID_USB_DEVICE(USB_VENDOR_ID_JESS2, USB_DEVICE_ID_JESS2_COLOR_RUMBLE_PAD), }, { } }; MODULE_DEVICE_TABLE(hid, pl_devices); static struct hid_driver pl_driver = { .name = "pantherlord", .id_table = pl_devices, .probe = pl_probe, }; module_hid_driver(pl_driver); MODULE_LICENSE("GPL"); |
10 10 2 2 6 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 | // SPDX-License-Identifier: GPL-2.0-or-later /* * SR-IPv6 implementation * * Author: * David Lebrun <david.lebrun@uclouvain.be> */ #include <linux/types.h> #include <linux/skbuff.h> #include <linux/net.h> #include <linux/module.h> #include <net/ip.h> #include <net/ip_tunnels.h> #include <net/lwtunnel.h> #include <net/netevent.h> #include <net/netns/generic.h> #include <net/ip6_fib.h> #include <net/route.h> #include <net/seg6.h> #include <linux/seg6.h> #include <linux/seg6_iptunnel.h> #include <net/addrconf.h> #include <net/ip6_route.h> #include <net/dst_cache.h> #ifdef CONFIG_IPV6_SEG6_HMAC #include <net/seg6_hmac.h> #endif #include <linux/netfilter.h> static size_t seg6_lwt_headroom(struct seg6_iptunnel_encap *tuninfo) { int head = 0; switch (tuninfo->mode) { case SEG6_IPTUN_MODE_INLINE: break; case SEG6_IPTUN_MODE_ENCAP: case SEG6_IPTUN_MODE_ENCAP_RED: head = sizeof(struct ipv6hdr); break; case SEG6_IPTUN_MODE_L2ENCAP: case SEG6_IPTUN_MODE_L2ENCAP_RED: return 0; } return ((tuninfo->srh->hdrlen + 1) << 3) + head; } struct seg6_lwt { struct dst_cache cache; struct seg6_iptunnel_encap tuninfo[]; }; static inline struct seg6_lwt *seg6_lwt_lwtunnel(struct lwtunnel_state *lwt) { return (struct seg6_lwt *)lwt->data; } static inline struct seg6_iptunnel_encap * seg6_encap_lwtunnel(struct lwtunnel_state *lwt) { return seg6_lwt_lwtunnel(lwt)->tuninfo; } static const struct nla_policy seg6_iptunnel_policy[SEG6_IPTUNNEL_MAX + 1] = { [SEG6_IPTUNNEL_SRH] = { .type = NLA_BINARY }, }; static int nla_put_srh(struct sk_buff *skb, int attrtype, struct seg6_iptunnel_encap *tuninfo) { struct seg6_iptunnel_encap *data; struct nlattr *nla; int len; len = SEG6_IPTUN_ENCAP_SIZE(tuninfo); nla = nla_reserve(skb, attrtype, len); if (!nla) return -EMSGSIZE; data = nla_data(nla); memcpy(data, tuninfo, len); return 0; } static void set_tun_src(struct net *net, struct net_device *dev, struct in6_addr *daddr, struct in6_addr *saddr) { struct seg6_pernet_data *sdata = seg6_pernet(net); struct in6_addr *tun_src; rcu_read_lock(); tun_src = rcu_dereference(sdata->tun_src); if (!ipv6_addr_any(tun_src)) { memcpy(saddr, tun_src, sizeof(struct in6_addr)); } else { ipv6_dev_get_saddr(net, dev, daddr, IPV6_PREFER_SRC_PUBLIC, saddr); } rcu_read_unlock(); } /* Compute flowlabel for outer IPv6 header */ static __be32 seg6_make_flowlabel(struct net *net, struct sk_buff *skb, struct ipv6hdr *inner_hdr) { int do_flowlabel = net->ipv6.sysctl.seg6_flowlabel; __be32 flowlabel = 0; u32 hash; if (do_flowlabel > 0) { hash = skb_get_hash(skb); hash = rol32(hash, 16); flowlabel = (__force __be32)hash & IPV6_FLOWLABEL_MASK; } else if (!do_flowlabel && skb->protocol == htons(ETH_P_IPV6)) { flowlabel = ip6_flowlabel(inner_hdr); } return flowlabel; } /* encapsulate an IPv6 packet within an outer IPv6 header with a given SRH */ int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto) { struct dst_entry *dst = skb_dst(skb); struct net *net = dev_net(dst->dev); struct ipv6hdr *hdr, *inner_hdr; struct ipv6_sr_hdr *isrh; int hdrlen, tot_len, err; __be32 flowlabel; hdrlen = (osrh->hdrlen + 1) << 3; tot_len = hdrlen + sizeof(*hdr); err = skb_cow_head(skb, tot_len + skb->mac_len); if (unlikely(err)) return err; inner_hdr = ipv6_hdr(skb); flowlabel = seg6_make_flowlabel(net, skb, inner_hdr); skb_push(skb, tot_len); skb_reset_network_header(skb); skb_mac_header_rebuild(skb); hdr = ipv6_hdr(skb); /* inherit tc, flowlabel and hlim * hlim will be decremented in ip6_forward() afterwards and * decapsulation will overwrite inner hlim with outer hlim */ if (skb->protocol == htons(ETH_P_IPV6)) { ip6_flow_hdr(hdr, ip6_tclass(ip6_flowinfo(inner_hdr)), flowlabel); hdr->hop_limit = inner_hdr->hop_limit; } else { ip6_flow_hdr(hdr, 0, flowlabel); hdr->hop_limit = ip6_dst_hoplimit(skb_dst(skb)); memset(IP6CB(skb), 0, sizeof(*IP6CB(skb))); /* the control block has been erased, so we have to set the * iif once again. * We read the receiving interface index directly from the * skb->skb_iif as it is done in the IPv4 receiving path (i.e.: * ip_rcv_core(...)). */ IP6CB(skb)->iif = skb->skb_iif; } hdr->nexthdr = NEXTHDR_ROUTING; isrh = (void *)hdr + sizeof(*hdr); memcpy(isrh, osrh, hdrlen); isrh->nexthdr = proto; hdr->daddr = isrh->segments[isrh->first_segment]; set_tun_src(net, dst->dev, &hdr->daddr, &hdr->saddr); #ifdef CONFIG_IPV6_SEG6_HMAC if (sr_has_hmac(isrh)) { err = seg6_push_hmac(net, &hdr->saddr, isrh); if (unlikely(err)) return err; } #endif hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); skb_postpush_rcsum(skb, hdr, tot_len); return 0; } EXPORT_SYMBOL_GPL(seg6_do_srh_encap); /* encapsulate an IPv6 packet within an outer IPv6 header with reduced SRH */ static int seg6_do_srh_encap_red(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto) { __u8 first_seg = osrh->first_segment; struct dst_entry *dst = skb_dst(skb); struct net *net = dev_net(dst->dev); struct ipv6hdr *hdr, *inner_hdr; int hdrlen = ipv6_optlen(osrh); int red_tlv_offset, tlv_offset; struct ipv6_sr_hdr *isrh; bool skip_srh = false; __be32 flowlabel; int tot_len, err; int red_hdrlen; int tlvs_len; if (first_seg > 0) { red_hdrlen = hdrlen - sizeof(struct in6_addr); } else { /* NOTE: if tag/flags and/or other TLVs are introduced in the * seg6_iptunnel infrastructure, they should be considered when * deciding to skip the SRH. */ skip_srh = !sr_has_hmac(osrh); red_hdrlen = skip_srh ? 0 : hdrlen; } tot_len = red_hdrlen + sizeof(struct ipv6hdr); err = skb_cow_head(skb, tot_len + skb->mac_len); if (unlikely(err)) return err; inner_hdr = ipv6_hdr(skb); flowlabel = seg6_make_flowlabel(net, skb, inner_hdr); skb_push(skb, tot_len); skb_reset_network_header(skb); skb_mac_header_rebuild(skb); hdr = ipv6_hdr(skb); /* based on seg6_do_srh_encap() */ if (skb->protocol == htons(ETH_P_IPV6)) { ip6_flow_hdr(hdr, ip6_tclass(ip6_flowinfo(inner_hdr)), flowlabel); hdr->hop_limit = inner_hdr->hop_limit; } else { ip6_flow_hdr(hdr, 0, flowlabel); hdr->hop_limit = ip6_dst_hoplimit(skb_dst(skb)); memset(IP6CB(skb), 0, sizeof(*IP6CB(skb))); IP6CB(skb)->iif = skb->skb_iif; } /* no matter if we have to skip the SRH or not, the first segment * always comes in the pushed IPv6 header. */ hdr->daddr = osrh->segments[first_seg]; if (skip_srh) { hdr->nexthdr = proto; set_tun_src(net, dst->dev, &hdr->daddr, &hdr->saddr); goto out; } /* we cannot skip the SRH, slow path */ hdr->nexthdr = NEXTHDR_ROUTING; isrh = (void *)hdr + sizeof(struct ipv6hdr); if (unlikely(!first_seg)) { /* this is a very rare case; we have only one SID but * we cannot skip the SRH since we are carrying some * other info. */ memcpy(isrh, osrh, hdrlen); goto srcaddr; } tlv_offset = sizeof(*osrh) + (first_seg + 1) * sizeof(struct in6_addr); red_tlv_offset = tlv_offset - sizeof(struct in6_addr); memcpy(isrh, osrh, red_tlv_offset); tlvs_len = hdrlen - tlv_offset; if (unlikely(tlvs_len > 0)) { const void *s = (const void *)osrh + tlv_offset; void *d = (void *)isrh + red_tlv_offset; memcpy(d, s, tlvs_len); } --isrh->first_segment; isrh->hdrlen -= 2; srcaddr: isrh->nexthdr = proto; set_tun_src(net, dst->dev, &hdr->daddr, &hdr->saddr); #ifdef CONFIG_IPV6_SEG6_HMAC if (unlikely(!skip_srh && sr_has_hmac(isrh))) { err = seg6_push_hmac(net, &hdr->saddr, isrh); if (unlikely(err)) return err; } #endif out: hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); skb_postpush_rcsum(skb, hdr, tot_len); return 0; } /* insert an SRH within an IPv6 packet, just after the IPv6 header */ int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh) { struct ipv6hdr *hdr, *oldhdr; struct ipv6_sr_hdr *isrh; int hdrlen, err; hdrlen = (osrh->hdrlen + 1) << 3; err = skb_cow_head(skb, hdrlen + skb->mac_len); if (unlikely(err)) return err; oldhdr = ipv6_hdr(skb); skb_pull(skb, sizeof(struct ipv6hdr)); skb_postpull_rcsum(skb, skb_network_header(skb), sizeof(struct ipv6hdr)); skb_push(skb, sizeof(struct ipv6hdr) + hdrlen); skb_reset_network_header(skb); skb_mac_header_rebuild(skb); hdr = ipv6_hdr(skb); memmove(hdr, oldhdr, sizeof(*hdr)); isrh = (void *)hdr + sizeof(*hdr); memcpy(isrh, osrh, hdrlen); isrh->nexthdr = hdr->nexthdr; hdr->nexthdr = NEXTHDR_ROUTING; isrh->segments[0] = hdr->daddr; hdr->daddr = isrh->segments[isrh->first_segment]; #ifdef CONFIG_IPV6_SEG6_HMAC if (sr_has_hmac(isrh)) { struct net *net = dev_net(skb_dst(skb)->dev); err = seg6_push_hmac(net, &hdr->saddr, isrh); if (unlikely(err)) return err; } #endif hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); skb_postpush_rcsum(skb, hdr, sizeof(struct ipv6hdr) + hdrlen); return 0; } EXPORT_SYMBOL_GPL(seg6_do_srh_inline); static int seg6_do_srh(struct sk_buff *skb) { struct dst_entry *dst = skb_dst(skb); struct seg6_iptunnel_encap *tinfo; int proto, err = 0; tinfo = seg6_encap_lwtunnel(dst->lwtstate); switch (tinfo->mode) { case SEG6_IPTUN_MODE_INLINE: if (skb->protocol != htons(ETH_P_IPV6)) return -EINVAL; err = seg6_do_srh_inline(skb, tinfo->srh); if (err) return err; break; case SEG6_IPTUN_MODE_ENCAP: case SEG6_IPTUN_MODE_ENCAP_RED: err = iptunnel_handle_offloads(skb, SKB_GSO_IPXIP6); if (err) return err; if (skb->protocol == htons(ETH_P_IPV6)) proto = IPPROTO_IPV6; else if (skb->protocol == htons(ETH_P_IP)) proto = IPPROTO_IPIP; else return -EINVAL; if (tinfo->mode == SEG6_IPTUN_MODE_ENCAP) err = seg6_do_srh_encap(skb, tinfo->srh, proto); else err = seg6_do_srh_encap_red(skb, tinfo->srh, proto); if (err) return err; skb_set_inner_transport_header(skb, skb_transport_offset(skb)); skb_set_inner_protocol(skb, skb->protocol); skb->protocol = htons(ETH_P_IPV6); break; case SEG6_IPTUN_MODE_L2ENCAP: case SEG6_IPTUN_MODE_L2ENCAP_RED: if (!skb_mac_header_was_set(skb)) return -EINVAL; if (pskb_expand_head(skb, skb->mac_len, 0, GFP_ATOMIC) < 0) return -ENOMEM; skb_mac_header_rebuild(skb); skb_push(skb, skb->mac_len); if (tinfo->mode == SEG6_IPTUN_MODE_L2ENCAP) err = seg6_do_srh_encap(skb, tinfo->srh, IPPROTO_ETHERNET); else err = seg6_do_srh_encap_red(skb, tinfo->srh, IPPROTO_ETHERNET); if (err) return err; skb->protocol = htons(ETH_P_IPV6); break; } skb_set_transport_header(skb, sizeof(struct ipv6hdr)); nf_reset_ct(skb); return 0; } static int seg6_input_finish(struct net *net, struct sock *sk, struct sk_buff *skb) { return dst_input(skb); } static int seg6_input_core(struct net *net, struct sock *sk, struct sk_buff *skb) { struct dst_entry *orig_dst = skb_dst(skb); struct dst_entry *dst = NULL; struct seg6_lwt *slwt; int err; err = seg6_do_srh(skb); if (unlikely(err)) { kfree_skb(skb); return err; } slwt = seg6_lwt_lwtunnel(orig_dst->lwtstate); preempt_disable(); dst = dst_cache_get(&slwt->cache); preempt_enable(); if (!dst) { ip6_route_input(skb); dst = skb_dst(skb); if (!dst->error) { preempt_disable(); dst_cache_set_ip6(&slwt->cache, dst, &ipv6_hdr(skb)->saddr); preempt_enable(); } } else { skb_dst_drop(skb); skb_dst_set(skb, dst); } err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev)); if (unlikely(err)) return err; if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled)) return NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_OUT, dev_net(skb->dev), NULL, skb, NULL, skb_dst(skb)->dev, seg6_input_finish); return seg6_input_finish(dev_net(skb->dev), NULL, skb); } static int seg6_input_nf(struct sk_buff *skb) { struct net_device *dev = skb_dst(skb)->dev; struct net *net = dev_net(skb->dev); switch (skb->protocol) { case htons(ETH_P_IP): return NF_HOOK(NFPROTO_IPV4, NF_INET_POST_ROUTING, net, NULL, skb, NULL, dev, seg6_input_core); case htons(ETH_P_IPV6): return NF_HOOK(NFPROTO_IPV6, NF_INET_POST_ROUTING, net, NULL, skb, NULL, dev, seg6_input_core); } return -EINVAL; } static int seg6_input(struct sk_buff *skb) { if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled)) return seg6_input_nf(skb); return seg6_input_core(dev_net(skb->dev), NULL, skb); } static int seg6_output_core(struct net *net, struct sock *sk, struct sk_buff *skb) { struct dst_entry *orig_dst = skb_dst(skb); struct dst_entry *dst = NULL; struct seg6_lwt *slwt; int err; err = seg6_do_srh(skb); if (unlikely(err)) goto drop; slwt = seg6_lwt_lwtunnel(orig_dst->lwtstate); preempt_disable(); dst = dst_cache_get(&slwt->cache); preempt_enable(); if (unlikely(!dst)) { struct ipv6hdr *hdr = ipv6_hdr(skb); struct flowi6 fl6; memset(&fl6, 0, sizeof(fl6)); fl6.daddr = hdr->daddr; fl6.saddr = hdr->saddr; fl6.flowlabel = ip6_flowinfo(hdr); fl6.flowi6_mark = skb->mark; fl6.flowi6_proto = hdr->nexthdr; dst = ip6_route_output(net, NULL, &fl6); if (dst->error) { err = dst->error; dst_release(dst); goto drop; } preempt_disable(); dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr); preempt_enable(); } skb_dst_drop(skb); skb_dst_set(skb, dst); err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev)); if (unlikely(err)) goto drop; if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled)) return NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_OUT, net, sk, skb, NULL, skb_dst(skb)->dev, dst_output); return dst_output(net, sk, skb); drop: kfree_skb(skb); return err; } static int seg6_output_nf(struct net *net, struct sock *sk, struct sk_buff *skb) { struct net_device *dev = skb_dst(skb)->dev; switch (skb->protocol) { case htons(ETH_P_IP): return NF_HOOK(NFPROTO_IPV4, NF_INET_POST_ROUTING, net, sk, skb, NULL, dev, seg6_output_core); case htons(ETH_P_IPV6): return NF_HOOK(NFPROTO_IPV6, NF_INET_POST_ROUTING, net, sk, skb, NULL, dev, seg6_output_core); } return -EINVAL; } static int seg6_output(struct net *net, struct sock *sk, struct sk_buff *skb) { if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled)) return seg6_output_nf(net, sk, skb); return seg6_output_core(net, sk, skb); } static int seg6_build_state(struct net *net, struct nlattr *nla, unsigned int family, const void *cfg, struct lwtunnel_state **ts, struct netlink_ext_ack *extack) { struct nlattr *tb[SEG6_IPTUNNEL_MAX + 1]; struct seg6_iptunnel_encap *tuninfo; struct lwtunnel_state *newts; int tuninfo_len, min_size; struct seg6_lwt *slwt; int err; if (family != AF_INET && family != AF_INET6) return -EINVAL; err = nla_parse_nested_deprecated(tb, SEG6_IPTUNNEL_MAX, nla, seg6_iptunnel_policy, extack); if (err < 0) return err; if (!tb[SEG6_IPTUNNEL_SRH]) return -EINVAL; tuninfo = nla_data(tb[SEG6_IPTUNNEL_SRH]); tuninfo_len = nla_len(tb[SEG6_IPTUNNEL_SRH]); /* tuninfo must contain at least the iptunnel encap structure, * the SRH and one segment */ min_size = sizeof(*tuninfo) + sizeof(struct ipv6_sr_hdr) + sizeof(struct in6_addr); if (tuninfo_len < min_size) return -EINVAL; switch (tuninfo->mode) { case SEG6_IPTUN_MODE_INLINE: if (family != AF_INET6) return -EINVAL; break; case SEG6_IPTUN_MODE_ENCAP: break; case SEG6_IPTUN_MODE_L2ENCAP: break; case SEG6_IPTUN_MODE_ENCAP_RED: break; case SEG6_IPTUN_MODE_L2ENCAP_RED: break; default: return -EINVAL; } /* verify that SRH is consistent */ if (!seg6_validate_srh(tuninfo->srh, tuninfo_len - sizeof(*tuninfo), false)) return -EINVAL; newts = lwtunnel_state_alloc(tuninfo_len + sizeof(*slwt)); if (!newts) return -ENOMEM; slwt = seg6_lwt_lwtunnel(newts); err = dst_cache_init(&slwt->cache, GFP_ATOMIC); if (err) { kfree(newts); return err; } memcpy(&slwt->tuninfo, tuninfo, tuninfo_len); newts->type = LWTUNNEL_ENCAP_SEG6; newts->flags |= LWTUNNEL_STATE_INPUT_REDIRECT; if (tuninfo->mode != SEG6_IPTUN_MODE_L2ENCAP) newts->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT; newts->headroom = seg6_lwt_headroom(tuninfo); *ts = newts; return 0; } static void seg6_destroy_state(struct lwtunnel_state *lwt) { dst_cache_destroy(&seg6_lwt_lwtunnel(lwt)->cache); } static int seg6_fill_encap_info(struct sk_buff *skb, struct lwtunnel_state *lwtstate) { struct seg6_iptunnel_encap *tuninfo = seg6_encap_lwtunnel(lwtstate); if (nla_put_srh(skb, SEG6_IPTUNNEL_SRH, tuninfo)) return -EMSGSIZE; return 0; } static int seg6_encap_nlsize(struct lwtunnel_state *lwtstate) { struct seg6_iptunnel_encap *tuninfo = seg6_encap_lwtunnel(lwtstate); return nla_total_size(SEG6_IPTUN_ENCAP_SIZE(tuninfo)); } static int seg6_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b) { struct seg6_iptunnel_encap *a_hdr = seg6_encap_lwtunnel(a); struct seg6_iptunnel_encap *b_hdr = seg6_encap_lwtunnel(b); int len = SEG6_IPTUN_ENCAP_SIZE(a_hdr); if (len != SEG6_IPTUN_ENCAP_SIZE(b_hdr)) return 1; return memcmp(a_hdr, b_hdr, len); } static const struct lwtunnel_encap_ops seg6_iptun_ops = { .build_state = seg6_build_state, .destroy_state = seg6_destroy_state, .output = seg6_output, .input = seg6_input, .fill_encap = seg6_fill_encap_info, .get_encap_size = seg6_encap_nlsize, .cmp_encap = seg6_encap_cmp, .owner = THIS_MODULE, }; int __init seg6_iptunnel_init(void) { return lwtunnel_encap_add_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6); } void seg6_iptunnel_exit(void) { lwtunnel_encap_del_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6); } |
945 550 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _NF_NAT_H #define _NF_NAT_H #include <linux/list.h> #include <linux/netfilter_ipv4.h> #include <linux/netfilter/nf_conntrack_pptp.h> #include <net/netfilter/nf_conntrack.h> #include <net/netfilter/nf_conntrack_extend.h> #include <net/netfilter/nf_conntrack_tuple.h> #include <uapi/linux/netfilter/nf_nat.h> enum nf_nat_manip_type { NF_NAT_MANIP_SRC, NF_NAT_MANIP_DST }; /* SRC manip occurs POST_ROUTING or LOCAL_IN */ #define HOOK2MANIP(hooknum) ((hooknum) != NF_INET_POST_ROUTING && \ (hooknum) != NF_INET_LOCAL_IN) /* per conntrack: nat application helper private data */ union nf_conntrack_nat_help { /* insert nat helper private data here */ #if IS_ENABLED(CONFIG_NF_NAT_PPTP) struct nf_nat_pptp nat_pptp_info; #endif }; /* The structure embedded in the conntrack structure. */ struct nf_conn_nat { union nf_conntrack_nat_help help; #if IS_ENABLED(CONFIG_NF_NAT_MASQUERADE) int masq_index; #endif }; /* Set up the info structure to map into this range. */ unsigned int nf_nat_setup_info(struct nf_conn *ct, const struct nf_nat_range2 *range, enum nf_nat_manip_type maniptype); extern unsigned int nf_nat_alloc_null_binding(struct nf_conn *ct, unsigned int hooknum); struct nf_conn_nat *nf_ct_nat_ext_add(struct nf_conn *ct); static inline struct nf_conn_nat *nfct_nat(const struct nf_conn *ct) { #if IS_ENABLED(CONFIG_NF_NAT) return nf_ct_ext_find(ct, NF_CT_EXT_NAT); #else return NULL; #endif } static inline bool nf_nat_oif_changed(unsigned int hooknum, enum ip_conntrack_info ctinfo, struct nf_conn_nat *nat, const struct net_device *out) { #if IS_ENABLED(CONFIG_NF_NAT_MASQUERADE) return nat && nat->masq_index && hooknum == NF_INET_POST_ROUTING && CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL && nat->masq_index != out->ifindex; #else return false; #endif } int nf_nat_register_fn(struct net *net, u8 pf, const struct nf_hook_ops *ops, const struct nf_hook_ops *nat_ops, unsigned int ops_count); void nf_nat_unregister_fn(struct net *net, u8 pf, const struct nf_hook_ops *ops, unsigned int ops_count); unsigned int nf_nat_packet(struct nf_conn *ct, enum ip_conntrack_info ctinfo, unsigned int hooknum, struct sk_buff *skb); unsigned int nf_nat_manip_pkt(struct sk_buff *skb, struct nf_conn *ct, enum nf_nat_manip_type mtype, enum ip_conntrack_dir dir); void nf_nat_csum_recalc(struct sk_buff *skb, u8 nfproto, u8 proto, void *data, __sum16 *check, int datalen, int oldlen); int nf_nat_icmp_reply_translation(struct sk_buff *skb, struct nf_conn *ct, enum ip_conntrack_info ctinfo, unsigned int hooknum); int nf_nat_icmpv6_reply_translation(struct sk_buff *skb, struct nf_conn *ct, enum ip_conntrack_info ctinfo, unsigned int hooknum, unsigned int hdrlen); int nf_nat_ipv4_register_fn(struct net *net, const struct nf_hook_ops *ops); void nf_nat_ipv4_unregister_fn(struct net *net, const struct nf_hook_ops *ops); int nf_nat_ipv6_register_fn(struct net *net, const struct nf_hook_ops *ops); void nf_nat_ipv6_unregister_fn(struct net *net, const struct nf_hook_ops *ops); int nf_nat_inet_register_fn(struct net *net, const struct nf_hook_ops *ops); void nf_nat_inet_unregister_fn(struct net *net, const struct nf_hook_ops *ops); unsigned int nf_nat_inet_fn(void *priv, struct sk_buff *skb, const struct nf_hook_state *state); int nf_ct_nat(struct sk_buff *skb, struct nf_conn *ct, enum ip_conntrack_info ctinfo, int *action, const struct nf_nat_range2 *range, bool commit); static inline int nf_nat_initialized(const struct nf_conn *ct, enum nf_nat_manip_type manip) { if (manip == NF_NAT_MANIP_SRC) return ct->status & IPS_SRC_NAT_DONE; else return ct->status & IPS_DST_NAT_DONE; } #endif |
1125 500 503 500 1117 1137 1130 1128 1182 498 636 648 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 | // SPDX-License-Identifier: GPL-2.0-only /* * Link physical devices with ACPI devices support * * Copyright (c) 2005 David Shaohua Li <shaohua.li@intel.com> * Copyright (c) 2005 Intel Corp. */ #define pr_fmt(fmt) "ACPI: " fmt #include <linux/acpi_iort.h> #include <linux/export.h> #include <linux/init.h> #include <linux/list.h> #include <linux/device.h> #include <linux/slab.h> #include <linux/rwsem.h> #include <linux/acpi.h> #include <linux/dma-mapping.h> #include <linux/pci.h> #include <linux/pci-acpi.h> #include <linux/platform_device.h> #include "internal.h" static LIST_HEAD(bus_type_list); static DECLARE_RWSEM(bus_type_sem); #define PHYSICAL_NODE_STRING "physical_node" #define PHYSICAL_NODE_NAME_SIZE (sizeof(PHYSICAL_NODE_STRING) + 10) int register_acpi_bus_type(struct acpi_bus_type *type) { if (acpi_disabled) return -ENODEV; if (type && type->match && type->find_companion) { down_write(&bus_type_sem); list_add_tail(&type->list, &bus_type_list); up_write(&bus_type_sem); pr_info("bus type %s registered\n", type->name); return 0; } return -ENODEV; } EXPORT_SYMBOL_GPL(register_acpi_bus_type); int unregister_acpi_bus_type(struct acpi_bus_type *type) { if (acpi_disabled) return 0; if (type) { down_write(&bus_type_sem); list_del_init(&type->list); up_write(&bus_type_sem); pr_info("bus type %s unregistered\n", type->name); return 0; } return -ENODEV; } EXPORT_SYMBOL_GPL(unregister_acpi_bus_type); static struct acpi_bus_type *acpi_get_bus_type(struct device *dev) { struct acpi_bus_type *tmp, *ret = NULL; down_read(&bus_type_sem); list_for_each_entry(tmp, &bus_type_list, list) { if (tmp->match(dev)) { ret = tmp; break; } } up_read(&bus_type_sem); return ret; } #define FIND_CHILD_MIN_SCORE 1 #define FIND_CHILD_MID_SCORE 2 #define FIND_CHILD_MAX_SCORE 3 static int match_any(struct acpi_device *adev, void *not_used) { return 1; } static bool acpi_dev_has_children(struct acpi_device *adev) { return acpi_dev_for_each_child(adev, match_any, NULL) > 0; } static int find_child_checks(struct acpi_device *adev, bool check_children) { unsigned long long sta; acpi_status status; if (check_children && !acpi_dev_has_children(adev)) return -ENODEV; status = acpi_evaluate_integer(adev->handle, "_STA", NULL, &sta); if (status == AE_NOT_FOUND) { /* * Special case: backlight device objects without _STA are * preferred to other objects with the same _ADR value, because * it is more likely that they are actually useful. */ if (adev->pnp.type.backlight) return FIND_CHILD_MID_SCORE; return FIND_CHILD_MIN_SCORE; } if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_ENABLED)) return -ENODEV; /* * If the device has a _HID returning a valid ACPI/PNP device ID, it is * better to make it look less attractive here, so that the other device * with the same _ADR value (that may not have a valid device ID) can be * matched going forward. [This means a second spec violation in a row, * so whatever we do here is best effort anyway.] */ if (adev->pnp.type.platform_id) return FIND_CHILD_MIN_SCORE; return FIND_CHILD_MAX_SCORE; } struct find_child_walk_data { struct acpi_device *adev; u64 address; int score; bool check_sta; bool check_children; }; static int check_one_child(struct acpi_device *adev, void *data) { struct find_child_walk_data *wd = data; int score; if (!adev->pnp.type.bus_address || acpi_device_adr(adev) != wd->address) return 0; if (!wd->adev) { /* * This is the first matching object, so save it. If it is not * necessary to look for any other matching objects, stop the * search. */ wd->adev = adev; return !(wd->check_sta || wd->check_children); } /* * There is more than one matching device object with the same _ADR * value. That really is unexpected, so we are kind of beyond the scope * of the spec here. We have to choose which one to return, though. * * First, get the score for the previously found object and terminate * the walk if it is maximum. */ if (!wd->score) { score = find_child_checks(wd->adev, wd->check_children); if (score == FIND_CHILD_MAX_SCORE) return 1; wd->score = score; } /* * Second, if the object that has just been found has a better score, * replace the previously found one with it and terminate the walk if * the new score is maximum. */ score = find_child_checks(adev, wd->check_children); if (score > wd->score) { wd->adev = adev; if (score == FIND_CHILD_MAX_SCORE) return 1; wd->score = score; } /* Continue, because there may be better matches. */ return 0; } static struct acpi_device *acpi_find_child(struct acpi_device *parent, u64 address, bool check_children, bool check_sta) { struct find_child_walk_data wd = { .address = address, .check_children = check_children, .check_sta = check_sta, .adev = NULL, .score = 0, }; if (parent) acpi_dev_for_each_child(parent, check_one_child, &wd); return wd.adev; } struct acpi_device *acpi_find_child_device(struct acpi_device *parent, u64 address, bool check_children) { return acpi_find_child(parent, address, check_children, true); } EXPORT_SYMBOL_GPL(acpi_find_child_device); struct acpi_device *acpi_find_child_by_adr(struct acpi_device *adev, acpi_bus_address adr) { return acpi_find_child(adev, adr, false, false); } EXPORT_SYMBOL_GPL(acpi_find_child_by_adr); static void acpi_physnode_link_name(char *buf, unsigned int node_id) { if (node_id > 0) snprintf(buf, PHYSICAL_NODE_NAME_SIZE, PHYSICAL_NODE_STRING "%u", node_id); else strcpy(buf, PHYSICAL_NODE_STRING); } int acpi_bind_one(struct device *dev, struct acpi_device *acpi_dev) { struct acpi_device_physical_node *physical_node, *pn; char physical_node_name[PHYSICAL_NODE_NAME_SIZE]; struct list_head *physnode_list; unsigned int node_id; int retval = -EINVAL; if (has_acpi_companion(dev)) { if (acpi_dev) { dev_warn(dev, "ACPI companion already set\n"); return -EINVAL; } else { acpi_dev = ACPI_COMPANION(dev); } } if (!acpi_dev) return -EINVAL; acpi_dev_get(acpi_dev); get_device(dev); physical_node = kzalloc(sizeof(*physical_node), GFP_KERNEL); if (!physical_node) { retval = -ENOMEM; goto err; } mutex_lock(&acpi_dev->physical_node_lock); /* * Keep the list sorted by node_id so that the IDs of removed nodes can * be recycled easily. */ physnode_list = &acpi_dev->physical_node_list; node_id = 0; list_for_each_entry(pn, &acpi_dev->physical_node_list, node) { /* Sanity check. */ if (pn->dev == dev) { mutex_unlock(&acpi_dev->physical_node_lock); dev_warn(dev, "Already associated with ACPI node\n"); kfree(physical_node); if (ACPI_COMPANION(dev) != acpi_dev) goto err; put_device(dev); acpi_dev_put(acpi_dev); return 0; } if (pn->node_id == node_id) { physnode_list = &pn->node; node_id++; } } physical_node->node_id = node_id; physical_node->dev = dev; list_add(&physical_node->node, physnode_list); acpi_dev->physical_node_count++; if (!has_acpi_companion(dev)) ACPI_COMPANION_SET(dev, acpi_dev); acpi_physnode_link_name(physical_node_name, node_id); retval = sysfs_create_link(&acpi_dev->dev.kobj, &dev->kobj, physical_node_name); if (retval) dev_err(&acpi_dev->dev, "Failed to create link %s (%d)\n", physical_node_name, retval); retval = sysfs_create_link(&dev->kobj, &acpi_dev->dev.kobj, "firmware_node"); if (retval) dev_err(dev, "Failed to create link firmware_node (%d)\n", retval); mutex_unlock(&acpi_dev->physical_node_lock); if (acpi_dev->wakeup.flags.valid) device_set_wakeup_capable(dev, true); return 0; err: ACPI_COMPANION_SET(dev, NULL); put_device(dev); acpi_dev_put(acpi_dev); return retval; } EXPORT_SYMBOL_GPL(acpi_bind_one); int acpi_unbind_one(struct device *dev) { struct acpi_device *acpi_dev = ACPI_COMPANION(dev); struct acpi_device_physical_node *entry; if (!acpi_dev) return 0; mutex_lock(&acpi_dev->physical_node_lock); list_for_each_entry(entry, &acpi_dev->physical_node_list, node) if (entry->dev == dev) { char physnode_name[PHYSICAL_NODE_NAME_SIZE]; list_del(&entry->node); acpi_dev->physical_node_count--; acpi_physnode_link_name(physnode_name, entry->node_id); sysfs_remove_link(&acpi_dev->dev.kobj, physnode_name); sysfs_remove_link(&dev->kobj, "firmware_node"); ACPI_COMPANION_SET(dev, NULL); /* Drop references taken by acpi_bind_one(). */ put_device(dev); acpi_dev_put(acpi_dev); kfree(entry); break; } mutex_unlock(&acpi_dev->physical_node_lock); return 0; } EXPORT_SYMBOL_GPL(acpi_unbind_one); void acpi_device_notify(struct device *dev) { struct acpi_device *adev; int ret; ret = acpi_bind_one(dev, NULL); if (ret) { struct acpi_bus_type *type = acpi_get_bus_type(dev); if (!type) goto err; adev = type->find_companion(dev); if (!adev) { dev_dbg(dev, "ACPI companion not found\n"); goto err; } ret = acpi_bind_one(dev, adev); if (ret) goto err; if (type->setup) { type->setup(dev); goto done; } } else { adev = ACPI_COMPANION(dev); if (dev_is_pci(dev)) { pci_acpi_setup(dev, adev); goto done; } else if (dev_is_platform(dev)) { acpi_configure_pmsi_domain(dev); } } if (adev->handler && adev->handler->bind) adev->handler->bind(dev); done: acpi_handle_debug(ACPI_HANDLE(dev), "Bound to device %s\n", dev_name(dev)); return; err: dev_dbg(dev, "No ACPI support\n"); } void acpi_device_notify_remove(struct device *dev) { struct acpi_device *adev = ACPI_COMPANION(dev); if (!adev) return; if (dev_is_pci(dev)) pci_acpi_cleanup(dev, adev); else if (adev->handler && adev->handler->unbind) adev->handler->unbind(dev); acpi_unbind_one(dev); } |
2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 | // SPDX-License-Identifier: GPL-2.0-only /* Kernel module to match AH parameters. */ /* (C) 2001-2002 Andras Kis-Szabo <kisza@sch.bme.hu> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/module.h> #include <linux/skbuff.h> #include <linux/ip.h> #include <linux/ipv6.h> #include <linux/types.h> #include <net/checksum.h> #include <net/ipv6.h> #include <linux/netfilter/x_tables.h> #include <linux/netfilter_ipv6/ip6_tables.h> #include <linux/netfilter_ipv6/ip6t_ah.h> MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Xtables: IPv6 IPsec-AH match"); MODULE_AUTHOR("Andras Kis-Szabo <kisza@sch.bme.hu>"); /* Returns 1 if the spi is matched by the range, 0 otherwise */ static inline bool spi_match(u_int32_t min, u_int32_t max, u_int32_t spi, bool invert) { bool r; pr_debug("spi_match:%c 0x%x <= 0x%x <= 0x%x\n", invert ? '!' : ' ', min, spi, max); r = (spi >= min && spi <= max) ^ invert; pr_debug(" result %s\n", r ? "PASS" : "FAILED"); return r; } static bool ah_mt6(const struct sk_buff *skb, struct xt_action_param *par) { struct ip_auth_hdr _ah; const struct ip_auth_hdr *ah; const struct ip6t_ah *ahinfo = par->matchinfo; unsigned int ptr = 0; unsigned int hdrlen = 0; int err; err = ipv6_find_hdr(skb, &ptr, NEXTHDR_AUTH, NULL, NULL); if (err < 0) { if (err != -ENOENT) par->hotdrop = true; return false; } ah = skb_header_pointer(skb, ptr, sizeof(_ah), &_ah); if (ah == NULL) { par->hotdrop = true; return false; } hdrlen = ipv6_authlen(ah); pr_debug("IPv6 AH LEN %u %u ", hdrlen, ah->hdrlen); pr_debug("RES %04X ", ah->reserved); pr_debug("SPI %u %08X\n", ntohl(ah->spi), ntohl(ah->spi)); pr_debug("IPv6 AH spi %02X ", spi_match(ahinfo->spis[0], ahinfo->spis[1], ntohl(ah->spi), !!(ahinfo->invflags & IP6T_AH_INV_SPI))); pr_debug("len %02X %04X %02X ", ahinfo->hdrlen, hdrlen, (!ahinfo->hdrlen || (ahinfo->hdrlen == hdrlen) ^ !!(ahinfo->invflags & IP6T_AH_INV_LEN))); pr_debug("res %02X %04X %02X\n", ahinfo->hdrres, ah->reserved, !(ahinfo->hdrres && ah->reserved)); return spi_match(ahinfo->spis[0], ahinfo->spis[1], ntohl(ah->spi), !!(ahinfo->invflags & IP6T_AH_INV_SPI)) && (!ahinfo->hdrlen || (ahinfo->hdrlen == hdrlen) ^ !!(ahinfo->invflags & IP6T_AH_INV_LEN)) && !(ahinfo->hdrres && ah->reserved); } static int ah_mt6_check(const struct xt_mtchk_param *par) { const struct ip6t_ah *ahinfo = par->matchinfo; if (ahinfo->invflags & ~IP6T_AH_INV_MASK) { pr_debug("unknown flags %X\n", ahinfo->invflags); return -EINVAL; } return 0; } static struct xt_match ah_mt6_reg __read_mostly = { .name = "ah", .family = NFPROTO_IPV6, .match = ah_mt6, .matchsize = sizeof(struct ip6t_ah), .checkentry = ah_mt6_check, .me = THIS_MODULE, }; static int __init ah_mt6_init(void) { return xt_register_match(&ah_mt6_reg); } static void __exit ah_mt6_exit(void) { xt_unregister_match(&ah_mt6_reg); } module_init(ah_mt6_init); module_exit(ah_mt6_exit); |
3 2 2 1 3 1 2 4 2 9 4 9 14 4 10 5 9 1 1 13 13 1 1 1 1 4 4 4 5 4 5 3 3 3 1 2 1 2 2 3 3 2 3 2 1 3 2 3 2 1 2 3 3 2 1 1 3 2 2 1 1 1 1 11 11 2 2 44 2 2 3 4 5 4 5 5 5 3 3 3 5 5 2 3 5 2 3 3 4 4 4 5 5 4 3 1 2 3 5 10 30 1 27 31 8 2 6 4 1 1 1 3 4 4 1 1 2 1 5 5 5 4 2 1 2 1 5 5 4 1 1 1 3 3 12 12 2 10 9 8 2 1 1 2 5 5 5 5 2 2 1 2 3 5 5 760 752 10 1 10 9 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 | // SPDX-License-Identifier: GPL-2.0-only /* * Interface handling * * Copyright 2002-2005, Instant802 Networks, Inc. * Copyright 2005-2006, Devicescape Software, Inc. * Copyright (c) 2006 Jiri Benc <jbenc@suse.cz> * Copyright 2008, Johannes Berg <johannes@sipsolutions.net> * Copyright 2013-2014 Intel Mobile Communications GmbH * Copyright (c) 2016 Intel Deutschland GmbH * Copyright (C) 2018-2024 Intel Corporation */ #include <linux/slab.h> #include <linux/kernel.h> #include <linux/if_arp.h> #include <linux/netdevice.h> #include <linux/rtnetlink.h> #include <linux/kcov.h> #include <net/mac80211.h> #include <net/ieee80211_radiotap.h> #include "ieee80211_i.h" #include "sta_info.h" #include "debugfs_netdev.h" #include "mesh.h" #include "led.h" #include "driver-ops.h" #include "wme.h" #include "rate.h" /** * DOC: Interface list locking * * The interface list in each struct ieee80211_local is protected * three-fold: * * (1) modifications may only be done under the RTNL *and* wiphy mutex * *and* iflist_mtx * (2) modifications are done in an RCU manner so atomic readers * can traverse the list in RCU-safe blocks. * * As a consequence, reads (traversals) of the list can be protected * by either the RTNL, the wiphy mutex, the iflist_mtx or RCU. */ static void ieee80211_iface_work(struct wiphy *wiphy, struct wiphy_work *work); bool __ieee80211_recalc_txpower(struct ieee80211_sub_if_data *sdata) { struct ieee80211_chanctx_conf *chanctx_conf; int power; rcu_read_lock(); chanctx_conf = rcu_dereference(sdata->vif.bss_conf.chanctx_conf); if (!chanctx_conf) { rcu_read_unlock(); return false; } power = ieee80211_chandef_max_power(&chanctx_conf->def); rcu_read_unlock(); if (sdata->deflink.user_power_level != IEEE80211_UNSET_POWER_LEVEL) power = min(power, sdata->deflink.user_power_level); if (sdata->deflink.ap_power_level != IEEE80211_UNSET_POWER_LEVEL) power = min(power, sdata->deflink.ap_power_level); if (power != sdata->vif.bss_conf.txpower) { sdata->vif.bss_conf.txpower = power; ieee80211_hw_config(sdata->local, 0); return true; } return false; } void ieee80211_recalc_txpower(struct ieee80211_sub_if_data *sdata, bool update_bss) { if (__ieee80211_recalc_txpower(sdata) || (update_bss && ieee80211_sdata_running(sdata))) ieee80211_link_info_change_notify(sdata, &sdata->deflink, BSS_CHANGED_TXPOWER); } static u32 __ieee80211_idle_off(struct ieee80211_local *local) { if (!(local->hw.conf.flags & IEEE80211_CONF_IDLE)) return 0; local->hw.conf.flags &= ~IEEE80211_CONF_IDLE; return IEEE80211_CONF_CHANGE_IDLE; } static u32 __ieee80211_idle_on(struct ieee80211_local *local) { if (local->hw.conf.flags & IEEE80211_CONF_IDLE) return 0; ieee80211_flush_queues(local, NULL, false); local->hw.conf.flags |= IEEE80211_CONF_IDLE; return IEEE80211_CONF_CHANGE_IDLE; } static u32 __ieee80211_recalc_idle(struct ieee80211_local *local, bool force_active) { bool working, scanning, active; unsigned int led_trig_start = 0, led_trig_stop = 0; lockdep_assert_wiphy(local->hw.wiphy); active = force_active || !list_empty(&local->chanctx_list) || local->monitors; working = !local->ops->remain_on_channel && !list_empty(&local->roc_list); scanning = test_bit(SCAN_SW_SCANNING, &local->scanning) || test_bit(SCAN_ONCHANNEL_SCANNING, &local->scanning); if (working || scanning) led_trig_start |= IEEE80211_TPT_LEDTRIG_FL_WORK; else led_trig_stop |= IEEE80211_TPT_LEDTRIG_FL_WORK; if (active) led_trig_start |= IEEE80211_TPT_LEDTRIG_FL_CONNECTED; else led_trig_stop |= IEEE80211_TPT_LEDTRIG_FL_CONNECTED; ieee80211_mod_tpt_led_trig(local, led_trig_start, led_trig_stop); if (working || scanning || active) return __ieee80211_idle_off(local); return __ieee80211_idle_on(local); } u32 ieee80211_idle_off(struct ieee80211_local *local) { return __ieee80211_recalc_idle(local, true); } void ieee80211_recalc_idle(struct ieee80211_local *local) { u32 change = __ieee80211_recalc_idle(local, false); if (change) ieee80211_hw_config(local, change); } static int ieee80211_verify_mac(struct ieee80211_sub_if_data *sdata, u8 *addr, bool check_dup) { struct ieee80211_local *local = sdata->local; struct ieee80211_sub_if_data *iter; u64 new, mask, tmp; u8 *m; int ret = 0; lockdep_assert_wiphy(local->hw.wiphy); if (is_zero_ether_addr(local->hw.wiphy->addr_mask)) return 0; m = addr; new = ((u64)m[0] << 5*8) | ((u64)m[1] << 4*8) | ((u64)m[2] << 3*8) | ((u64)m[3] << 2*8) | ((u64)m[4] << 1*8) | ((u64)m[5] << 0*8); m = local->hw.wiphy->addr_mask; mask = ((u64)m[0] << 5*8) | ((u64)m[1] << 4*8) | ((u64)m[2] << 3*8) | ((u64)m[3] << 2*8) | ((u64)m[4] << 1*8) | ((u64)m[5] << 0*8); if (!check_dup) return ret; list_for_each_entry(iter, &local->interfaces, list) { if (iter == sdata) continue; if (iter->vif.type == NL80211_IFTYPE_MONITOR && !(iter->u.mntr.flags & MONITOR_FLAG_ACTIVE)) continue; m = iter->vif.addr; tmp = ((u64)m[0] << 5*8) | ((u64)m[1] << 4*8) | ((u64)m[2] << 3*8) | ((u64)m[3] << 2*8) | ((u64)m[4] << 1*8) | ((u64)m[5] << 0*8); if ((new & ~mask) != (tmp & ~mask)) { ret = -EINVAL; break; } } return ret; } static int ieee80211_can_powered_addr_change(struct ieee80211_sub_if_data *sdata) { struct ieee80211_roc_work *roc; struct ieee80211_local *local = sdata->local; struct ieee80211_sub_if_data *scan_sdata; int ret = 0; lockdep_assert_wiphy(local->hw.wiphy); /* To be the most flexible here we want to only limit changing the * address if the specific interface is doing offchannel work or * scanning. */ if (netif_carrier_ok(sdata->dev)) return -EBUSY; /* First check no ROC work is happening on this iface */ list_for_each_entry(roc, &local->roc_list, list) { if (roc->sdata != sdata) continue; if (roc->started) { ret = -EBUSY; goto unlock; } } /* And if this iface is scanning */ if (local->scanning) { scan_sdata = rcu_dereference_protected(local->scan_sdata, lockdep_is_held(&local->hw.wiphy->mtx)); if (sdata == scan_sdata) ret = -EBUSY; } switch (sdata->vif.type) { case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_P2P_CLIENT: /* More interface types could be added here but changing the * address while powered makes the most sense in client modes. */ break; default: ret = -EOPNOTSUPP; } unlock: return ret; } static int _ieee80211_change_mac(struct ieee80211_sub_if_data *sdata, void *addr) { struct ieee80211_local *local = sdata->local; struct sockaddr *sa = addr; bool check_dup = true; bool live = false; int ret; if (ieee80211_sdata_running(sdata)) { ret = ieee80211_can_powered_addr_change(sdata); if (ret) return ret; live = true; } if (sdata->vif.type == NL80211_IFTYPE_MONITOR && !(sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE)) check_dup = false; ret = ieee80211_verify_mac(sdata, sa->sa_data, check_dup); if (ret) return ret; if (live) drv_remove_interface(local, sdata); ret = eth_mac_addr(sdata->dev, sa); if (ret == 0) { memcpy(sdata->vif.addr, sa->sa_data, ETH_ALEN); ether_addr_copy(sdata->vif.bss_conf.addr, sdata->vif.addr); } /* Regardless of eth_mac_addr() return we still want to add the * interface back. This should not fail... */ if (live) WARN_ON(drv_add_interface(local, sdata)); return ret; } static int ieee80211_change_mac(struct net_device *dev, void *addr) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); struct ieee80211_local *local = sdata->local; int ret; /* * This happens during unregistration if there's a bond device * active (maybe other cases?) and we must get removed from it. * But we really don't care anymore if it's not registered now. */ if (!dev->ieee80211_ptr->registered) return 0; wiphy_lock(local->hw.wiphy); ret = _ieee80211_change_mac(sdata, addr); wiphy_unlock(local->hw.wiphy); return ret; } static inline int identical_mac_addr_allowed(int type1, int type2) { return type1 == NL80211_IFTYPE_MONITOR || type2 == NL80211_IFTYPE_MONITOR || type1 == NL80211_IFTYPE_P2P_DEVICE || type2 == NL80211_IFTYPE_P2P_DEVICE || (type1 == NL80211_IFTYPE_AP && type2 == NL80211_IFTYPE_AP_VLAN) || (type1 == NL80211_IFTYPE_AP_VLAN && (type2 == NL80211_IFTYPE_AP || type2 == NL80211_IFTYPE_AP_VLAN)); } static int ieee80211_check_concurrent_iface(struct ieee80211_sub_if_data *sdata, enum nl80211_iftype iftype) { struct ieee80211_local *local = sdata->local; struct ieee80211_sub_if_data *nsdata; ASSERT_RTNL(); lockdep_assert_wiphy(local->hw.wiphy); /* we hold the RTNL here so can safely walk the list */ list_for_each_entry(nsdata, &local->interfaces, list) { if (nsdata != sdata && ieee80211_sdata_running(nsdata)) { /* * Only OCB and monitor mode may coexist */ if ((sdata->vif.type == NL80211_IFTYPE_OCB && nsdata->vif.type != NL80211_IFTYPE_MONITOR) || (sdata->vif.type != NL80211_IFTYPE_MONITOR && nsdata->vif.type == NL80211_IFTYPE_OCB)) return -EBUSY; /* * Allow only a single IBSS interface to be up at any * time. This is restricted because beacon distribution * cannot work properly if both are in the same IBSS. * * To remove this restriction we'd have to disallow them * from setting the same SSID on different IBSS interfaces * belonging to the same hardware. Then, however, we're * faced with having to adopt two different TSF timers... */ if (iftype == NL80211_IFTYPE_ADHOC && nsdata->vif.type == NL80211_IFTYPE_ADHOC) return -EBUSY; /* * will not add another interface while any channel * switch is active. */ if (nsdata->vif.bss_conf.csa_active) return -EBUSY; /* * The remaining checks are only performed for interfaces * with the same MAC address. */ if (!ether_addr_equal(sdata->vif.addr, nsdata->vif.addr)) continue; /* * check whether it may have the same address */ if (!identical_mac_addr_allowed(iftype, nsdata->vif.type)) return -ENOTUNIQ; /* No support for VLAN with MLO yet */ if (iftype == NL80211_IFTYPE_AP_VLAN && sdata->wdev.use_4addr && nsdata->vif.type == NL80211_IFTYPE_AP && nsdata->vif.valid_links) return -EOPNOTSUPP; /* * can only add VLANs to enabled APs */ if (iftype == NL80211_IFTYPE_AP_VLAN && nsdata->vif.type == NL80211_IFTYPE_AP) sdata->bss = &nsdata->u.ap; } } return ieee80211_check_combinations(sdata, NULL, 0, 0); } static int ieee80211_check_queues(struct ieee80211_sub_if_data *sdata, enum nl80211_iftype iftype) { int n_queues = sdata->local->hw.queues; int i; if (iftype == NL80211_IFTYPE_NAN) return 0; if (iftype != NL80211_IFTYPE_P2P_DEVICE) { for (i = 0; i < IEEE80211_NUM_ACS; i++) { if (WARN_ON_ONCE(sdata->vif.hw_queue[i] == IEEE80211_INVAL_HW_QUEUE)) return -EINVAL; if (WARN_ON_ONCE(sdata->vif.hw_queue[i] >= n_queues)) return -EINVAL; } } if ((iftype != NL80211_IFTYPE_AP && iftype != NL80211_IFTYPE_P2P_GO && iftype != NL80211_IFTYPE_MESH_POINT) || !ieee80211_hw_check(&sdata->local->hw, QUEUE_CONTROL)) { sdata->vif.cab_queue = IEEE80211_INVAL_HW_QUEUE; return 0; } if (WARN_ON_ONCE(sdata->vif.cab_queue == IEEE80211_INVAL_HW_QUEUE)) return -EINVAL; if (WARN_ON_ONCE(sdata->vif.cab_queue >= n_queues)) return -EINVAL; return 0; } static int ieee80211_open(struct net_device *dev) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); int err; /* fail early if user set an invalid address */ if (!is_valid_ether_addr(dev->dev_addr)) return -EADDRNOTAVAIL; wiphy_lock(sdata->local->hw.wiphy); err = ieee80211_check_concurrent_iface(sdata, sdata->vif.type); if (err) goto out; err = ieee80211_do_open(&sdata->wdev, true); out: wiphy_unlock(sdata->local->hw.wiphy); return err; } static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, bool going_down) { struct ieee80211_local *local = sdata->local; unsigned long flags; struct sk_buff *skb, *tmp; u32 hw_reconf_flags = 0; int i, flushed; struct ps_data *ps; struct cfg80211_chan_def chandef; bool cancel_scan; struct cfg80211_nan_func *func; lockdep_assert_wiphy(local->hw.wiphy); clear_bit(SDATA_STATE_RUNNING, &sdata->state); synchronize_rcu(); /* flush _ieee80211_wake_txqs() */ cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata; if (cancel_scan) ieee80211_scan_cancel(local); ieee80211_roc_purge(local, sdata); switch (sdata->vif.type) { case NL80211_IFTYPE_STATION: ieee80211_mgd_stop(sdata); break; case NL80211_IFTYPE_ADHOC: ieee80211_ibss_stop(sdata); break; case NL80211_IFTYPE_MONITOR: if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) break; list_del_rcu(&sdata->u.mntr.list); break; default: break; } /* * Remove all stations associated with this interface. * * This must be done before calling ops->remove_interface() * because otherwise we can later invoke ops->sta_notify() * whenever the STAs are removed, and that invalidates driver * assumptions about always getting a vif pointer that is valid * (because if we remove a STA after ops->remove_interface() * the driver will have removed the vif info already!) * * For AP_VLANs stations may exist since there's nothing else that * would have removed them, but in other modes there shouldn't * be any stations. */ flushed = sta_info_flush(sdata, -1); WARN_ON_ONCE(sdata->vif.type != NL80211_IFTYPE_AP_VLAN && flushed > 0); /* don't count this interface for allmulti while it is down */ if (sdata->flags & IEEE80211_SDATA_ALLMULTI) atomic_dec(&local->iff_allmultis); if (sdata->vif.type == NL80211_IFTYPE_AP) { local->fif_pspoll--; local->fif_probe_req--; } else if (sdata->vif.type == NL80211_IFTYPE_ADHOC) { local->fif_probe_req--; } if (sdata->dev) { netif_addr_lock_bh(sdata->dev); spin_lock_bh(&local->filter_lock); __hw_addr_unsync(&local->mc_list, &sdata->dev->mc, sdata->dev->addr_len); spin_unlock_bh(&local->filter_lock); netif_addr_unlock_bh(sdata->dev); } del_timer_sync(&local->dynamic_ps_timer); wiphy_work_cancel(local->hw.wiphy, &local->dynamic_ps_enable_work); WARN(ieee80211_vif_is_mld(&sdata->vif), "destroying interface with valid links 0x%04x\n", sdata->vif.valid_links); sdata->vif.bss_conf.csa_active = false; if (sdata->vif.type == NL80211_IFTYPE_STATION) sdata->deflink.u.mgd.csa_waiting_bcn = false; if (sdata->csa_blocked_tx) { ieee80211_wake_vif_queues(local, sdata, IEEE80211_QUEUE_STOP_REASON_CSA); sdata->csa_blocked_tx = false; } wiphy_work_cancel(local->hw.wiphy, &sdata->deflink.csa_finalize_work); wiphy_work_cancel(local->hw.wiphy, &sdata->deflink.color_change_finalize_work); wiphy_delayed_work_cancel(local->hw.wiphy, &sdata->deflink.dfs_cac_timer_work); if (sdata->wdev.cac_started) { chandef = sdata->vif.bss_conf.chanreq.oper; WARN_ON(local->suspended); ieee80211_link_release_channel(&sdata->deflink); cfg80211_cac_event(sdata->dev, &chandef, NL80211_RADAR_CAC_ABORTED, GFP_KERNEL); } if (sdata->vif.type == NL80211_IFTYPE_AP) { WARN_ON(!list_empty(&sdata->u.ap.vlans)); } else if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN) { /* remove all packets in parent bc_buf pointing to this dev */ ps = &sdata->bss->ps; spin_lock_irqsave(&ps->bc_buf.lock, flags); skb_queue_walk_safe(&ps->bc_buf, skb, tmp) { if (skb->dev == sdata->dev) { __skb_unlink(skb, &ps->bc_buf); local->total_ps_buffered--; ieee80211_free_txskb(&local->hw, skb); } } spin_unlock_irqrestore(&ps->bc_buf.lock, flags); } if (going_down) local->open_count--; switch (sdata->vif.type) { case NL80211_IFTYPE_AP_VLAN: list_del(&sdata->u.vlan.list); RCU_INIT_POINTER(sdata->vif.bss_conf.chanctx_conf, NULL); /* see comment in the default case below */ ieee80211_free_keys(sdata, true); /* no need to tell driver */ break; case NL80211_IFTYPE_MONITOR: if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) { local->cooked_mntrs--; break; } local->monitors--; if (local->monitors == 0) { local->hw.conf.flags &= ~IEEE80211_CONF_MONITOR; hw_reconf_flags |= IEEE80211_CONF_CHANGE_MONITOR; } ieee80211_adjust_monitor_flags(sdata, -1); break; case NL80211_IFTYPE_NAN: /* clean all the functions */ spin_lock_bh(&sdata->u.nan.func_lock); idr_for_each_entry(&sdata->u.nan.function_inst_ids, func, i) { idr_remove(&sdata->u.nan.function_inst_ids, i); cfg80211_free_nan_func(func); } idr_destroy(&sdata->u.nan.function_inst_ids); spin_unlock_bh(&sdata->u.nan.func_lock); break; case NL80211_IFTYPE_P2P_DEVICE: /* relies on synchronize_rcu() below */ RCU_INIT_POINTER(local->p2p_sdata, NULL); fallthrough; default: wiphy_work_cancel(sdata->local->hw.wiphy, &sdata->work); /* * When we get here, the interface is marked down. * Free the remaining keys, if there are any * (which can happen in AP mode if userspace sets * keys before the interface is operating) * * Force the key freeing to always synchronize_net() * to wait for the RX path in case it is using this * interface enqueuing frames at this very time on * another CPU. */ ieee80211_free_keys(sdata, true); skb_queue_purge(&sdata->skb_queue); skb_queue_purge(&sdata->status_queue); } spin_lock_irqsave(&local->queue_stop_reason_lock, flags); for (i = 0; i < IEEE80211_MAX_QUEUES; i++) { skb_queue_walk_safe(&local->pending[i], skb, tmp) { struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); if (info->control.vif == &sdata->vif) { __skb_unlink(skb, &local->pending[i]); ieee80211_free_txskb(&local->hw, skb); } } } spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags); if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN) ieee80211_txq_remove_vlan(local, sdata); sdata->bss = NULL; if (local->open_count == 0) ieee80211_clear_tx_pending(local); sdata->vif.bss_conf.beacon_int = 0; /* * If the interface goes down while suspended, presumably because * the device was unplugged and that happens before our resume, * then the driver is already unconfigured and the remainder of * this function isn't needed. * XXX: what about WoWLAN? If the device has software state, e.g. * memory allocated, it might expect teardown commands from * mac80211 here? */ if (local->suspended) { WARN_ON(local->wowlan); WARN_ON(rcu_access_pointer(local->monitor_sdata)); return; } switch (sdata->vif.type) { case NL80211_IFTYPE_AP_VLAN: break; case NL80211_IFTYPE_MONITOR: if (local->monitors == 0) ieee80211_del_virtual_monitor(local); ieee80211_recalc_idle(local); if (!(sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE)) break; fallthrough; default: if (going_down) drv_remove_interface(local, sdata); } ieee80211_recalc_ps(local); if (cancel_scan) wiphy_delayed_work_flush(local->hw.wiphy, &local->scan_work); if (local->open_count == 0) { ieee80211_stop_device(local); /* no reconfiguring after stop! */ return; } /* do after stop to avoid reconfiguring when we stop anyway */ ieee80211_configure_filter(local); ieee80211_hw_config(local, hw_reconf_flags); if (local->monitors == local->open_count) ieee80211_add_virtual_monitor(local); } static void ieee80211_stop_mbssid(struct ieee80211_sub_if_data *sdata) { struct ieee80211_sub_if_data *tx_sdata, *non_tx_sdata, *tmp_sdata; struct ieee80211_vif *tx_vif = sdata->vif.mbssid_tx_vif; if (!tx_vif) return; tx_sdata = vif_to_sdata(tx_vif); sdata->vif.mbssid_tx_vif = NULL; list_for_each_entry_safe(non_tx_sdata, tmp_sdata, &tx_sdata->local->interfaces, list) { if (non_tx_sdata != sdata && non_tx_sdata != tx_sdata && non_tx_sdata->vif.mbssid_tx_vif == tx_vif && ieee80211_sdata_running(non_tx_sdata)) { non_tx_sdata->vif.mbssid_tx_vif = NULL; dev_close(non_tx_sdata->wdev.netdev); } } if (sdata != tx_sdata && ieee80211_sdata_running(tx_sdata)) { tx_sdata->vif.mbssid_tx_vif = NULL; dev_close(tx_sdata->wdev.netdev); } } static int ieee80211_stop(struct net_device *dev) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); /* close dependent VLAN and MBSSID interfaces before locking wiphy */ if (sdata->vif.type == NL80211_IFTYPE_AP) { struct ieee80211_sub_if_data *vlan, *tmpsdata; list_for_each_entry_safe(vlan, tmpsdata, &sdata->u.ap.vlans, u.vlan.list) dev_close(vlan->dev); ieee80211_stop_mbssid(sdata); } wiphy_lock(sdata->local->hw.wiphy); wiphy_work_cancel(sdata->local->hw.wiphy, &sdata->activate_links_work); ieee80211_do_stop(sdata, true); wiphy_unlock(sdata->local->hw.wiphy); return 0; } static void ieee80211_set_multicast_list(struct net_device *dev) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); struct ieee80211_local *local = sdata->local; int allmulti, sdata_allmulti; allmulti = !!(dev->flags & IFF_ALLMULTI); sdata_allmulti = !!(sdata->flags & IEEE80211_SDATA_ALLMULTI); if (allmulti != sdata_allmulti) { if (dev->flags & IFF_ALLMULTI) atomic_inc(&local->iff_allmultis); else atomic_dec(&local->iff_allmultis); sdata->flags ^= IEEE80211_SDATA_ALLMULTI; } spin_lock_bh(&local->filter_lock); __hw_addr_sync(&local->mc_list, &dev->mc, dev->addr_len); spin_unlock_bh(&local->filter_lock); wiphy_work_queue(local->hw.wiphy, &local->reconfig_filter); } /* * Called when the netdev is removed or, by the code below, before * the interface type changes. */ static void ieee80211_teardown_sdata(struct ieee80211_sub_if_data *sdata) { /* free extra data */ ieee80211_free_keys(sdata, false); ieee80211_debugfs_remove_netdev(sdata); ieee80211_destroy_frag_cache(&sdata->frags); if (ieee80211_vif_is_mesh(&sdata->vif)) ieee80211_mesh_teardown_sdata(sdata); ieee80211_vif_clear_links(sdata); ieee80211_link_stop(&sdata->deflink); } static void ieee80211_uninit(struct net_device *dev) { ieee80211_teardown_sdata(IEEE80211_DEV_TO_SUB_IF(dev)); } static void ieee80211_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) { dev_fetch_sw_netstats(stats, dev->tstats); } static int ieee80211_netdev_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); struct ieee80211_local *local = sdata->local; return drv_net_setup_tc(local, sdata, dev, type, type_data); } static const struct net_device_ops ieee80211_dataif_ops = { .ndo_open = ieee80211_open, .ndo_stop = ieee80211_stop, .ndo_uninit = ieee80211_uninit, .ndo_start_xmit = ieee80211_subif_start_xmit, .ndo_set_rx_mode = ieee80211_set_multicast_list, .ndo_set_mac_address = ieee80211_change_mac, .ndo_get_stats64 = ieee80211_get_stats64, .ndo_setup_tc = ieee80211_netdev_setup_tc, }; static u16 ieee80211_monitor_select_queue(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev) { struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev); struct ieee80211_local *local = sdata->local; struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); struct ieee80211_hdr *hdr; int len_rthdr; if (local->hw.queues < IEEE80211_NUM_ACS) return 0; /* reset flags and info before parsing radiotap header */ memset(info, 0, sizeof(*info)); if (!ieee80211_parse_tx_radiotap(skb, dev)) return 0; /* doesn't matter, frame will be dropped */ len_rthdr = ieee80211_get_radiotap_len(skb->data); hdr = (struct ieee80211_hdr *)(skb->data + len_rthdr); if (skb->len < len_rthdr + 2 || skb->len < len_rthdr + ieee80211_hdrlen(hdr->frame_control)) return 0; /* doesn't matter, frame will be dropped */ return ieee80211_select_queue_80211(sdata, skb, hdr); } static const struct net_device_ops ieee80211_monitorif_ops = { .ndo_open = ieee80211_open, .ndo_stop = ieee80211_stop, .ndo_uninit = ieee80211_uninit, .ndo_start_xmit = ieee80211_monitor_start_xmit, .ndo_set_rx_mode = ieee80211_set_multicast_list, .ndo_set_mac_address = ieee80211_change_mac, .ndo_select_queue = ieee80211_monitor_select_queue, .ndo_get_stats64 = ieee80211_get_stats64, }; static int ieee80211_netdev_fill_forward_path(struct net_device_path_ctx *ctx, struct net_device_path *path) { struct ieee80211_sub_if_data *sdata; struct ieee80211_local *local; struct sta_info *sta; int ret = -ENOENT; sdata = IEEE80211_DEV_TO_SUB_IF(ctx->dev); local = sdata->local; if (!local->ops->net_fill_forward_path) return -EOPNOTSUPP; rcu_read_lock(); switch (sdata->vif.type) { case NL80211_IFTYPE_AP_VLAN: sta = rcu_dereference(sdata->u.vlan.sta); if (sta) break; if (sdata->wdev.use_4addr) goto out; if (is_multicast_ether_addr(ctx->daddr)) goto out; sta = sta_info_get_bss(sdata, ctx->daddr); break; case NL80211_IFTYPE_AP: if (is_multicast_ether_addr(ctx->daddr)) goto out; sta = sta_info_get(sdata, ctx->daddr); break; case NL80211_IFTYPE_STATION: if (sdata->wdev.wiphy->flags & WIPHY_FLAG_SUPPORTS_TDLS) { sta = sta_info_get(sdata, ctx->daddr); if (sta && test_sta_flag(sta, WLAN_STA_TDLS_PEER)) { if (!test_sta_flag(sta, WLAN_STA_TDLS_PEER_AUTH)) goto out; break; } } sta = sta_info_get(sdata, sdata->deflink.u.mgd.bssid); break; default: goto out; } if (!sta) goto out; ret = drv_net_fill_forward_path(local, sdata, &sta->sta, ctx, path); out: rcu_read_unlock(); return ret; } static const struct net_device_ops ieee80211_dataif_8023_ops = { .ndo_open = ieee80211_open, .ndo_stop = ieee80211_stop, .ndo_uninit = ieee80211_uninit, .ndo_start_xmit = ieee80211_subif_start_xmit_8023, .ndo_set_rx_mode = ieee80211_set_multicast_list, .ndo_set_mac_address = ieee80211_change_mac, .ndo_get_stats64 = ieee80211_get_stats64, .ndo_fill_forward_path = ieee80211_netdev_fill_forward_path, .ndo_setup_tc = ieee80211_netdev_setup_tc, }; static bool ieee80211_iftype_supports_hdr_offload(enum nl80211_iftype iftype) { switch (iftype) { /* P2P GO and client are mapped to AP/STATION types */ case NL80211_IFTYPE_AP: case NL80211_IFTYPE_STATION: return true; default: return false; } } static bool ieee80211_set_sdata_offload_flags(struct ieee80211_sub_if_data *sdata) { struct ieee80211_local *local = sdata->local; u32 flags; flags = sdata->vif.offload_flags; if (ieee80211_hw_check(&local->hw, SUPPORTS_TX_ENCAP_OFFLOAD) && ieee80211_iftype_supports_hdr_offload(sdata->vif.type)) { flags |= IEEE80211_OFFLOAD_ENCAP_ENABLED; if (!ieee80211_hw_check(&local->hw, SUPPORTS_TX_FRAG) && local->hw.wiphy->frag_threshold != (u32)-1) flags &= ~IEEE80211_OFFLOAD_ENCAP_ENABLED; if (local->monitors) flags &= ~IEEE80211_OFFLOAD_ENCAP_ENABLED; } else { flags &= ~IEEE80211_OFFLOAD_ENCAP_ENABLED; } if (ieee80211_hw_check(&local->hw, SUPPORTS_RX_DECAP_OFFLOAD) && ieee80211_iftype_supports_hdr_offload(sdata->vif.type)) { flags |= IEEE80211_OFFLOAD_DECAP_ENABLED; if (local->monitors && !ieee80211_hw_check(&local->hw, SUPPORTS_CONC_MON_RX_DECAP)) flags &= ~IEEE80211_OFFLOAD_DECAP_ENABLED; } else { flags &= ~IEEE80211_OFFLOAD_DECAP_ENABLED; } if (sdata->vif.offload_flags == flags) return false; sdata->vif.offload_flags = flags; ieee80211_check_fast_rx_iface(sdata); return true; } static void ieee80211_set_vif_encap_ops(struct ieee80211_sub_if_data *sdata) { struct ieee80211_local *local = sdata->local; struct ieee80211_sub_if_data *bss = sdata; bool enabled; if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN) { if (!sdata->bss) return; bss = container_of(sdata->bss, struct ieee80211_sub_if_data, u.ap); } if (!ieee80211_hw_check(&local->hw, SUPPORTS_TX_ENCAP_OFFLOAD) || !ieee80211_iftype_supports_hdr_offload(bss->vif.type)) return; enabled = bss->vif.offload_flags & IEEE80211_OFFLOAD_ENCAP_ENABLED; if (sdata->wdev.use_4addr && !(bss->vif.offload_flags & IEEE80211_OFFLOAD_ENCAP_4ADDR)) enabled = false; sdata->dev->netdev_ops = enabled ? &ieee80211_dataif_8023_ops : &ieee80211_dataif_ops; } static void ieee80211_recalc_sdata_offload(struct ieee80211_sub_if_data *sdata) { struct ieee80211_local *local = sdata->local; struct ieee80211_sub_if_data *vsdata; if (ieee80211_set_sdata_offload_flags(sdata)) { drv_update_vif_offload(local, sdata); ieee80211_set_vif_encap_ops(sdata); } list_for_each_entry(vsdata, &local->interfaces, list) { if (vsdata->vif.type != NL80211_IFTYPE_AP_VLAN || vsdata->bss != &sdata->u.ap) continue; ieee80211_set_vif_encap_ops(vsdata); } } void ieee80211_recalc_offload(struct ieee80211_local *local) { struct ieee80211_sub_if_data *sdata; if (!ieee80211_hw_check(&local->hw, SUPPORTS_TX_ENCAP_OFFLOAD)) return; lockdep_assert_wiphy(local->hw.wiphy); list_for_each_entry(sdata, &local->interfaces, list) { if (!ieee80211_sdata_running(sdata)) continue; ieee80211_recalc_sdata_offload(sdata); } } void ieee80211_adjust_monitor_flags(struct ieee80211_sub_if_data *sdata, const int offset) { struct ieee80211_local *local = sdata->local; u32 flags = sdata->u.mntr.flags; #define ADJUST(_f, _s) do { \ if (flags & MONITOR_FLAG_##_f) \ local->fif_##_s += offset; \ } while (0) ADJUST(FCSFAIL, fcsfail); ADJUST(PLCPFAIL, plcpfail); ADJUST(CONTROL, control); ADJUST(CONTROL, pspoll); ADJUST(OTHER_BSS, other_bss); #undef ADJUST } static void ieee80211_set_default_queues(struct ieee80211_sub_if_data *sdata) { struct ieee80211_local *local = sdata->local; int i; for (i = 0; i < IEEE80211_NUM_ACS; i++) { if (ieee80211_hw_check(&local->hw, QUEUE_CONTROL)) sdata->vif.hw_queue[i] = IEEE80211_INVAL_HW_QUEUE; else if (local->hw.queues >= IEEE80211_NUM_ACS) sdata->vif.hw_queue[i] = i; else sdata->vif.hw_queue[i] = 0; } sdata->vif.cab_queue = IEEE80211_INVAL_HW_QUEUE; } static void ieee80211_sdata_init(struct ieee80211_local *local, struct ieee80211_sub_if_data *sdata) { sdata->local = local; /* * Initialize the default link, so we can use link_id 0 for non-MLD, * and that continues to work for non-MLD-aware drivers that use just * vif.bss_conf instead of vif.link_conf. * * Note that we never change this, so if link ID 0 isn't used in an * MLD connection, we get a separate allocation for it. */ ieee80211_link_init(sdata, -1, &sdata->deflink, &sdata->vif.bss_conf); } int ieee80211_add_virtual_monitor(struct ieee80211_local *local) { struct ieee80211_sub_if_data *sdata; int ret; if (!ieee80211_hw_check(&local->hw, WANT_MONITOR_VIF)) return 0; ASSERT_RTNL(); lockdep_assert_wiphy(local->hw.wiphy); if (local->monitor_sdata) return 0; sdata = kzalloc(sizeof(*sdata) + local->hw.vif_data_size, GFP_KERNEL); if (!sdata) return -ENOMEM; /* set up data */ sdata->vif.type = NL80211_IFTYPE_MONITOR; snprintf(sdata->name, IFNAMSIZ, "%s-monitor", wiphy_name(local->hw.wiphy)); sdata->wdev.iftype = NL80211_IFTYPE_MONITOR; sdata->wdev.wiphy = local->hw.wiphy; ieee80211_sdata_init(local, sdata); ieee80211_set_default_queues(sdata); ret = drv_add_interface(local, sdata); if (WARN_ON(ret)) { /* ok .. stupid driver, it asked for this! */ kfree(sdata); return ret; } set_bit(SDATA_STATE_RUNNING, &sdata->state); ret = ieee80211_check_queues(sdata, NL80211_IFTYPE_MONITOR); if (ret) { kfree(sdata); return ret; } mutex_lock(&local->iflist_mtx); rcu_assign_pointer(local->monitor_sdata, sdata); mutex_unlock(&local->iflist_mtx); ret = ieee80211_link_use_channel(&sdata->deflink, &local->monitor_chanreq, IEEE80211_CHANCTX_EXCLUSIVE); if (ret) { mutex_lock(&local->iflist_mtx); RCU_INIT_POINTER(local->monitor_sdata, NULL); mutex_unlock(&local->iflist_mtx); synchronize_net(); drv_remove_interface(local, sdata); kfree(sdata); return ret; } skb_queue_head_init(&sdata->skb_queue); skb_queue_head_init(&sdata->status_queue); wiphy_work_init(&sdata->work, ieee80211_iface_work); return 0; } void ieee80211_del_virtual_monitor(struct ieee80211_local *local) { struct ieee80211_sub_if_data *sdata; if (!ieee80211_hw_check(&local->hw, WANT_MONITOR_VIF)) return; ASSERT_RTNL(); lockdep_assert_wiphy(local->hw.wiphy); mutex_lock(&local->iflist_mtx); sdata = rcu_dereference_protected(local->monitor_sdata, lockdep_is_held(&local->iflist_mtx)); if (!sdata) { mutex_unlock(&local->iflist_mtx); return; } RCU_INIT_POINTER(local->monitor_sdata, NULL); mutex_unlock(&local->iflist_mtx); synchronize_net(); ieee80211_link_release_channel(&sdata->deflink); drv_remove_interface(local, sdata); kfree(sdata); } /* * NOTE: Be very careful when changing this function, it must NOT return * an error on interface type changes that have been pre-checked, so most * checks should be in ieee80211_check_concurrent_iface. */ int ieee80211_do_open(struct wireless_dev *wdev, bool coming_up) { struct ieee80211_sub_if_data *sdata = IEEE80211_WDEV_TO_SUB_IF(wdev); struct net_device *dev = wdev->netdev; struct ieee80211_local *local = sdata->local; u64 changed = 0; int res; u32 hw_reconf_flags = 0; lockdep_assert_wiphy(local->hw.wiphy); switch (sdata->vif.type) { case NL80211_IFTYPE_AP_VLAN: { struct ieee80211_sub_if_data *master; if (!sdata->bss) return -ENOLINK; list_add(&sdata->u.vlan.list, &sdata->bss->vlans); master = container_of(sdata->bss, struct ieee80211_sub_if_data, u.ap); sdata->control_port_protocol = master->control_port_protocol; sdata->control_port_no_encrypt = master->control_port_no_encrypt; sdata->control_port_over_nl80211 = master->control_port_over_nl80211; sdata->control_port_no_preauth = master->control_port_no_preauth; sdata->vif.cab_queue = master->vif.cab_queue; memcpy(sdata->vif.hw_queue, master->vif.hw_queue, sizeof(sdata->vif.hw_queue)); sdata->vif.bss_conf.chanreq = master->vif.bss_conf.chanreq; sdata->crypto_tx_tailroom_needed_cnt += master->crypto_tx_tailroom_needed_cnt; break; } case NL80211_IFTYPE_AP: sdata->bss = &sdata->u.ap; break; case NL80211_IFTYPE_MESH_POINT: case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_MONITOR: case NL80211_IFTYPE_ADHOC: case NL80211_IFTYPE_P2P_DEVICE: case NL80211_IFTYPE_OCB: case NL80211_IFTYPE_NAN: /* no special treatment */ break; case NL80211_IFTYPE_UNSPECIFIED: case NUM_NL80211_IFTYPES: case NL80211_IFTYPE_P2P_CLIENT: case NL80211_IFTYPE_P2P_GO: case NL80211_IFTYPE_WDS: /* cannot happen */ WARN_ON(1); break; } if (local->open_count == 0) { /* here we can consider everything in good order (again) */ local->reconfig_failure = false; res = drv_start(local); if (res) goto err_del_bss; ieee80211_led_radio(local, true); ieee80211_mod_tpt_led_trig(local, IEEE80211_TPT_LEDTRIG_FL_RADIO, 0); } /* * Copy the hopefully now-present MAC address to * this interface, if it has the special null one. */ if (dev && is_zero_ether_addr(dev->dev_addr)) { eth_hw_addr_set(dev, local->hw.wiphy->perm_addr); memcpy(dev->perm_addr, dev->dev_addr, ETH_ALEN); if (!is_valid_ether_addr(dev->dev_addr)) { res = -EADDRNOTAVAIL; goto err_stop; } } switch (sdata->vif.type) { case NL80211_IFTYPE_AP_VLAN: /* no need to tell driver, but set carrier and chanctx */ if (sdata->bss->active) { ieee80211_link_vlan_copy_chanctx(&sdata->deflink); netif_carrier_on(dev); ieee80211_set_vif_encap_ops(sdata); } else { netif_carrier_off(dev); } break; case NL80211_IFTYPE_MONITOR: if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) { local->cooked_mntrs++; break; } if (sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE) { res = drv_add_interface(local, sdata); if (res) goto err_stop; } else if (local->monitors == 0 && local->open_count == 0) { res = ieee80211_add_virtual_monitor(local); if (res) goto err_stop; } /* must be before the call to ieee80211_configure_filter */ local->monitors++; if (local->monitors == 1) { local->hw.conf.flags |= IEEE80211_CONF_MONITOR; hw_reconf_flags |= IEEE80211_CONF_CHANGE_MONITOR; } ieee80211_adjust_monitor_flags(sdata, 1); ieee80211_configure_filter(local); ieee80211_recalc_offload(local); ieee80211_recalc_idle(local); netif_carrier_on(dev); break; default: if (coming_up) { ieee80211_del_virtual_monitor(local); ieee80211_set_sdata_offload_flags(sdata); res = drv_add_interface(local, sdata); if (res) goto err_stop; ieee80211_set_vif_encap_ops(sdata); res = ieee80211_check_queues(sdata, ieee80211_vif_type_p2p(&sdata->vif)); if (res) goto err_del_interface; } if (sdata->vif.type == NL80211_IFTYPE_AP) { local->fif_pspoll++; local->fif_probe_req++; ieee80211_configure_filter(local); } else if (sdata->vif.type == NL80211_IFTYPE_ADHOC) { local->fif_probe_req++; } if (sdata->vif.probe_req_reg) drv_config_iface_filter(local, sdata, FIF_PROBE_REQ, FIF_PROBE_REQ); if (sdata->vif.type != NL80211_IFTYPE_P2P_DEVICE && sdata->vif.type != NL80211_IFTYPE_NAN) changed |= ieee80211_reset_erp_info(sdata); ieee80211_link_info_change_notify(sdata, &sdata->deflink, changed); switch (sdata->vif.type) { case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_ADHOC: case NL80211_IFTYPE_AP: case NL80211_IFTYPE_MESH_POINT: case NL80211_IFTYPE_OCB: netif_carrier_off(dev); break; case NL80211_IFTYPE_P2P_DEVICE: case NL80211_IFTYPE_NAN: break; default: /* not reached */ WARN_ON(1); } /* * Set default queue parameters so drivers don't * need to initialise the hardware if the hardware * doesn't start up with sane defaults. * Enable QoS for anything but station interfaces. */ ieee80211_set_wmm_default(&sdata->deflink, true, sdata->vif.type != NL80211_IFTYPE_STATION); } switch (sdata->vif.type) { case NL80211_IFTYPE_P2P_DEVICE: rcu_assign_pointer(local->p2p_sdata, sdata); break; case NL80211_IFTYPE_MONITOR: if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) break; list_add_tail_rcu(&sdata->u.mntr.list, &local->mon_list); break; default: break; } /* * set_multicast_list will be invoked by the networking core * which will check whether any increments here were done in * error and sync them down to the hardware as filter flags. */ if (sdata->flags & IEEE80211_SDATA_ALLMULTI) atomic_inc(&local->iff_allmultis); if (coming_up) local->open_count++; if (local->open_count == 1) ieee80211_hw_conf_init(local); else if (hw_reconf_flags) ieee80211_hw_config(local, hw_reconf_flags); ieee80211_recalc_ps(local); set_bit(SDATA_STATE_RUNNING, &sdata->state); return 0; err_del_interface: drv_remove_interface(local, sdata); err_stop: if (!local->open_count) drv_stop(local); err_del_bss: sdata->bss = NULL; if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN) list_del(&sdata->u.vlan.list); /* might already be clear but that doesn't matter */ clear_bit(SDATA_STATE_RUNNING, &sdata->state); return res; } static void ieee80211_if_free(struct net_device *dev) { free_percpu(dev->tstats); } static void ieee80211_if_setup(struct net_device *dev) { ether_setup(dev); dev->priv_flags &= ~IFF_TX_SKB_SHARING; dev->priv_flags |= IFF_NO_QUEUE; dev->netdev_ops = &ieee80211_dataif_ops; dev->needs_free_netdev = true; dev->priv_destructor = ieee80211_if_free; } static void ieee80211_iface_process_skb(struct ieee80211_local *local, struct ieee80211_sub_if_data *sdata, struct sk_buff *skb) { struct ieee80211_mgmt *mgmt = (void *)skb->data; lockdep_assert_wiphy(local->hw.wiphy); if (ieee80211_is_action(mgmt->frame_control) && mgmt->u.action.category == WLAN_CATEGORY_BACK) { struct sta_info *sta; int len = skb->len; sta = sta_info_get_bss(sdata, mgmt->sa); if (sta) { switch (mgmt->u.action.u.addba_req.action_code) { case WLAN_ACTION_ADDBA_REQ: ieee80211_process_addba_request(local, sta, mgmt, len); break; case WLAN_ACTION_ADDBA_RESP: ieee80211_process_addba_resp(local, sta, mgmt, len); break; case WLAN_ACTION_DELBA: ieee80211_process_delba(sdata, sta, mgmt, len); break; default: WARN_ON(1); break; } } } else if (ieee80211_is_action(mgmt->frame_control) && mgmt->u.action.category == WLAN_CATEGORY_VHT) { switch (mgmt->u.action.u.vht_group_notif.action_code) { case WLAN_VHT_ACTION_OPMODE_NOTIF: { struct ieee80211_rx_status *status; enum nl80211_band band; struct sta_info *sta; u8 opmode; status = IEEE80211_SKB_RXCB(skb); band = status->band; opmode = mgmt->u.action.u.vht_opmode_notif.operating_mode; sta = sta_info_get_bss(sdata, mgmt->sa); if (sta) ieee80211_vht_handle_opmode(sdata, &sta->deflink, opmode, band); break; } case WLAN_VHT_ACTION_GROUPID_MGMT: ieee80211_process_mu_groups(sdata, &sdata->deflink, mgmt); break; default: WARN_ON(1); break; } } else if (ieee80211_is_action(mgmt->frame_control) && mgmt->u.action.category == WLAN_CATEGORY_S1G) { switch (mgmt->u.action.u.s1g.action_code) { case WLAN_S1G_TWT_TEARDOWN: case WLAN_S1G_TWT_SETUP: ieee80211_s1g_rx_twt_action(sdata, skb); break; default: break; } } else if (ieee80211_is_action(mgmt->frame_control) && mgmt->u.action.category == WLAN_CATEGORY_PROTECTED_EHT) { if (sdata->vif.type == NL80211_IFTYPE_STATION) { switch (mgmt->u.action.u.ttlm_req.action_code) { case WLAN_PROTECTED_EHT_ACTION_TTLM_REQ: ieee80211_process_neg_ttlm_req(sdata, mgmt, skb->len); break; case WLAN_PROTECTED_EHT_ACTION_TTLM_RES: ieee80211_process_neg_ttlm_res(sdata, mgmt, skb->len); break; default: break; } } } else if (ieee80211_is_ext(mgmt->frame_control)) { if (sdata->vif.type == NL80211_IFTYPE_STATION) ieee80211_sta_rx_queued_ext(sdata, skb); else WARN_ON(1); } else if (ieee80211_is_data_qos(mgmt->frame_control)) { struct ieee80211_hdr *hdr = (void *)mgmt; struct sta_info *sta; /* * So the frame isn't mgmt, but frame_control * is at the right place anyway, of course, so * the if statement is correct. * * Warn if we have other data frame types here, * they must not get here. */ WARN_ON(hdr->frame_control & cpu_to_le16(IEEE80211_STYPE_NULLFUNC)); WARN_ON(!(hdr->seq_ctrl & cpu_to_le16(IEEE80211_SCTL_FRAG))); /* * This was a fragment of a frame, received while * a block-ack session was active. That cannot be * right, so terminate the session. */ sta = sta_info_get_bss(sdata, mgmt->sa); if (sta) { u16 tid = ieee80211_get_tid(hdr); __ieee80211_stop_rx_ba_session( sta, tid, WLAN_BACK_RECIPIENT, WLAN_REASON_QSTA_REQUIRE_SETUP, true); } } else switch (sdata->vif.type) { case NL80211_IFTYPE_STATION: ieee80211_sta_rx_queued_mgmt(sdata, skb); break; case NL80211_IFTYPE_ADHOC: ieee80211_ibss_rx_queued_mgmt(sdata, skb); break; case NL80211_IFTYPE_MESH_POINT: if (!ieee80211_vif_is_mesh(&sdata->vif)) break; ieee80211_mesh_rx_queued_mgmt(sdata, skb); break; default: WARN(1, "frame for unexpected interface type"); break; } } static void ieee80211_iface_process_status(struct ieee80211_sub_if_data *sdata, struct sk_buff *skb) { struct ieee80211_mgmt *mgmt = (void *)skb->data; if (ieee80211_is_action(mgmt->frame_control) && mgmt->u.action.category == WLAN_CATEGORY_S1G) { switch (mgmt->u.action.u.s1g.action_code) { case WLAN_S1G_TWT_TEARDOWN: case WLAN_S1G_TWT_SETUP: ieee80211_s1g_status_twt_action(sdata, skb); break; default: break; } } } static void ieee80211_iface_work(struct wiphy *wiphy, struct wiphy_work *work) { struct ieee80211_sub_if_data *sdata = container_of(work, struct ieee80211_sub_if_data, work); struct ieee80211_local *local = sdata->local; struct sk_buff *skb; if (!ieee80211_sdata_running(sdata)) return; if (test_bit(SCAN_SW_SCANNING, &local->scanning)) return; if (!ieee80211_can_run_worker(local)) return; /* first process frames */ while ((skb = skb_dequeue(&sdata->skb_queue))) { kcov_remote_start_common(skb_get_kcov_handle(skb)); if (skb->protocol == cpu_to_be16(ETH_P_TDLS)) ieee80211_process_tdls_channel_switch(sdata, skb); else ieee80211_iface_process_skb(local, sdata, skb); kfree_skb(skb); kcov_remote_stop(); } /* process status queue */ while ((skb = skb_dequeue(&sdata->status_queue))) { kcov_remote_start_common(skb_get_kcov_handle(skb)); ieee80211_iface_process_status(sdata, skb); kfree_skb(skb); kcov_remote_stop(); } /* then other type-dependent work */ switch (sdata->vif.type) { case NL80211_IFTYPE_STATION: ieee80211_sta_work(sdata); break; case NL80211_IFTYPE_ADHOC: ieee80211_ibss_work(sdata); break; case NL80211_IFTYPE_MESH_POINT: if (!ieee80211_vif_is_mesh(&sdata->vif)) break; ieee80211_mesh_work(sdata); break; case NL80211_IFTYPE_OCB: ieee80211_ocb_work(sdata); break; default: break; } } static void ieee80211_activate_links_work(struct wiphy *wiphy, struct wiphy_work *work) { struct ieee80211_sub_if_data *sdata = container_of(work, struct ieee80211_sub_if_data, activate_links_work); ieee80211_set_active_links(&sdata->vif, sdata->desired_active_links); } /* * Helper function to initialise an interface to a specific type. */ static void ieee80211_setup_sdata(struct ieee80211_sub_if_data *sdata, enum nl80211_iftype type) { static const u8 bssid_wildcard[ETH_ALEN] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff}; /* clear type-dependent unions */ memset(&sdata->u, 0, sizeof(sdata->u)); memset(&sdata->deflink.u, 0, sizeof(sdata->deflink.u)); /* and set some type-dependent values */ sdata->vif.type = type; sdata->vif.p2p = false; sdata->wdev.iftype = type; sdata->control_port_protocol = cpu_to_be16(ETH_P_PAE); sdata->control_port_no_encrypt = false; sdata->control_port_over_nl80211 = false; sdata->control_port_no_preauth = false; sdata->vif.cfg.idle = true; sdata->vif.bss_conf.txpower = INT_MIN; /* unset */ sdata->noack_map = 0; /* only monitor/p2p-device differ */ if (sdata->dev) { sdata->dev->netdev_ops = &ieee80211_dataif_ops; sdata->dev->type = ARPHRD_ETHER; } skb_queue_head_init(&sdata->skb_queue); skb_queue_head_init(&sdata->status_queue); wiphy_work_init(&sdata->work, ieee80211_iface_work); wiphy_work_init(&sdata->activate_links_work, ieee80211_activate_links_work); switch (type) { case NL80211_IFTYPE_P2P_GO: type = NL80211_IFTYPE_AP; sdata->vif.type = type; sdata->vif.p2p = true; fallthrough; case NL80211_IFTYPE_AP: skb_queue_head_init(&sdata->u.ap.ps.bc_buf); INIT_LIST_HEAD(&sdata->u.ap.vlans); sdata->vif.bss_conf.bssid = sdata->vif.addr; break; case NL80211_IFTYPE_P2P_CLIENT: type = NL80211_IFTYPE_STATION; sdata->vif.type = type; sdata->vif.p2p = true; fallthrough; case NL80211_IFTYPE_STATION: sdata->vif.bss_conf.bssid = sdata->deflink.u.mgd.bssid; ieee80211_sta_setup_sdata(sdata); break; case NL80211_IFTYPE_OCB: sdata->vif.bss_conf.bssid = bssid_wildcard; ieee80211_ocb_setup_sdata(sdata); break; case NL80211_IFTYPE_ADHOC: sdata->vif.bss_conf.bssid = sdata->u.ibss.bssid; ieee80211_ibss_setup_sdata(sdata); break; case NL80211_IFTYPE_MESH_POINT: if (ieee80211_vif_is_mesh(&sdata->vif)) ieee80211_mesh_init_sdata(sdata); break; case NL80211_IFTYPE_MONITOR: sdata->dev->type = ARPHRD_IEEE80211_RADIOTAP; sdata->dev->netdev_ops = &ieee80211_monitorif_ops; sdata->u.mntr.flags = MONITOR_FLAG_CONTROL | MONITOR_FLAG_OTHER_BSS; break; case NL80211_IFTYPE_NAN: idr_init(&sdata->u.nan.function_inst_ids); spin_lock_init(&sdata->u.nan.func_lock); sdata->vif.bss_conf.bssid = sdata->vif.addr; break; case NL80211_IFTYPE_AP_VLAN: case NL80211_IFTYPE_P2P_DEVICE: sdata->vif.bss_conf.bssid = sdata->vif.addr; break; case NL80211_IFTYPE_UNSPECIFIED: case NL80211_IFTYPE_WDS: case NUM_NL80211_IFTYPES: WARN_ON(1); break; } /* need to do this after the switch so vif.type is correct */ ieee80211_link_setup(&sdata->deflink); ieee80211_debugfs_recreate_netdev(sdata, false); } static int ieee80211_runtime_change_iftype(struct ieee80211_sub_if_data *sdata, enum nl80211_iftype type) { struct ieee80211_local *local = sdata->local; int ret, err; enum nl80211_iftype internal_type = type; bool p2p = false; ASSERT_RTNL(); if (!local->ops->change_interface) return -EBUSY; /* for now, don't support changing while links exist */ if (ieee80211_vif_is_mld(&sdata->vif)) return -EBUSY; switch (sdata->vif.type) { case NL80211_IFTYPE_AP: if (!list_empty(&sdata->u.ap.vlans)) return -EBUSY; break; case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_ADHOC: case NL80211_IFTYPE_OCB: /* * Could maybe also all others here? * Just not sure how that interacts * with the RX/config path e.g. for * mesh. */ break; default: return -EBUSY; } switch (type) { case NL80211_IFTYPE_AP: case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_ADHOC: case NL80211_IFTYPE_OCB: /* * Could probably support everything * but here. */ break; case NL80211_IFTYPE_P2P_CLIENT: p2p = true; internal_type = NL80211_IFTYPE_STATION; break; case NL80211_IFTYPE_P2P_GO: p2p = true; internal_type = NL80211_IFTYPE_AP; break; default: return -EBUSY; } ret = ieee80211_check_concurrent_iface(sdata, internal_type); if (ret) return ret; ieee80211_stop_vif_queues(local, sdata, IEEE80211_QUEUE_STOP_REASON_IFTYPE_CHANGE); /* do_stop will synchronize_rcu() first thing */ ieee80211_do_stop(sdata, false); ieee80211_teardown_sdata(sdata); ieee80211_set_sdata_offload_flags(sdata); ret = drv_change_interface(local, sdata, internal_type, p2p); if (ret) type = ieee80211_vif_type_p2p(&sdata->vif); /* * Ignore return value here, there's not much we can do since * the driver changed the interface type internally already. * The warnings will hopefully make driver authors fix it :-) */ ieee80211_check_queues(sdata, type); ieee80211_setup_sdata(sdata, type); ieee80211_set_vif_encap_ops(sdata); err = ieee80211_do_open(&sdata->wdev, false); WARN(err, "type change: do_open returned %d", err); ieee80211_wake_vif_queues(local, sdata, IEEE80211_QUEUE_STOP_REASON_IFTYPE_CHANGE); return ret; } int ieee80211_if_change_type(struct ieee80211_sub_if_data *sdata, enum nl80211_iftype type) { int ret; ASSERT_RTNL(); if (type == ieee80211_vif_type_p2p(&sdata->vif)) return 0; if (ieee80211_sdata_running(sdata)) { ret = ieee80211_runtime_change_iftype(sdata, type); if (ret) return ret; } else { /* Purge and reset type-dependent state. */ ieee80211_teardown_sdata(sdata); ieee80211_setup_sdata(sdata, type); } /* reset some values that shouldn't be kept across type changes */ if (type == NL80211_IFTYPE_STATION) sdata->u.mgd.use_4addr = false; return 0; } static void ieee80211_assign_perm_addr(struct ieee80211_local *local, u8 *perm_addr, enum nl80211_iftype type) { struct ieee80211_sub_if_data *sdata; u64 mask, start, addr, val, inc; u8 *m; u8 tmp_addr[ETH_ALEN]; int i; lockdep_assert_wiphy(local->hw.wiphy); /* default ... something at least */ memcpy(perm_addr, local->hw.wiphy->perm_addr, ETH_ALEN); if (is_zero_ether_addr(local->hw.wiphy->addr_mask) && local->hw.wiphy->n_addresses <= 1) return; switch (type) { case NL80211_IFTYPE_MONITOR: /* doesn't matter */ break; case NL80211_IFTYPE_AP_VLAN: /* match up with an AP interface */ list_for_each_entry(sdata, &local->interfaces, list) { if (sdata->vif.type != NL80211_IFTYPE_AP) continue; memcpy(perm_addr, sdata->vif.addr, ETH_ALEN); break; } /* keep default if no AP interface present */ break; case NL80211_IFTYPE_P2P_CLIENT: case NL80211_IFTYPE_P2P_GO: if (ieee80211_hw_check(&local->hw, P2P_DEV_ADDR_FOR_INTF)) { list_for_each_entry(sdata, &local->interfaces, list) { if (sdata->vif.type != NL80211_IFTYPE_P2P_DEVICE) continue; if (!ieee80211_sdata_running(sdata)) continue; memcpy(perm_addr, sdata->vif.addr, ETH_ALEN); return; } } fallthrough; default: /* assign a new address if possible -- try n_addresses first */ for (i = 0; i < local->hw.wiphy->n_addresses; i++) { bool used = false; list_for_each_entry(sdata, &local->interfaces, list) { if (ether_addr_equal(local->hw.wiphy->addresses[i].addr, sdata->vif.addr)) { used = true; break; } } if (!used) { memcpy(perm_addr, local->hw.wiphy->addresses[i].addr, ETH_ALEN); break; } } /* try mask if available */ if (is_zero_ether_addr(local->hw.wiphy->addr_mask)) break; m = local->hw.wiphy->addr_mask; mask = ((u64)m[0] << 5*8) | ((u64)m[1] << 4*8) | ((u64)m[2] << 3*8) | ((u64)m[3] << 2*8) | ((u64)m[4] << 1*8) | ((u64)m[5] << 0*8); if (__ffs64(mask) + hweight64(mask) != fls64(mask)) { /* not a contiguous mask ... not handled now! */ pr_info("not contiguous\n"); break; } /* * Pick address of existing interface in case user changed * MAC address manually, default to perm_addr. */ m = local->hw.wiphy->perm_addr; list_for_each_entry(sdata, &local->interfaces, list) { if (sdata->vif.type == NL80211_IFTYPE_MONITOR) continue; m = sdata->vif.addr; break; } start = ((u64)m[0] << 5*8) | ((u64)m[1] << 4*8) | ((u64)m[2] << 3*8) | ((u64)m[3] << 2*8) | ((u64)m[4] << 1*8) | ((u64)m[5] << 0*8); inc = 1ULL<<__ffs64(mask); val = (start & mask); addr = (start & ~mask) | (val & mask); do { bool used = false; tmp_addr[5] = addr >> 0*8; tmp_addr[4] = addr >> 1*8; tmp_addr[3] = addr >> 2*8; tmp_addr[2] = addr >> 3*8; tmp_addr[1] = addr >> 4*8; tmp_addr[0] = addr >> 5*8; val += inc; list_for_each_entry(sdata, &local->interfaces, list) { if (ether_addr_equal(tmp_addr, sdata->vif.addr)) { used = true; break; } } if (!used) { memcpy(perm_addr, tmp_addr, ETH_ALEN); break; } addr = (start & ~mask) | (val & mask); } while (addr != start); break; } } int ieee80211_if_add(struct ieee80211_local *local, const char *name, unsigned char name_assign_type, struct wireless_dev **new_wdev, enum nl80211_iftype type, struct vif_params *params) { struct net_device *ndev = NULL; struct ieee80211_sub_if_data *sdata = NULL; struct txq_info *txqi; int ret, i; ASSERT_RTNL(); lockdep_assert_wiphy(local->hw.wiphy); if (type == NL80211_IFTYPE_P2P_DEVICE || type == NL80211_IFTYPE_NAN) { struct wireless_dev *wdev; sdata = kzalloc(sizeof(*sdata) + local->hw.vif_data_size, GFP_KERNEL); if (!sdata) return -ENOMEM; wdev = &sdata->wdev; sdata->dev = NULL; strscpy(sdata->name, name, IFNAMSIZ); ieee80211_assign_perm_addr(local, wdev->address, type); memcpy(sdata->vif.addr, wdev->address, ETH_ALEN); ether_addr_copy(sdata->vif.bss_conf.addr, sdata->vif.addr); } else { int size = ALIGN(sizeof(*sdata) + local->hw.vif_data_size, sizeof(void *)); int txq_size = 0; if (type != NL80211_IFTYPE_AP_VLAN && (type != NL80211_IFTYPE_MONITOR || (params->flags & MONITOR_FLAG_ACTIVE))) txq_size += sizeof(struct txq_info) + local->hw.txq_data_size; ndev = alloc_netdev_mqs(size + txq_size, name, name_assign_type, ieee80211_if_setup, 1, 1); if (!ndev) return -ENOMEM; dev_net_set(ndev, wiphy_net(local->hw.wiphy)); ndev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats); if (!ndev->tstats) { free_netdev(ndev); return -ENOMEM; } ndev->needed_headroom = local->tx_headroom + 4*6 /* four MAC addresses */ + 2 + 2 + 2 + 2 /* ctl, dur, seq, qos */ + 6 /* mesh */ + 8 /* rfc1042/bridge tunnel */ - ETH_HLEN /* ethernet hard_header_len */ + IEEE80211_ENCRYPT_HEADROOM; ndev->needed_tailroom = IEEE80211_ENCRYPT_TAILROOM; ret = dev_alloc_name(ndev, ndev->name); if (ret < 0) { ieee80211_if_free(ndev); free_netdev(ndev); return ret; } ieee80211_assign_perm_addr(local, ndev->perm_addr, type); if (is_valid_ether_addr(params->macaddr)) eth_hw_addr_set(ndev, params->macaddr); else eth_hw_addr_set(ndev, ndev->perm_addr); SET_NETDEV_DEV(ndev, wiphy_dev(local->hw.wiphy)); /* don't use IEEE80211_DEV_TO_SUB_IF -- it checks too much */ sdata = netdev_priv(ndev); ndev->ieee80211_ptr = &sdata->wdev; memcpy(sdata->vif.addr, ndev->dev_addr, ETH_ALEN); ether_addr_copy(sdata->vif.bss_conf.addr, sdata->vif.addr); memcpy(sdata->name, ndev->name, IFNAMSIZ); if (txq_size) { txqi = netdev_priv(ndev) + size; ieee80211_txq_init(sdata, NULL, txqi, 0); } sdata->dev = ndev; } /* initialise type-independent data */ sdata->wdev.wiphy = local->hw.wiphy; ieee80211_sdata_init(local, sdata); ieee80211_init_frag_cache(&sdata->frags); INIT_LIST_HEAD(&sdata->key_list); wiphy_delayed_work_init(&sdata->dec_tailroom_needed_wk, ieee80211_delayed_tailroom_dec); for (i = 0; i < NUM_NL80211_BANDS; i++) { struct ieee80211_supported_band *sband; sband = local->hw.wiphy->bands[i]; sdata->rc_rateidx_mask[i] = sband ? (1 << sband->n_bitrates) - 1 : 0; if (sband) { __le16 cap; u16 *vht_rate_mask; memcpy(sdata->rc_rateidx_mcs_mask[i], sband->ht_cap.mcs.rx_mask, sizeof(sdata->rc_rateidx_mcs_mask[i])); cap = sband->vht_cap.vht_mcs.rx_mcs_map; vht_rate_mask = sdata->rc_rateidx_vht_mcs_mask[i]; ieee80211_get_vht_mask_from_cap(cap, vht_rate_mask); } else { memset(sdata->rc_rateidx_mcs_mask[i], 0, sizeof(sdata->rc_rateidx_mcs_mask[i])); memset(sdata->rc_rateidx_vht_mcs_mask[i], 0, sizeof(sdata->rc_rateidx_vht_mcs_mask[i])); } } ieee80211_set_default_queues(sdata); sdata->deflink.ap_power_level = IEEE80211_UNSET_POWER_LEVEL; sdata->deflink.user_power_level = local->user_power_level; /* setup type-dependent data */ ieee80211_setup_sdata(sdata, type); if (ndev) { ndev->ieee80211_ptr->use_4addr = params->use_4addr; if (type == NL80211_IFTYPE_STATION) sdata->u.mgd.use_4addr = params->use_4addr; ndev->features |= local->hw.netdev_features; ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE; ndev->hw_features |= ndev->features & MAC80211_SUPPORTED_FEATURES_TX; sdata->vif.netdev_features = local->hw.netdev_features; netdev_set_default_ethtool_ops(ndev, &ieee80211_ethtool_ops); /* MTU range is normally 256 - 2304, where the upper limit is * the maximum MSDU size. Monitor interfaces send and receive * MPDU and A-MSDU frames which may be much larger so we do * not impose an upper limit in that case. */ ndev->min_mtu = 256; if (type == NL80211_IFTYPE_MONITOR) ndev->max_mtu = 0; else ndev->max_mtu = local->hw.max_mtu; ret = cfg80211_register_netdevice(ndev); if (ret) { free_netdev(ndev); return ret; } } mutex_lock(&local->iflist_mtx); list_add_tail_rcu(&sdata->list, &local->interfaces); mutex_unlock(&local->iflist_mtx); if (new_wdev) *new_wdev = &sdata->wdev; return 0; } void ieee80211_if_remove(struct ieee80211_sub_if_data *sdata) { ASSERT_RTNL(); lockdep_assert_wiphy(sdata->local->hw.wiphy); mutex_lock(&sdata->local->iflist_mtx); list_del_rcu(&sdata->list); mutex_unlock(&sdata->local->iflist_mtx); if (sdata->vif.txq) ieee80211_txq_purge(sdata->local, to_txq_info(sdata->vif.txq)); synchronize_rcu(); cfg80211_unregister_wdev(&sdata->wdev); if (!sdata->dev) { ieee80211_teardown_sdata(sdata); kfree(sdata); } } void ieee80211_sdata_stop(struct ieee80211_sub_if_data *sdata) { if (WARN_ON_ONCE(!test_bit(SDATA_STATE_RUNNING, &sdata->state))) return; ieee80211_do_stop(sdata, true); } void ieee80211_remove_interfaces(struct ieee80211_local *local) { struct ieee80211_sub_if_data *sdata, *tmp; LIST_HEAD(unreg_list); ASSERT_RTNL(); /* Before destroying the interfaces, make sure they're all stopped so * that the hardware is stopped. Otherwise, the driver might still be * iterating the interfaces during the shutdown, e.g. from a worker * or from RX processing or similar, and if it does so (using atomic * iteration) while we're manipulating the list, the iteration will * crash. * * After this, the hardware should be stopped and the driver should * have stopped all of its activities, so that we can do RCU-unaware * manipulations of the interface list below. */ cfg80211_shutdown_all_interfaces(local->hw.wiphy); wiphy_lock(local->hw.wiphy); WARN(local->open_count, "%s: open count remains %d\n", wiphy_name(local->hw.wiphy), local->open_count); mutex_lock(&local->iflist_mtx); list_splice_init(&local->interfaces, &unreg_list); mutex_unlock(&local->iflist_mtx); list_for_each_entry_safe(sdata, tmp, &unreg_list, list) { bool netdev = sdata->dev; /* * Remove IP addresses explicitly, since the notifier will * skip the callbacks if wdev->registered is false, since * we can't acquire the wiphy_lock() again there if already * inside this locked section. */ sdata->vif.cfg.arp_addr_cnt = 0; if (sdata->vif.type == NL80211_IFTYPE_STATION && sdata->u.mgd.associated) ieee80211_vif_cfg_change_notify(sdata, BSS_CHANGED_ARP_FILTER); list_del(&sdata->list); cfg80211_unregister_wdev(&sdata->wdev); if (!netdev) kfree(sdata); } wiphy_unlock(local->hw.wiphy); } static int netdev_notify(struct notifier_block *nb, unsigned long state, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr); struct ieee80211_sub_if_data *sdata; if (state != NETDEV_CHANGENAME) return NOTIFY_DONE; if (!dev->ieee80211_ptr || !dev->ieee80211_ptr->wiphy) return NOTIFY_DONE; if (dev->ieee80211_ptr->wiphy->privid != mac80211_wiphy_privid) return NOTIFY_DONE; sdata = IEEE80211_DEV_TO_SUB_IF(dev); memcpy(sdata->name, dev->name, IFNAMSIZ); ieee80211_debugfs_rename_netdev(sdata); return NOTIFY_OK; } static struct notifier_block mac80211_netdev_notifier = { .notifier_call = netdev_notify, }; int ieee80211_iface_init(void) { return register_netdevice_notifier(&mac80211_netdev_notifier); } void ieee80211_iface_exit(void) { unregister_netdevice_notifier(&mac80211_netdev_notifier); } void ieee80211_vif_inc_num_mcast(struct ieee80211_sub_if_data *sdata) { if (sdata->vif.type == NL80211_IFTYPE_AP) atomic_inc(&sdata->u.ap.num_mcast_sta); else if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN) atomic_inc(&sdata->u.vlan.num_mcast_sta); } void ieee80211_vif_dec_num_mcast(struct ieee80211_sub_if_data *sdata) { if (sdata->vif.type == NL80211_IFTYPE_AP) atomic_dec(&sdata->u.ap.num_mcast_sta); else if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN) atomic_dec(&sdata->u.vlan.num_mcast_sta); } |
1 16 66 17 4 30 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef __UDF_DECL_H #define __UDF_DECL_H #define pr_fmt(fmt) "UDF-fs: " fmt #include "ecma_167.h" #include "osta_udf.h" #include <linux/fs.h> #include <linux/types.h> #include <linux/buffer_head.h> #include <linux/udf_fs_i.h> #include "udf_sb.h" #include "udfend.h" #include "udf_i.h" #define UDF_DEFAULT_PREALLOC_BLOCKS 8 extern __printf(3, 4) void _udf_err(struct super_block *sb, const char *function, const char *fmt, ...); #define udf_err(sb, fmt, ...) \ _udf_err(sb, __func__, fmt, ##__VA_ARGS__) extern __printf(3, 4) void _udf_warn(struct super_block *sb, const char *function, const char *fmt, ...); #define udf_warn(sb, fmt, ...) \ _udf_warn(sb, __func__, fmt, ##__VA_ARGS__) #define udf_info(fmt, ...) \ pr_info("INFO " fmt, ##__VA_ARGS__) #define udf_debug(fmt, ...) \ pr_debug("%s:%d:%s: " fmt, __FILE__, __LINE__, __func__, ##__VA_ARGS__) #define UDF_EXTENT_LENGTH_MASK 0x3FFFFFFF #define UDF_EXTENT_FLAG_MASK 0xC0000000 #define UDF_INVALID_ID ((uint32_t)-1) #define UDF_NAME_PAD 4 #define UDF_NAME_LEN 254 #define UDF_NAME_LEN_CS0 255 static inline size_t udf_file_entry_alloc_offset(struct inode *inode) { struct udf_inode_info *iinfo = UDF_I(inode); if (iinfo->i_use) return sizeof(struct unallocSpaceEntry); else if (iinfo->i_efe) return sizeof(struct extendedFileEntry) + iinfo->i_lenEAttr; else return sizeof(struct fileEntry) + iinfo->i_lenEAttr; } static inline size_t udf_ext0_offset(struct inode *inode) { if (UDF_I(inode)->i_alloc_type == ICBTAG_FLAG_AD_IN_ICB) return udf_file_entry_alloc_offset(inode); else return 0; } /* computes tag checksum */ u8 udf_tag_checksum(const struct tag *t); typedef uint32_t udf_pblk_t; struct dentry; struct inode; struct task_struct; struct buffer_head; struct super_block; extern const struct export_operations udf_export_ops; extern const struct inode_operations udf_dir_inode_operations; extern const struct file_operations udf_dir_operations; extern const struct inode_operations udf_file_inode_operations; extern const struct file_operations udf_file_operations; extern const struct inode_operations udf_symlink_inode_operations; extern const struct address_space_operations udf_aops; extern const struct address_space_operations udf_symlink_aops; struct udf_fileident_iter { struct inode *dir; /* Directory we are working with */ loff_t pos; /* Logical position in a dir */ struct buffer_head *bh[2]; /* Buffer containing 'pos' and possibly * next buffer if entry straddles * blocks */ struct kernel_lb_addr eloc; /* Start of extent containing 'pos' */ uint32_t elen; /* Length of extent containing 'pos' */ sector_t loffset; /* Block offset of 'pos' within above * extent */ struct extent_position epos; /* Position after the above extent */ struct fileIdentDesc fi; /* Copied directory entry */ uint8_t *name; /* Pointer to entry name */ uint8_t *namebuf; /* Storage for entry name in case * the name is split between two blocks */ }; struct udf_vds_record { uint32_t block; uint32_t volDescSeqNum; }; struct generic_desc { struct tag descTag; __le32 volDescSeqNum; }; /* super.c */ static inline void udf_updated_lvid(struct super_block *sb) { struct buffer_head *bh = UDF_SB(sb)->s_lvid_bh; BUG_ON(!bh); WARN_ON_ONCE(((struct logicalVolIntegrityDesc *) bh->b_data)->integrityType != cpu_to_le32(LVID_INTEGRITY_TYPE_OPEN)); UDF_SB(sb)->s_lvid_dirty = 1; } extern u64 lvid_get_unique_id(struct super_block *sb); struct inode *udf_find_metadata_inode_efe(struct super_block *sb, u32 meta_file_loc, u32 partition_num); /* namei.c */ static inline unsigned int udf_dir_entry_len(struct fileIdentDesc *cfi) { return ALIGN(sizeof(struct fileIdentDesc) + le16_to_cpu(cfi->lengthOfImpUse) + cfi->lengthFileIdent, UDF_NAME_PAD); } /* file.c */ extern long udf_ioctl(struct file *, unsigned int, unsigned long); /* inode.c */ extern struct inode *__udf_iget(struct super_block *, struct kernel_lb_addr *, bool hidden_inode); static inline struct inode *udf_iget_special(struct super_block *sb, struct kernel_lb_addr *ino) { return __udf_iget(sb, ino, true); } static inline struct inode *udf_iget(struct super_block *sb, struct kernel_lb_addr *ino) { return __udf_iget(sb, ino, false); } extern int udf_expand_file_adinicb(struct inode *); extern struct buffer_head *udf_bread(struct inode *inode, udf_pblk_t block, int create, int *err); extern int udf_setsize(struct inode *, loff_t); extern void udf_evict_inode(struct inode *); extern int udf_write_inode(struct inode *, struct writeback_control *wbc); extern int8_t inode_bmap(struct inode *, sector_t, struct extent_position *, struct kernel_lb_addr *, uint32_t *, sector_t *); int udf_get_block(struct inode *, sector_t, struct buffer_head *, int); extern int udf_setup_indirect_aext(struct inode *inode, udf_pblk_t block, struct extent_position *epos); extern int __udf_add_aext(struct inode *inode, struct extent_position *epos, struct kernel_lb_addr *eloc, uint32_t elen, int inc); extern int udf_add_aext(struct inode *, struct extent_position *, struct kernel_lb_addr *, uint32_t, int); extern void udf_write_aext(struct inode *, struct extent_position *, struct kernel_lb_addr *, uint32_t, int); extern int8_t udf_delete_aext(struct inode *, struct extent_position); extern int8_t udf_next_aext(struct inode *, struct extent_position *, struct kernel_lb_addr *, uint32_t *, int); extern int8_t udf_current_aext(struct inode *, struct extent_position *, struct kernel_lb_addr *, uint32_t *, int); extern void udf_update_extra_perms(struct inode *inode, umode_t mode); /* misc.c */ extern struct genericFormat *udf_add_extendedattr(struct inode *, uint32_t, uint32_t, uint8_t); extern struct genericFormat *udf_get_extendedattr(struct inode *, uint32_t, uint8_t); extern struct buffer_head *udf_read_tagged(struct super_block *, uint32_t, uint32_t, uint16_t *); extern struct buffer_head *udf_read_ptagged(struct super_block *, struct kernel_lb_addr *, uint32_t, uint16_t *); extern void udf_update_tag(char *, int); extern void udf_new_tag(char *, uint16_t, uint16_t, uint16_t, uint32_t, int); /* lowlevel.c */ extern unsigned int udf_get_last_session(struct super_block *); udf_pblk_t udf_get_last_block(struct super_block *); /* partition.c */ extern uint32_t udf_get_pblock(struct super_block *, uint32_t, uint16_t, uint32_t); extern uint32_t udf_get_pblock_virt15(struct super_block *, uint32_t, uint16_t, uint32_t); extern uint32_t udf_get_pblock_virt20(struct super_block *, uint32_t, uint16_t, uint32_t); extern uint32_t udf_get_pblock_spar15(struct super_block *, uint32_t, uint16_t, uint32_t); extern uint32_t udf_get_pblock_meta25(struct super_block *, uint32_t, uint16_t, uint32_t); extern int udf_relocate_blocks(struct super_block *, long, long *); static inline uint32_t udf_get_lb_pblock(struct super_block *sb, struct kernel_lb_addr *loc, uint32_t offset) { return udf_get_pblock(sb, loc->logicalBlockNum, loc->partitionReferenceNum, offset); } /* unicode.c */ extern int udf_get_filename(struct super_block *, const uint8_t *, int, uint8_t *, int); extern int udf_put_filename(struct super_block *, const uint8_t *, int, uint8_t *, int); extern int udf_dstrCS0toChar(struct super_block *, uint8_t *, int, const uint8_t *, int); /* ialloc.c */ extern void udf_free_inode(struct inode *); extern struct inode *udf_new_inode(struct inode *, umode_t); /* truncate.c */ extern void udf_truncate_tail_extent(struct inode *); extern void udf_discard_prealloc(struct inode *); extern int udf_truncate_extents(struct inode *); /* balloc.c */ extern void udf_free_blocks(struct super_block *, struct inode *, struct kernel_lb_addr *, uint32_t, uint32_t); extern int udf_prealloc_blocks(struct super_block *, struct inode *, uint16_t, uint32_t, uint32_t); extern udf_pblk_t udf_new_block(struct super_block *sb, struct inode *inode, uint16_t partition, uint32_t goal, int *err); /* directory.c */ int udf_fiiter_init(struct udf_fileident_iter *iter, struct inode *dir, loff_t pos); int udf_fiiter_advance(struct udf_fileident_iter *iter); void udf_fiiter_release(struct udf_fileident_iter *iter); void udf_fiiter_write_fi(struct udf_fileident_iter *iter, uint8_t *impuse); void udf_fiiter_update_elen(struct udf_fileident_iter *iter, uint32_t new_elen); int udf_fiiter_append_blk(struct udf_fileident_iter *iter); extern struct long_ad *udf_get_filelongad(uint8_t *, int, uint32_t *, int); extern struct short_ad *udf_get_fileshortad(uint8_t *, int, uint32_t *, int); /* udftime.c */ extern void udf_disk_stamp_to_time(struct timespec64 *dest, struct timestamp src); extern void udf_time_to_disk_stamp(struct timestamp *dest, struct timespec64 src); #endif /* __UDF_DECL_H */ |
1 1 1 1 1 1 1 1 1 2 2 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 | // SPDX-License-Identifier: GPL-2.0-only /* * net/sched/sch_qfq.c Quick Fair Queueing Plus Scheduler. * * Copyright (c) 2009 Fabio Checconi, Luigi Rizzo, and Paolo Valente. * Copyright (c) 2012 Paolo Valente. */ #include <linux/module.h> #include <linux/init.h> #include <linux/bitops.h> #include <linux/errno.h> #include <linux/netdevice.h> #include <linux/pkt_sched.h> #include <net/sch_generic.h> #include <net/pkt_sched.h> #include <net/pkt_cls.h> /* Quick Fair Queueing Plus ======================== Sources: [1] Paolo Valente, "Reducing the Execution Time of Fair-Queueing Schedulers." http://algo.ing.unimo.it/people/paolo/agg-sched/agg-sched.pdf Sources for QFQ: [2] Fabio Checconi, Luigi Rizzo, and Paolo Valente: "QFQ: Efficient Packet Scheduling with Tight Bandwidth Distribution Guarantees." See also: http://retis.sssup.it/~fabio/linux/qfq/ */ /* QFQ+ divides classes into aggregates of at most MAX_AGG_CLASSES classes. Each aggregate is timestamped with a virtual start time S and a virtual finish time F, and scheduled according to its timestamps. S and F are computed as a function of a system virtual time function V. The classes within each aggregate are instead scheduled with DRR. To speed up operations, QFQ+ divides also aggregates into a limited number of groups. Which group a class belongs to depends on the ratio between the maximum packet length for the class and the weight of the class. Groups have their own S and F. In the end, QFQ+ schedules groups, then aggregates within groups, then classes within aggregates. See [1] and [2] for a full description. Virtual time computations. S, F and V are all computed in fixed point arithmetic with FRAC_BITS decimal bits. QFQ_MAX_INDEX is the maximum index allowed for a group. We need one bit per index. QFQ_MAX_WSHIFT is the maximum power of two supported as a weight. The layout of the bits is as below: [ MTU_SHIFT ][ FRAC_BITS ] [ MAX_INDEX ][ MIN_SLOT_SHIFT ] ^.__grp->index = 0 *.__grp->slot_shift where MIN_SLOT_SHIFT is derived by difference from the others. The max group index corresponds to Lmax/w_min, where Lmax=1<<MTU_SHIFT, w_min = 1 . From this, and knowing how many groups (MAX_INDEX) we want, we can derive the shift corresponding to each group. Because we often need to compute F = S + len/w_i and V = V + len/wsum instead of storing w_i store the value inv_w = (1<<FRAC_BITS)/w_i so we can do F = S + len * inv_w * wsum. We use W_TOT in the formulas so we can easily move between static and adaptive weight sum. The per-scheduler-instance data contain all the data structures for the scheduler: bitmaps and bucket lists. */ /* * Maximum number of consecutive slots occupied by backlogged classes * inside a group. */ #define QFQ_MAX_SLOTS 32 /* * Shifts used for aggregate<->group mapping. We allow class weights that are * in the range [1, 2^MAX_WSHIFT], and we try to map each aggregate i to the * group with the smallest index that can support the L_i / r_i configured * for the classes in the aggregate. * * grp->index is the index of the group; and grp->slot_shift * is the shift for the corresponding (scaled) sigma_i. */ #define QFQ_MAX_INDEX 24 #define QFQ_MAX_WSHIFT 10 #define QFQ_MAX_WEIGHT (1<<QFQ_MAX_WSHIFT) /* see qfq_slot_insert */ #define QFQ_MAX_WSUM (64*QFQ_MAX_WEIGHT) #define FRAC_BITS 30 /* fixed point arithmetic */ #define ONE_FP (1UL << FRAC_BITS) #define QFQ_MTU_SHIFT 16 /* to support TSO/GSO */ #define QFQ_MIN_LMAX 512 /* see qfq_slot_insert */ #define QFQ_MAX_LMAX (1UL << QFQ_MTU_SHIFT) #define QFQ_MAX_AGG_CLASSES 8 /* max num classes per aggregate allowed */ /* * Possible group states. These values are used as indexes for the bitmaps * array of struct qfq_queue. */ enum qfq_state { ER, IR, EB, IB, QFQ_MAX_STATE }; struct qfq_group; struct qfq_aggregate; struct qfq_class { struct Qdisc_class_common common; struct gnet_stats_basic_sync bstats; struct gnet_stats_queue qstats; struct net_rate_estimator __rcu *rate_est; struct Qdisc *qdisc; struct list_head alist; /* Link for active-classes list. */ struct qfq_aggregate *agg; /* Parent aggregate. */ int deficit; /* DRR deficit counter. */ }; struct qfq_aggregate { struct hlist_node next; /* Link for the slot list. */ u64 S, F; /* flow timestamps (exact) */ /* group we belong to. In principle we would need the index, * which is log_2(lmax/weight), but we never reference it * directly, only the group. */ struct qfq_group *grp; /* these are copied from the flowset. */ u32 class_weight; /* Weight of each class in this aggregate. */ /* Max pkt size for the classes in this aggregate, DRR quantum. */ int lmax; u32 inv_w; /* ONE_FP/(sum of weights of classes in aggr.). */ u32 budgetmax; /* Max budget for this aggregate. */ u32 initial_budget, budget; /* Initial and current budget. */ int num_classes; /* Number of classes in this aggr. */ struct list_head active; /* DRR queue of active classes. */ struct hlist_node nonfull_next; /* See nonfull_aggs in qfq_sched. */ }; struct qfq_group { u64 S, F; /* group timestamps (approx). */ unsigned int slot_shift; /* Slot shift. */ unsigned int index; /* Group index. */ unsigned int front; /* Index of the front slot. */ unsigned long full_slots; /* non-empty slots */ /* Array of RR lists of active aggregates. */ struct hlist_head slots[QFQ_MAX_SLOTS]; }; struct qfq_sched { struct tcf_proto __rcu *filter_list; struct tcf_block *block; struct Qdisc_class_hash clhash; u64 oldV, V; /* Precise virtual times. */ struct qfq_aggregate *in_serv_agg; /* Aggregate being served. */ u32 wsum; /* weight sum */ u32 iwsum; /* inverse weight sum */ unsigned long bitmaps[QFQ_MAX_STATE]; /* Group bitmaps. */ struct qfq_group groups[QFQ_MAX_INDEX + 1]; /* The groups. */ u32 min_slot_shift; /* Index of the group-0 bit in the bitmaps. */ u32 max_agg_classes; /* Max number of classes per aggr. */ struct hlist_head nonfull_aggs; /* Aggs with room for more classes. */ }; /* * Possible reasons why the timestamps of an aggregate are updated * enqueue: the aggregate switches from idle to active and must scheduled * for service * requeue: the aggregate finishes its budget, so it stops being served and * must be rescheduled for service */ enum update_reason {enqueue, requeue}; static struct qfq_class *qfq_find_class(struct Qdisc *sch, u32 classid) { struct qfq_sched *q = qdisc_priv(sch); struct Qdisc_class_common *clc; clc = qdisc_class_find(&q->clhash, classid); if (clc == NULL) return NULL; return container_of(clc, struct qfq_class, common); } static const struct netlink_range_validation lmax_range = { .min = QFQ_MIN_LMAX, .max = QFQ_MAX_LMAX, }; static const struct nla_policy qfq_policy[TCA_QFQ_MAX + 1] = { [TCA_QFQ_WEIGHT] = NLA_POLICY_RANGE(NLA_U32, 1, QFQ_MAX_WEIGHT), [TCA_QFQ_LMAX] = NLA_POLICY_FULL_RANGE(NLA_U32, &lmax_range), }; /* * Calculate a flow index, given its weight and maximum packet length. * index = log_2(maxlen/weight) but we need to apply the scaling. * This is used only once at flow creation. */ static int qfq_calc_index(u32 inv_w, unsigned int maxlen, u32 min_slot_shift) { u64 slot_size = (u64)maxlen * inv_w; unsigned long size_map; int index = 0; size_map = slot_size >> min_slot_shift; if (!size_map) goto out; index = __fls(size_map) + 1; /* basically a log_2 */ index -= !(slot_size - (1ULL << (index + min_slot_shift - 1))); if (index < 0) index = 0; out: pr_debug("qfq calc_index: W = %lu, L = %u, I = %d\n", (unsigned long) ONE_FP/inv_w, maxlen, index); return index; } static void qfq_deactivate_agg(struct qfq_sched *, struct qfq_aggregate *); static void qfq_activate_agg(struct qfq_sched *, struct qfq_aggregate *, enum update_reason); static void qfq_init_agg(struct qfq_sched *q, struct qfq_aggregate *agg, u32 lmax, u32 weight) { INIT_LIST_HEAD(&agg->active); hlist_add_head(&agg->nonfull_next, &q->nonfull_aggs); agg->lmax = lmax; agg->class_weight = weight; } static struct qfq_aggregate *qfq_find_agg(struct qfq_sched *q, u32 lmax, u32 weight) { struct qfq_aggregate *agg; hlist_for_each_entry(agg, &q->nonfull_aggs, nonfull_next) if (agg->lmax == lmax && agg->class_weight == weight) return agg; return NULL; } /* Update aggregate as a function of the new number of classes. */ static void qfq_update_agg(struct qfq_sched *q, struct qfq_aggregate *agg, int new_num_classes) { u32 new_agg_weight; if (new_num_classes == q->max_agg_classes) hlist_del_init(&agg->nonfull_next); if (agg->num_classes > new_num_classes && new_num_classes == q->max_agg_classes - 1) /* agg no more full */ hlist_add_head(&agg->nonfull_next, &q->nonfull_aggs); /* The next assignment may let * agg->initial_budget > agg->budgetmax * hold, we will take it into account in charge_actual_service(). */ agg->budgetmax = new_num_classes * agg->lmax; new_agg_weight = agg->class_weight * new_num_classes; agg->inv_w = ONE_FP/new_agg_weight; if (agg->grp == NULL) { int i = qfq_calc_index(agg->inv_w, agg->budgetmax, q->min_slot_shift); agg->grp = &q->groups[i]; } q->wsum += (int) agg->class_weight * (new_num_classes - agg->num_classes); q->iwsum = ONE_FP / q->wsum; agg->num_classes = new_num_classes; } /* Add class to aggregate. */ static void qfq_add_to_agg(struct qfq_sched *q, struct qfq_aggregate *agg, struct qfq_class *cl) { cl->agg = agg; qfq_update_agg(q, agg, agg->num_classes+1); if (cl->qdisc->q.qlen > 0) { /* adding an active class */ list_add_tail(&cl->alist, &agg->active); if (list_first_entry(&agg->active, struct qfq_class, alist) == cl && q->in_serv_agg != agg) /* agg was inactive */ qfq_activate_agg(q, agg, enqueue); /* schedule agg */ } } static struct qfq_aggregate *qfq_choose_next_agg(struct qfq_sched *); static void qfq_destroy_agg(struct qfq_sched *q, struct qfq_aggregate *agg) { hlist_del_init(&agg->nonfull_next); q->wsum -= agg->class_weight; if (q->wsum != 0) q->iwsum = ONE_FP / q->wsum; if (q->in_serv_agg == agg) q->in_serv_agg = qfq_choose_next_agg(q); kfree(agg); } /* Deschedule class from within its parent aggregate. */ static void qfq_deactivate_class(struct qfq_sched *q, struct qfq_class *cl) { struct qfq_aggregate *agg = cl->agg; list_del(&cl->alist); /* remove from RR queue of the aggregate */ if (list_empty(&agg->active)) /* agg is now inactive */ qfq_deactivate_agg(q, agg); } /* Remove class from its parent aggregate. */ static void qfq_rm_from_agg(struct qfq_sched *q, struct qfq_class *cl) { struct qfq_aggregate *agg = cl->agg; cl->agg = NULL; if (agg->num_classes == 1) { /* agg being emptied, destroy it */ qfq_destroy_agg(q, agg); return; } qfq_update_agg(q, agg, agg->num_classes-1); } /* Deschedule class and remove it from its parent aggregate. */ static void qfq_deact_rm_from_agg(struct qfq_sched *q, struct qfq_class *cl) { if (cl->qdisc->q.qlen > 0) /* class is active */ qfq_deactivate_class(q, cl); qfq_rm_from_agg(q, cl); } /* Move class to a new aggregate, matching the new class weight and/or lmax */ static int qfq_change_agg(struct Qdisc *sch, struct qfq_class *cl, u32 weight, u32 lmax) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_aggregate *new_agg; /* 'lmax' can range from [QFQ_MIN_LMAX, pktlen + stab overhead] */ if (lmax > QFQ_MAX_LMAX) return -EINVAL; new_agg = qfq_find_agg(q, lmax, weight); if (new_agg == NULL) { /* create new aggregate */ new_agg = kzalloc(sizeof(*new_agg), GFP_ATOMIC); if (new_agg == NULL) return -ENOBUFS; qfq_init_agg(q, new_agg, lmax, weight); } qfq_deact_rm_from_agg(q, cl); qfq_add_to_agg(q, new_agg, cl); return 0; } static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, struct nlattr **tca, unsigned long *arg, struct netlink_ext_ack *extack) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl = (struct qfq_class *)*arg; bool existing = false; struct nlattr *tb[TCA_QFQ_MAX + 1]; struct qfq_aggregate *new_agg = NULL; u32 weight, lmax, inv_w; int err; int delta_w; if (NL_REQ_ATTR_CHECK(extack, NULL, tca, TCA_OPTIONS)) { NL_SET_ERR_MSG_MOD(extack, "missing options"); return -EINVAL; } err = nla_parse_nested_deprecated(tb, TCA_QFQ_MAX, tca[TCA_OPTIONS], qfq_policy, extack); if (err < 0) return err; if (tb[TCA_QFQ_WEIGHT]) weight = nla_get_u32(tb[TCA_QFQ_WEIGHT]); else weight = 1; if (tb[TCA_QFQ_LMAX]) { lmax = nla_get_u32(tb[TCA_QFQ_LMAX]); } else { /* MTU size is user controlled */ lmax = psched_mtu(qdisc_dev(sch)); if (lmax < QFQ_MIN_LMAX || lmax > QFQ_MAX_LMAX) { NL_SET_ERR_MSG_MOD(extack, "MTU size out of bounds for qfq"); return -EINVAL; } } inv_w = ONE_FP / weight; weight = ONE_FP / inv_w; if (cl != NULL && lmax == cl->agg->lmax && weight == cl->agg->class_weight) return 0; /* nothing to change */ delta_w = weight - (cl ? cl->agg->class_weight : 0); if (q->wsum + delta_w > QFQ_MAX_WSUM) { NL_SET_ERR_MSG_FMT_MOD(extack, "total weight out of range (%d + %u)\n", delta_w, q->wsum); return -EINVAL; } if (cl != NULL) { /* modify existing class */ if (tca[TCA_RATE]) { err = gen_replace_estimator(&cl->bstats, NULL, &cl->rate_est, NULL, true, tca[TCA_RATE]); if (err) return err; } existing = true; goto set_change_agg; } /* create and init new class */ cl = kzalloc(sizeof(struct qfq_class), GFP_KERNEL); if (cl == NULL) return -ENOBUFS; gnet_stats_basic_sync_init(&cl->bstats); cl->common.classid = classid; cl->deficit = lmax; cl->qdisc = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops, classid, NULL); if (cl->qdisc == NULL) cl->qdisc = &noop_qdisc; if (tca[TCA_RATE]) { err = gen_new_estimator(&cl->bstats, NULL, &cl->rate_est, NULL, true, tca[TCA_RATE]); if (err) goto destroy_class; } if (cl->qdisc != &noop_qdisc) qdisc_hash_add(cl->qdisc, true); set_change_agg: sch_tree_lock(sch); new_agg = qfq_find_agg(q, lmax, weight); if (new_agg == NULL) { /* create new aggregate */ sch_tree_unlock(sch); new_agg = kzalloc(sizeof(*new_agg), GFP_KERNEL); if (new_agg == NULL) { err = -ENOBUFS; gen_kill_estimator(&cl->rate_est); goto destroy_class; } sch_tree_lock(sch); qfq_init_agg(q, new_agg, lmax, weight); } if (existing) qfq_deact_rm_from_agg(q, cl); else qdisc_class_hash_insert(&q->clhash, &cl->common); qfq_add_to_agg(q, new_agg, cl); sch_tree_unlock(sch); qdisc_class_hash_grow(sch, &q->clhash); *arg = (unsigned long)cl; return 0; destroy_class: qdisc_put(cl->qdisc); kfree(cl); return err; } static void qfq_destroy_class(struct Qdisc *sch, struct qfq_class *cl) { struct qfq_sched *q = qdisc_priv(sch); qfq_rm_from_agg(q, cl); gen_kill_estimator(&cl->rate_est); qdisc_put(cl->qdisc); kfree(cl); } static int qfq_delete_class(struct Qdisc *sch, unsigned long arg, struct netlink_ext_ack *extack) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl = (struct qfq_class *)arg; if (qdisc_class_in_use(&cl->common)) { NL_SET_ERR_MSG_MOD(extack, "QFQ class in use"); return -EBUSY; } sch_tree_lock(sch); qdisc_purge_queue(cl->qdisc); qdisc_class_hash_remove(&q->clhash, &cl->common); sch_tree_unlock(sch); qfq_destroy_class(sch, cl); return 0; } static unsigned long qfq_search_class(struct Qdisc *sch, u32 classid) { return (unsigned long)qfq_find_class(sch, classid); } static struct tcf_block *qfq_tcf_block(struct Qdisc *sch, unsigned long cl, struct netlink_ext_ack *extack) { struct qfq_sched *q = qdisc_priv(sch); if (cl) return NULL; return q->block; } static unsigned long qfq_bind_tcf(struct Qdisc *sch, unsigned long parent, u32 classid) { struct qfq_class *cl = qfq_find_class(sch, classid); if (cl) qdisc_class_get(&cl->common); return (unsigned long)cl; } static void qfq_unbind_tcf(struct Qdisc *sch, unsigned long arg) { struct qfq_class *cl = (struct qfq_class *)arg; qdisc_class_put(&cl->common); } static int qfq_graft_class(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, struct Qdisc **old, struct netlink_ext_ack *extack) { struct qfq_class *cl = (struct qfq_class *)arg; if (new == NULL) { new = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops, cl->common.classid, NULL); if (new == NULL) new = &noop_qdisc; } *old = qdisc_replace(sch, new, &cl->qdisc); return 0; } static struct Qdisc *qfq_class_leaf(struct Qdisc *sch, unsigned long arg) { struct qfq_class *cl = (struct qfq_class *)arg; return cl->qdisc; } static int qfq_dump_class(struct Qdisc *sch, unsigned long arg, struct sk_buff *skb, struct tcmsg *tcm) { struct qfq_class *cl = (struct qfq_class *)arg; struct nlattr *nest; tcm->tcm_parent = TC_H_ROOT; tcm->tcm_handle = cl->common.classid; tcm->tcm_info = cl->qdisc->handle; nest = nla_nest_start_noflag(skb, TCA_OPTIONS); if (nest == NULL) goto nla_put_failure; if (nla_put_u32(skb, TCA_QFQ_WEIGHT, cl->agg->class_weight) || nla_put_u32(skb, TCA_QFQ_LMAX, cl->agg->lmax)) goto nla_put_failure; return nla_nest_end(skb, nest); nla_put_failure: nla_nest_cancel(skb, nest); return -EMSGSIZE; } static int qfq_dump_class_stats(struct Qdisc *sch, unsigned long arg, struct gnet_dump *d) { struct qfq_class *cl = (struct qfq_class *)arg; struct tc_qfq_stats xstats; memset(&xstats, 0, sizeof(xstats)); xstats.weight = cl->agg->class_weight; xstats.lmax = cl->agg->lmax; if (gnet_stats_copy_basic(d, NULL, &cl->bstats, true) < 0 || gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 || qdisc_qstats_copy(d, cl->qdisc) < 0) return -1; return gnet_stats_copy_app(d, &xstats, sizeof(xstats)); } static void qfq_walk(struct Qdisc *sch, struct qdisc_walker *arg) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl; unsigned int i; if (arg->stop) return; for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode) { if (!tc_qdisc_stats_dump(sch, (unsigned long)cl, arg)) return; } } } static struct qfq_class *qfq_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl; struct tcf_result res; struct tcf_proto *fl; int result; if (TC_H_MAJ(skb->priority ^ sch->handle) == 0) { pr_debug("qfq_classify: found %d\n", skb->priority); cl = qfq_find_class(sch, skb->priority); if (cl != NULL) return cl; } *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; fl = rcu_dereference_bh(q->filter_list); result = tcf_classify(skb, NULL, fl, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { case TC_ACT_QUEUED: case TC_ACT_STOLEN: case TC_ACT_TRAP: *qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN; fallthrough; case TC_ACT_SHOT: return NULL; } #endif cl = (struct qfq_class *)res.class; if (cl == NULL) cl = qfq_find_class(sch, res.classid); return cl; } return NULL; } /* Generic comparison function, handling wraparound. */ static inline int qfq_gt(u64 a, u64 b) { return (s64)(a - b) > 0; } /* Round a precise timestamp to its slotted value. */ static inline u64 qfq_round_down(u64 ts, unsigned int shift) { return ts & ~((1ULL << shift) - 1); } /* return the pointer to the group with lowest index in the bitmap */ static inline struct qfq_group *qfq_ffs(struct qfq_sched *q, unsigned long bitmap) { int index = __ffs(bitmap); return &q->groups[index]; } /* Calculate a mask to mimic what would be ffs_from(). */ static inline unsigned long mask_from(unsigned long bitmap, int from) { return bitmap & ~((1UL << from) - 1); } /* * The state computation relies on ER=0, IR=1, EB=2, IB=3 * First compute eligibility comparing grp->S, q->V, * then check if someone is blocking us and possibly add EB */ static int qfq_calc_state(struct qfq_sched *q, const struct qfq_group *grp) { /* if S > V we are not eligible */ unsigned int state = qfq_gt(grp->S, q->V); unsigned long mask = mask_from(q->bitmaps[ER], grp->index); struct qfq_group *next; if (mask) { next = qfq_ffs(q, mask); if (qfq_gt(grp->F, next->F)) state |= EB; } return state; } /* * In principle * q->bitmaps[dst] |= q->bitmaps[src] & mask; * q->bitmaps[src] &= ~mask; * but we should make sure that src != dst */ static inline void qfq_move_groups(struct qfq_sched *q, unsigned long mask, int src, int dst) { q->bitmaps[dst] |= q->bitmaps[src] & mask; q->bitmaps[src] &= ~mask; } static void qfq_unblock_groups(struct qfq_sched *q, int index, u64 old_F) { unsigned long mask = mask_from(q->bitmaps[ER], index + 1); struct qfq_group *next; if (mask) { next = qfq_ffs(q, mask); if (!qfq_gt(next->F, old_F)) return; } mask = (1UL << index) - 1; qfq_move_groups(q, mask, EB, ER); qfq_move_groups(q, mask, IB, IR); } /* * perhaps * old_V ^= q->V; old_V >>= q->min_slot_shift; if (old_V) { ... } * */ static void qfq_make_eligible(struct qfq_sched *q) { unsigned long vslot = q->V >> q->min_slot_shift; unsigned long old_vslot = q->oldV >> q->min_slot_shift; if (vslot != old_vslot) { unsigned long mask; int last_flip_pos = fls(vslot ^ old_vslot); if (last_flip_pos > 31) /* higher than the number of groups */ mask = ~0UL; /* make all groups eligible */ else mask = (1UL << last_flip_pos) - 1; qfq_move_groups(q, mask, IR, ER); qfq_move_groups(q, mask, IB, EB); } } /* * The index of the slot in which the input aggregate agg is to be * inserted must not be higher than QFQ_MAX_SLOTS-2. There is a '-2' * and not a '-1' because the start time of the group may be moved * backward by one slot after the aggregate has been inserted, and * this would cause non-empty slots to be right-shifted by one * position. * * QFQ+ fully satisfies this bound to the slot index if the parameters * of the classes are not changed dynamically, and if QFQ+ never * happens to postpone the service of agg unjustly, i.e., it never * happens that the aggregate becomes backlogged and eligible, or just * eligible, while an aggregate with a higher approximated finish time * is being served. In particular, in this case QFQ+ guarantees that * the timestamps of agg are low enough that the slot index is never * higher than 2. Unfortunately, QFQ+ cannot provide the same * guarantee if it happens to unjustly postpone the service of agg, or * if the parameters of some class are changed. * * As for the first event, i.e., an out-of-order service, the * upper bound to the slot index guaranteed by QFQ+ grows to * 2 + * QFQ_MAX_AGG_CLASSES * ((1<<QFQ_MTU_SHIFT)/QFQ_MIN_LMAX) * * (current_max_weight/current_wsum) <= 2 + 8 * 128 * 1. * * The following function deals with this problem by backward-shifting * the timestamps of agg, if needed, so as to guarantee that the slot * index is never higher than QFQ_MAX_SLOTS-2. This backward-shift may * cause the service of other aggregates to be postponed, yet the * worst-case guarantees of these aggregates are not violated. In * fact, in case of no out-of-order service, the timestamps of agg * would have been even lower than they are after the backward shift, * because QFQ+ would have guaranteed a maximum value equal to 2 for * the slot index, and 2 < QFQ_MAX_SLOTS-2. Hence the aggregates whose * service is postponed because of the backward-shift would have * however waited for the service of agg before being served. * * The other event that may cause the slot index to be higher than 2 * for agg is a recent change of the parameters of some class. If the * weight of a class is increased or the lmax (max_pkt_size) of the * class is decreased, then a new aggregate with smaller slot size * than the original parent aggregate of the class may happen to be * activated. The activation of this aggregate should be properly * delayed to when the service of the class has finished in the ideal * system tracked by QFQ+. If the activation of the aggregate is not * delayed to this reference time instant, then this aggregate may be * unjustly served before other aggregates waiting for service. This * may cause the above bound to the slot index to be violated for some * of these unlucky aggregates. * * Instead of delaying the activation of the new aggregate, which is * quite complex, the above-discussed capping of the slot index is * used to handle also the consequences of a change of the parameters * of a class. */ static void qfq_slot_insert(struct qfq_group *grp, struct qfq_aggregate *agg, u64 roundedS) { u64 slot = (roundedS - grp->S) >> grp->slot_shift; unsigned int i; /* slot index in the bucket list */ if (unlikely(slot > QFQ_MAX_SLOTS - 2)) { u64 deltaS = roundedS - grp->S - ((u64)(QFQ_MAX_SLOTS - 2)<<grp->slot_shift); agg->S -= deltaS; agg->F -= deltaS; slot = QFQ_MAX_SLOTS - 2; } i = (grp->front + slot) % QFQ_MAX_SLOTS; hlist_add_head(&agg->next, &grp->slots[i]); __set_bit(slot, &grp->full_slots); } /* Maybe introduce hlist_first_entry?? */ static struct qfq_aggregate *qfq_slot_head(struct qfq_group *grp) { return hlist_entry(grp->slots[grp->front].first, struct qfq_aggregate, next); } /* * remove the entry from the slot */ static void qfq_front_slot_remove(struct qfq_group *grp) { struct qfq_aggregate *agg = qfq_slot_head(grp); BUG_ON(!agg); hlist_del(&agg->next); if (hlist_empty(&grp->slots[grp->front])) __clear_bit(0, &grp->full_slots); } /* * Returns the first aggregate in the first non-empty bucket of the * group. As a side effect, adjusts the bucket list so the first * non-empty bucket is at position 0 in full_slots. */ static struct qfq_aggregate *qfq_slot_scan(struct qfq_group *grp) { unsigned int i; pr_debug("qfq slot_scan: grp %u full %#lx\n", grp->index, grp->full_slots); if (grp->full_slots == 0) return NULL; i = __ffs(grp->full_slots); /* zero based */ if (i > 0) { grp->front = (grp->front + i) % QFQ_MAX_SLOTS; grp->full_slots >>= i; } return qfq_slot_head(grp); } /* * adjust the bucket list. When the start time of a group decreases, * we move the index down (modulo QFQ_MAX_SLOTS) so we don't need to * move the objects. The mask of occupied slots must be shifted * because we use ffs() to find the first non-empty slot. * This covers decreases in the group's start time, but what about * increases of the start time ? * Here too we should make sure that i is less than 32 */ static void qfq_slot_rotate(struct qfq_group *grp, u64 roundedS) { unsigned int i = (grp->S - roundedS) >> grp->slot_shift; grp->full_slots <<= i; grp->front = (grp->front - i) % QFQ_MAX_SLOTS; } static void qfq_update_eligible(struct qfq_sched *q) { struct qfq_group *grp; unsigned long ineligible; ineligible = q->bitmaps[IR] | q->bitmaps[IB]; if (ineligible) { if (!q->bitmaps[ER]) { grp = qfq_ffs(q, ineligible); if (qfq_gt(grp->S, q->V)) q->V = grp->S; } qfq_make_eligible(q); } } /* Dequeue head packet of the head class in the DRR queue of the aggregate. */ static struct sk_buff *agg_dequeue(struct qfq_aggregate *agg, struct qfq_class *cl, unsigned int len) { struct sk_buff *skb = qdisc_dequeue_peeked(cl->qdisc); if (!skb) return NULL; cl->deficit -= (int) len; if (cl->qdisc->q.qlen == 0) /* no more packets, remove from list */ list_del(&cl->alist); else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc))) { cl->deficit += agg->lmax; list_move_tail(&cl->alist, &agg->active); } return skb; } static inline struct sk_buff *qfq_peek_skb(struct qfq_aggregate *agg, struct qfq_class **cl, unsigned int *len) { struct sk_buff *skb; *cl = list_first_entry(&agg->active, struct qfq_class, alist); skb = (*cl)->qdisc->ops->peek((*cl)->qdisc); if (skb == NULL) qdisc_warn_nonwc("qfq_dequeue", (*cl)->qdisc); else *len = qdisc_pkt_len(skb); return skb; } /* Update F according to the actual service received by the aggregate. */ static inline void charge_actual_service(struct qfq_aggregate *agg) { /* Compute the service received by the aggregate, taking into * account that, after decreasing the number of classes in * agg, it may happen that * agg->initial_budget - agg->budget > agg->bugdetmax */ u32 service_received = min(agg->budgetmax, agg->initial_budget - agg->budget); agg->F = agg->S + (u64)service_received * agg->inv_w; } /* Assign a reasonable start time for a new aggregate in group i. * Admissible values for \hat(F) are multiples of \sigma_i * no greater than V+\sigma_i . Larger values mean that * we had a wraparound so we consider the timestamp to be stale. * * If F is not stale and F >= V then we set S = F. * Otherwise we should assign S = V, but this may violate * the ordering in EB (see [2]). So, if we have groups in ER, * set S to the F_j of the first group j which would be blocking us. * We are guaranteed not to move S backward because * otherwise our group i would still be blocked. */ static void qfq_update_start(struct qfq_sched *q, struct qfq_aggregate *agg) { unsigned long mask; u64 limit, roundedF; int slot_shift = agg->grp->slot_shift; roundedF = qfq_round_down(agg->F, slot_shift); limit = qfq_round_down(q->V, slot_shift) + (1ULL << slot_shift); if (!qfq_gt(agg->F, q->V) || qfq_gt(roundedF, limit)) { /* timestamp was stale */ mask = mask_from(q->bitmaps[ER], agg->grp->index); if (mask) { struct qfq_group *next = qfq_ffs(q, mask); if (qfq_gt(roundedF, next->F)) { if (qfq_gt(limit, next->F)) agg->S = next->F; else /* preserve timestamp correctness */ agg->S = limit; return; } } agg->S = q->V; } else /* timestamp is not stale */ agg->S = agg->F; } /* Update the timestamps of agg before scheduling/rescheduling it for * service. In particular, assign to agg->F its maximum possible * value, i.e., the virtual finish time with which the aggregate * should be labeled if it used all its budget once in service. */ static inline void qfq_update_agg_ts(struct qfq_sched *q, struct qfq_aggregate *agg, enum update_reason reason) { if (reason != requeue) qfq_update_start(q, agg); else /* just charge agg for the service received */ agg->S = agg->F; agg->F = agg->S + (u64)agg->budgetmax * agg->inv_w; } static void qfq_schedule_agg(struct qfq_sched *q, struct qfq_aggregate *agg); static struct sk_buff *qfq_dequeue(struct Qdisc *sch) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_aggregate *in_serv_agg = q->in_serv_agg; struct qfq_class *cl; struct sk_buff *skb = NULL; /* next-packet len, 0 means no more active classes in in-service agg */ unsigned int len = 0; if (in_serv_agg == NULL) return NULL; if (!list_empty(&in_serv_agg->active)) skb = qfq_peek_skb(in_serv_agg, &cl, &len); /* * If there are no active classes in the in-service aggregate, * or if the aggregate has not enough budget to serve its next * class, then choose the next aggregate to serve. */ if (len == 0 || in_serv_agg->budget < len) { charge_actual_service(in_serv_agg); /* recharge the budget of the aggregate */ in_serv_agg->initial_budget = in_serv_agg->budget = in_serv_agg->budgetmax; if (!list_empty(&in_serv_agg->active)) { /* * Still active: reschedule for * service. Possible optimization: if no other * aggregate is active, then there is no point * in rescheduling this aggregate, and we can * just keep it as the in-service one. This * should be however a corner case, and to * handle it, we would need to maintain an * extra num_active_aggs field. */ qfq_update_agg_ts(q, in_serv_agg, requeue); qfq_schedule_agg(q, in_serv_agg); } else if (sch->q.qlen == 0) { /* no aggregate to serve */ q->in_serv_agg = NULL; return NULL; } /* * If we get here, there are other aggregates queued: * choose the new aggregate to serve. */ in_serv_agg = q->in_serv_agg = qfq_choose_next_agg(q); skb = qfq_peek_skb(in_serv_agg, &cl, &len); } if (!skb) return NULL; sch->q.qlen--; skb = agg_dequeue(in_serv_agg, cl, len); if (!skb) { sch->q.qlen++; return NULL; } qdisc_qstats_backlog_dec(sch, skb); qdisc_bstats_update(sch, skb); /* If lmax is lowered, through qfq_change_class, for a class * owning pending packets with larger size than the new value * of lmax, then the following condition may hold. */ if (unlikely(in_serv_agg->budget < len)) in_serv_agg->budget = 0; else in_serv_agg->budget -= len; q->V += (u64)len * q->iwsum; pr_debug("qfq dequeue: len %u F %lld now %lld\n", len, (unsigned long long) in_serv_agg->F, (unsigned long long) q->V); return skb; } static struct qfq_aggregate *qfq_choose_next_agg(struct qfq_sched *q) { struct qfq_group *grp; struct qfq_aggregate *agg, *new_front_agg; u64 old_F; qfq_update_eligible(q); q->oldV = q->V; if (!q->bitmaps[ER]) return NULL; grp = qfq_ffs(q, q->bitmaps[ER]); old_F = grp->F; agg = qfq_slot_head(grp); /* agg starts to be served, remove it from schedule */ qfq_front_slot_remove(grp); new_front_agg = qfq_slot_scan(grp); if (new_front_agg == NULL) /* group is now inactive, remove from ER */ __clear_bit(grp->index, &q->bitmaps[ER]); else { u64 roundedS = qfq_round_down(new_front_agg->S, grp->slot_shift); unsigned int s; if (grp->S == roundedS) return agg; grp->S = roundedS; grp->F = roundedS + (2ULL << grp->slot_shift); __clear_bit(grp->index, &q->bitmaps[ER]); s = qfq_calc_state(q, grp); __set_bit(grp->index, &q->bitmaps[s]); } qfq_unblock_groups(q, grp->index, old_F); return agg; } static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) { unsigned int len = qdisc_pkt_len(skb), gso_segs; struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl; struct qfq_aggregate *agg; int err = 0; bool first; cl = qfq_classify(skb, sch, &err); if (cl == NULL) { if (err & __NET_XMIT_BYPASS) qdisc_qstats_drop(sch); __qdisc_drop(skb, to_free); return err; } pr_debug("qfq_enqueue: cl = %x\n", cl->common.classid); if (unlikely(cl->agg->lmax < len)) { pr_debug("qfq: increasing maxpkt from %u to %u for class %u", cl->agg->lmax, len, cl->common.classid); err = qfq_change_agg(sch, cl, cl->agg->class_weight, len); if (err) { cl->qstats.drops++; return qdisc_drop(skb, sch, to_free); } } gso_segs = skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1; first = !cl->qdisc->q.qlen; err = qdisc_enqueue(skb, cl->qdisc, to_free); if (unlikely(err != NET_XMIT_SUCCESS)) { pr_debug("qfq_enqueue: enqueue failed %d\n", err); if (net_xmit_drop_count(err)) { cl->qstats.drops++; qdisc_qstats_drop(sch); } return err; } _bstats_update(&cl->bstats, len, gso_segs); sch->qstats.backlog += len; ++sch->q.qlen; agg = cl->agg; /* if the queue was not empty, then done here */ if (!first) { if (unlikely(skb == cl->qdisc->ops->peek(cl->qdisc)) && list_first_entry(&agg->active, struct qfq_class, alist) == cl && cl->deficit < len) list_move_tail(&cl->alist, &agg->active); return err; } /* schedule class for service within the aggregate */ cl->deficit = agg->lmax; list_add_tail(&cl->alist, &agg->active); if (list_first_entry(&agg->active, struct qfq_class, alist) != cl || q->in_serv_agg == agg) return err; /* non-empty or in service, nothing else to do */ qfq_activate_agg(q, agg, enqueue); return err; } /* * Schedule aggregate according to its timestamps. */ static void qfq_schedule_agg(struct qfq_sched *q, struct qfq_aggregate *agg) { struct qfq_group *grp = agg->grp; u64 roundedS; int s; roundedS = qfq_round_down(agg->S, grp->slot_shift); /* * Insert agg in the correct bucket. * If agg->S >= grp->S we don't need to adjust the * bucket list and simply go to the insertion phase. * Otherwise grp->S is decreasing, we must make room * in the bucket list, and also recompute the group state. * Finally, if there were no flows in this group and nobody * was in ER make sure to adjust V. */ if (grp->full_slots) { if (!qfq_gt(grp->S, agg->S)) goto skip_update; /* create a slot for this agg->S */ qfq_slot_rotate(grp, roundedS); /* group was surely ineligible, remove */ __clear_bit(grp->index, &q->bitmaps[IR]); __clear_bit(grp->index, &q->bitmaps[IB]); } else if (!q->bitmaps[ER] && qfq_gt(roundedS, q->V) && q->in_serv_agg == NULL) q->V = roundedS; grp->S = roundedS; grp->F = roundedS + (2ULL << grp->slot_shift); s = qfq_calc_state(q, grp); __set_bit(grp->index, &q->bitmaps[s]); pr_debug("qfq enqueue: new state %d %#lx S %lld F %lld V %lld\n", s, q->bitmaps[s], (unsigned long long) agg->S, (unsigned long long) agg->F, (unsigned long long) q->V); skip_update: qfq_slot_insert(grp, agg, roundedS); } /* Update agg ts and schedule agg for service */ static void qfq_activate_agg(struct qfq_sched *q, struct qfq_aggregate *agg, enum update_reason reason) { agg->initial_budget = agg->budget = agg->budgetmax; /* recharge budg. */ qfq_update_agg_ts(q, agg, reason); if (q->in_serv_agg == NULL) { /* no aggr. in service or scheduled */ q->in_serv_agg = agg; /* start serving this aggregate */ /* update V: to be in service, agg must be eligible */ q->oldV = q->V = agg->S; } else if (agg != q->in_serv_agg) qfq_schedule_agg(q, agg); } static void qfq_slot_remove(struct qfq_sched *q, struct qfq_group *grp, struct qfq_aggregate *agg) { unsigned int i, offset; u64 roundedS; roundedS = qfq_round_down(agg->S, grp->slot_shift); offset = (roundedS - grp->S) >> grp->slot_shift; i = (grp->front + offset) % QFQ_MAX_SLOTS; hlist_del(&agg->next); if (hlist_empty(&grp->slots[i])) __clear_bit(offset, &grp->full_slots); } /* * Called to forcibly deschedule an aggregate. If the aggregate is * not in the front bucket, or if the latter has other aggregates in * the front bucket, we can simply remove the aggregate with no other * side effects. * Otherwise we must propagate the event up. */ static void qfq_deactivate_agg(struct qfq_sched *q, struct qfq_aggregate *agg) { struct qfq_group *grp = agg->grp; unsigned long mask; u64 roundedS; int s; if (agg == q->in_serv_agg) { charge_actual_service(agg); q->in_serv_agg = qfq_choose_next_agg(q); return; } agg->F = agg->S; qfq_slot_remove(q, grp, agg); if (!grp->full_slots) { __clear_bit(grp->index, &q->bitmaps[IR]); __clear_bit(grp->index, &q->bitmaps[EB]); __clear_bit(grp->index, &q->bitmaps[IB]); if (test_bit(grp->index, &q->bitmaps[ER]) && !(q->bitmaps[ER] & ~((1UL << grp->index) - 1))) { mask = q->bitmaps[ER] & ((1UL << grp->index) - 1); if (mask) mask = ~((1UL << __fls(mask)) - 1); else mask = ~0UL; qfq_move_groups(q, mask, EB, ER); qfq_move_groups(q, mask, IB, IR); } __clear_bit(grp->index, &q->bitmaps[ER]); } else if (hlist_empty(&grp->slots[grp->front])) { agg = qfq_slot_scan(grp); roundedS = qfq_round_down(agg->S, grp->slot_shift); if (grp->S != roundedS) { __clear_bit(grp->index, &q->bitmaps[ER]); __clear_bit(grp->index, &q->bitmaps[IR]); __clear_bit(grp->index, &q->bitmaps[EB]); __clear_bit(grp->index, &q->bitmaps[IB]); grp->S = roundedS; grp->F = roundedS + (2ULL << grp->slot_shift); s = qfq_calc_state(q, grp); __set_bit(grp->index, &q->bitmaps[s]); } } } static void qfq_qlen_notify(struct Qdisc *sch, unsigned long arg) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl = (struct qfq_class *)arg; qfq_deactivate_class(q, cl); } static int qfq_init_qdisc(struct Qdisc *sch, struct nlattr *opt, struct netlink_ext_ack *extack) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_group *grp; int i, j, err; u32 max_cl_shift, maxbudg_shift, max_classes; err = tcf_block_get(&q->block, &q->filter_list, sch, extack); if (err) return err; err = qdisc_class_hash_init(&q->clhash); if (err < 0) return err; max_classes = min_t(u64, (u64)qdisc_dev(sch)->tx_queue_len + 1, QFQ_MAX_AGG_CLASSES); /* max_cl_shift = floor(log_2(max_classes)) */ max_cl_shift = __fls(max_classes); q->max_agg_classes = 1<<max_cl_shift; /* maxbudg_shift = log2(max_len * max_classes_per_agg) */ maxbudg_shift = QFQ_MTU_SHIFT + max_cl_shift; q->min_slot_shift = FRAC_BITS + maxbudg_shift - QFQ_MAX_INDEX; for (i = 0; i <= QFQ_MAX_INDEX; i++) { grp = &q->groups[i]; grp->index = i; grp->slot_shift = q->min_slot_shift + i; for (j = 0; j < QFQ_MAX_SLOTS; j++) INIT_HLIST_HEAD(&grp->slots[j]); } INIT_HLIST_HEAD(&q->nonfull_aggs); return 0; } static void qfq_reset_qdisc(struct Qdisc *sch) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl; unsigned int i; for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode) { if (cl->qdisc->q.qlen > 0) qfq_deactivate_class(q, cl); qdisc_reset(cl->qdisc); } } } static void qfq_destroy_qdisc(struct Qdisc *sch) { struct qfq_sched *q = qdisc_priv(sch); struct qfq_class *cl; struct hlist_node *next; unsigned int i; tcf_block_put(q->block); for (i = 0; i < q->clhash.hashsize; i++) { hlist_for_each_entry_safe(cl, next, &q->clhash.hash[i], common.hnode) { qfq_destroy_class(sch, cl); } } qdisc_class_hash_destroy(&q->clhash); } static const struct Qdisc_class_ops qfq_class_ops = { .change = qfq_change_class, .delete = qfq_delete_class, .find = qfq_search_class, .tcf_block = qfq_tcf_block, .bind_tcf = qfq_bind_tcf, .unbind_tcf = qfq_unbind_tcf, .graft = qfq_graft_class, .leaf = qfq_class_leaf, .qlen_notify = qfq_qlen_notify, .dump = qfq_dump_class, .dump_stats = qfq_dump_class_stats, .walk = qfq_walk, }; static struct Qdisc_ops qfq_qdisc_ops __read_mostly = { .cl_ops = &qfq_class_ops, .id = "qfq", .priv_size = sizeof(struct qfq_sched), .enqueue = qfq_enqueue, .dequeue = qfq_dequeue, .peek = qdisc_peek_dequeued, .init = qfq_init_qdisc, .reset = qfq_reset_qdisc, .destroy = qfq_destroy_qdisc, .owner = THIS_MODULE, }; MODULE_ALIAS_NET_SCH("qfq"); static int __init qfq_init(void) { return register_qdisc(&qfq_qdisc_ops); } static void __exit qfq_exit(void) { unregister_qdisc(&qfq_qdisc_ops); } module_init(qfq_init); module_exit(qfq_exit); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Quick Fair Queueing Plus qdisc"); |
1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 | // SPDX-License-Identifier: GPL-2.0-or-later /* * HID driver for some microsoft "special" devices * * Copyright (c) 1999 Andreas Gal * Copyright (c) 2000-2005 Vojtech Pavlik <vojtech@suse.cz> * Copyright (c) 2005 Michael Haboustak <mike-@cinci.rr.com> for Concept2, Inc * Copyright (c) 2006-2007 Jiri Kosina * Copyright (c) 2008 Jiri Slaby */ /* */ #include <linux/device.h> #include <linux/input.h> #include <linux/hid.h> #include <linux/module.h> #include "hid-ids.h" #define MS_HIDINPUT BIT(0) #define MS_ERGONOMY BIT(1) #define MS_PRESENTER BIT(2) #define MS_RDESC BIT(3) #define MS_NOGET BIT(4) #define MS_DUPLICATE_USAGES BIT(5) #define MS_SURFACE_DIAL BIT(6) #define MS_QUIRK_FF BIT(7) struct ms_data { unsigned long quirks; struct hid_device *hdev; struct work_struct ff_worker; __u8 strong; __u8 weak; void *output_report_dmabuf; }; #define XB1S_FF_REPORT 3 #define ENABLE_WEAK BIT(0) #define ENABLE_STRONG BIT(1) enum { MAGNITUDE_STRONG = 2, MAGNITUDE_WEAK, MAGNITUDE_NUM }; struct xb1s_ff_report { __u8 report_id; __u8 enable; __u8 magnitude[MAGNITUDE_NUM]; __u8 duration_10ms; __u8 start_delay_10ms; __u8 loop_count; } __packed; static __u8 *ms_report_fixup(struct hid_device *hdev, __u8 *rdesc, unsigned int *rsize) { struct ms_data *ms = hid_get_drvdata(hdev); unsigned long quirks = ms->quirks; /* * Microsoft Wireless Desktop Receiver (Model 1028) has * 'Usage Min/Max' where it ought to have 'Physical Min/Max' */ if ((quirks & MS_RDESC) && *rsize == 571 && rdesc[557] == 0x19 && rdesc[559] == 0x29) { hid_info(hdev, "fixing up Microsoft Wireless Receiver Model 1028 report descriptor\n"); rdesc[557] = 0x35; rdesc[559] = 0x45; } return rdesc; } #define ms_map_key_clear(c) hid_map_usage_clear(hi, usage, bit, max, \ EV_KEY, (c)) static int ms_ergonomy_kb_quirk(struct hid_input *hi, struct hid_usage *usage, unsigned long **bit, int *max) { struct input_dev *input = hi->input; if ((usage->hid & HID_USAGE_PAGE) == HID_UP_CONSUMER) { switch (usage->hid & HID_USAGE) { /* * Microsoft uses these 2 reserved usage ids for 2 keys on * the MS office kb labelled "Office Home" and "Task Pane". */ case 0x29d: ms_map_key_clear(KEY_PROG1); return 1; case 0x29e: ms_map_key_clear(KEY_PROG2); return 1; } return 0; } if ((usage->hid & HID_USAGE_PAGE) != HID_UP_MSVENDOR) return 0; switch (usage->hid & HID_USAGE) { case 0xfd06: ms_map_key_clear(KEY_CHAT); break; case 0xfd07: ms_map_key_clear(KEY_PHONE); break; case 0xff00: /* Special keypad keys */ ms_map_key_clear(KEY_KPEQUAL); set_bit(KEY_KPLEFTPAREN, input->keybit); set_bit(KEY_KPRIGHTPAREN, input->keybit); break; case 0xff01: /* Scroll wheel */ hid_map_usage_clear(hi, usage, bit, max, EV_REL, REL_WHEEL); break; case 0xff02: /* * This byte contains a copy of the modifier keys byte of a * standard hid keyboard report, as send by interface 0 * (this usage is found on interface 1). * * This byte only gets send when another key in the same report * changes state, and as such is useless, ignore it. */ return -1; case 0xff05: set_bit(EV_REP, input->evbit); ms_map_key_clear(KEY_F13); set_bit(KEY_F14, input->keybit); set_bit(KEY_F15, input->keybit); set_bit(KEY_F16, input->keybit); set_bit(KEY_F17, input->keybit); set_bit(KEY_F18, input->keybit); break; default: return 0; } return 1; } static int ms_presenter_8k_quirk(struct hid_input *hi, struct hid_usage *usage, unsigned long **bit, int *max) { if ((usage->hid & HID_USAGE_PAGE) != HID_UP_MSVENDOR) return 0; set_bit(EV_REP, hi->input->evbit); switch (usage->hid & HID_USAGE) { case 0xfd08: ms_map_key_clear(KEY_FORWARD); break; case 0xfd09: ms_map_key_clear(KEY_BACK); break; case 0xfd0b: ms_map_key_clear(KEY_PLAYPAUSE); break; case 0xfd0e: ms_map_key_clear(KEY_CLOSE); break; case 0xfd0f: ms_map_key_clear(KEY_PLAY); break; default: return 0; } return 1; } static int ms_surface_dial_quirk(struct hid_input *hi, struct hid_field *field, struct hid_usage *usage, unsigned long **bit, int *max) { switch (usage->hid & HID_USAGE_PAGE) { case 0xff070000: case HID_UP_DIGITIZER: /* ignore those axis */ return -1; case HID_UP_GENDESK: switch (usage->hid) { case HID_GD_X: case HID_GD_Y: case HID_GD_RFKILL_BTN: /* ignore those axis */ return -1; } } return 0; } static int ms_input_mapping(struct hid_device *hdev, struct hid_input *hi, struct hid_field *field, struct hid_usage *usage, unsigned long **bit, int *max) { struct ms_data *ms = hid_get_drvdata(hdev); unsigned long quirks = ms->quirks; if (quirks & MS_ERGONOMY) { int ret = ms_ergonomy_kb_quirk(hi, usage, bit, max); if (ret) return ret; } if ((quirks & MS_PRESENTER) && ms_presenter_8k_quirk(hi, usage, bit, max)) return 1; if (quirks & MS_SURFACE_DIAL) { int ret = ms_surface_dial_quirk(hi, field, usage, bit, max); if (ret) return ret; } return 0; } static int ms_input_mapped(struct hid_device *hdev, struct hid_input *hi, struct hid_field *field, struct hid_usage *usage, unsigned long **bit, int *max) { struct ms_data *ms = hid_get_drvdata(hdev); unsigned long quirks = ms->quirks; if (quirks & MS_DUPLICATE_USAGES) clear_bit(usage->code, *bit); return 0; } static int ms_event(struct hid_device *hdev, struct hid_field *field, struct hid_usage *usage, __s32 value) { struct ms_data *ms = hid_get_drvdata(hdev); unsigned long quirks = ms->quirks; struct input_dev *input; if (!(hdev->claimed & HID_CLAIMED_INPUT) || !field->hidinput || !usage->type) return 0; input = field->hidinput->input; /* Handling MS keyboards special buttons */ if (quirks & MS_ERGONOMY && usage->hid == (HID_UP_MSVENDOR | 0xff00)) { /* Special keypad keys */ input_report_key(input, KEY_KPEQUAL, value & 0x01); input_report_key(input, KEY_KPLEFTPAREN, value & 0x02); input_report_key(input, KEY_KPRIGHTPAREN, value & 0x04); return 1; } if (quirks & MS_ERGONOMY && usage->hid == (HID_UP_MSVENDOR | 0xff01)) { /* Scroll wheel */ int step = ((value & 0x60) >> 5) + 1; switch (value & 0x1f) { case 0x01: input_report_rel(input, REL_WHEEL, step); break; case 0x1f: input_report_rel(input, REL_WHEEL, -step); break; } return 1; } if (quirks & MS_ERGONOMY && usage->hid == (HID_UP_MSVENDOR | 0xff05)) { static unsigned int last_key = 0; unsigned int key = 0; switch (value) { case 0x01: key = KEY_F14; break; case 0x02: key = KEY_F15; break; case 0x04: key = KEY_F16; break; case 0x08: key = KEY_F17; break; case 0x10: key = KEY_F18; break; } if (key) { input_event(input, usage->type, key, 1); last_key = key; } else input_event(input, usage->type, last_key, 0); return 1; } return 0; } static void ms_ff_worker(struct work_struct *work) { struct ms_data *ms = container_of(work, struct ms_data, ff_worker); struct hid_device *hdev = ms->hdev; struct xb1s_ff_report *r = ms->output_report_dmabuf; int ret; memset(r, 0, sizeof(*r)); r->report_id = XB1S_FF_REPORT; r->enable = ENABLE_WEAK | ENABLE_STRONG; /* * Specifying maximum duration and maximum loop count should * cover maximum duration of a single effect, which is 65536 * ms */ r->duration_10ms = U8_MAX; r->loop_count = U8_MAX; r->magnitude[MAGNITUDE_STRONG] = ms->strong; /* left actuator */ r->magnitude[MAGNITUDE_WEAK] = ms->weak; /* right actuator */ ret = hid_hw_output_report(hdev, (__u8 *)r, sizeof(*r)); if (ret < 0) hid_warn(hdev, "failed to send FF report\n"); } static int ms_play_effect(struct input_dev *dev, void *data, struct ff_effect *effect) { struct hid_device *hid = input_get_drvdata(dev); struct ms_data *ms = hid_get_drvdata(hid); if (effect->type != FF_RUMBLE) return 0; /* * Magnitude is 0..100 so scale the 16-bit input here */ ms->strong = ((u32) effect->u.rumble.strong_magnitude * 100) / U16_MAX; ms->weak = ((u32) effect->u.rumble.weak_magnitude * 100) / U16_MAX; schedule_work(&ms->ff_worker); return 0; } static int ms_init_ff(struct hid_device *hdev) { struct hid_input *hidinput; struct input_dev *input_dev; struct ms_data *ms = hid_get_drvdata(hdev); if (list_empty(&hdev->inputs)) { hid_err(hdev, "no inputs found\n"); return -ENODEV; } hidinput = list_entry(hdev->inputs.next, struct hid_input, list); input_dev = hidinput->input; if (!(ms->quirks & MS_QUIRK_FF)) return 0; ms->hdev = hdev; INIT_WORK(&ms->ff_worker, ms_ff_worker); ms->output_report_dmabuf = devm_kzalloc(&hdev->dev, sizeof(struct xb1s_ff_report), GFP_KERNEL); if (ms->output_report_dmabuf == NULL) return -ENOMEM; input_set_capability(input_dev, EV_FF, FF_RUMBLE); return input_ff_create_memless(input_dev, NULL, ms_play_effect); } static void ms_remove_ff(struct hid_device *hdev) { struct ms_data *ms = hid_get_drvdata(hdev); if (!(ms->quirks & MS_QUIRK_FF)) return; cancel_work_sync(&ms->ff_worker); } static int ms_probe(struct hid_device *hdev, const struct hid_device_id *id) { unsigned long quirks = id->driver_data; struct ms_data *ms; int ret; ms = devm_kzalloc(&hdev->dev, sizeof(*ms), GFP_KERNEL); if (ms == NULL) return -ENOMEM; ms->quirks = quirks; hid_set_drvdata(hdev, ms); if (quirks & MS_NOGET) hdev->quirks |= HID_QUIRK_NOGET; if (quirks & MS_SURFACE_DIAL) hdev->quirks |= HID_QUIRK_INPUT_PER_APP; ret = hid_parse(hdev); if (ret) { hid_err(hdev, "parse failed\n"); goto err_free; } ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT | ((quirks & MS_HIDINPUT) ? HID_CONNECT_HIDINPUT_FORCE : 0)); if (ret) { hid_err(hdev, "hw start failed\n"); goto err_free; } ret = ms_init_ff(hdev); if (ret) hid_err(hdev, "could not initialize ff, continuing anyway"); return 0; err_free: return ret; } static void ms_remove(struct hid_device *hdev) { hid_hw_stop(hdev); ms_remove_ff(hdev); } static const struct hid_device_id ms_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_SIDEWINDER_GV), .driver_data = MS_HIDINPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_OFFICE_KB), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_NE4K), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_NE4K_JP), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_NE7K), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_LK6K), .driver_data = MS_ERGONOMY | MS_RDESC }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_PRESENTER_8K_USB), .driver_data = MS_PRESENTER }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_DIGITAL_MEDIA_3K), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_DIGITAL_MEDIA_7K), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_DIGITAL_MEDIA_600), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_DIGITAL_MEDIA_3KV1), .driver_data = MS_ERGONOMY }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_WIRELESS_OPTICAL_DESKTOP_3_0), .driver_data = MS_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_COMFORT_MOUSE_4500), .driver_data = MS_DUPLICATE_USAGES }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_POWER_COVER), .driver_data = MS_HIDINPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_COMFORT_KEYBOARD), .driver_data = MS_ERGONOMY}, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_PRESENTER_8K_BT), .driver_data = MS_PRESENTER }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, 0x091B), .driver_data = MS_SURFACE_DIAL }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_XBOX_CONTROLLER_MODEL_1708), .driver_data = MS_QUIRK_FF }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_XBOX_CONTROLLER_MODEL_1708_BLE), .driver_data = MS_QUIRK_FF }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_XBOX_CONTROLLER_MODEL_1914), .driver_data = MS_QUIRK_FF }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_XBOX_CONTROLLER_MODEL_1797), .driver_data = MS_QUIRK_FF }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_XBOX_CONTROLLER_MODEL_1797_BLE), .driver_data = MS_QUIRK_FF }, { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_8BITDO_SN30_PRO_PLUS), .driver_data = MS_QUIRK_FF }, { } }; MODULE_DEVICE_TABLE(hid, ms_devices); static struct hid_driver ms_driver = { .name = "microsoft", .id_table = ms_devices, .report_fixup = ms_report_fixup, .input_mapping = ms_input_mapping, .input_mapped = ms_input_mapped, .event = ms_event, .probe = ms_probe, .remove = ms_remove, }; module_hid_driver(ms_driver); MODULE_LICENSE("GPL"); |
2 5 3 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Force feedback support for various HID compliant devices by ThrustMaster: * ThrustMaster FireStorm Dual Power 2 * and possibly others whose device ids haven't been added. * * Modified to support ThrustMaster devices by Zinx Verituse * on 2003-01-25 from the Logitech force feedback driver, * which is by Johann Deneux. * * Copyright (c) 2003 Zinx Verituse <zinx@epicsol.org> * Copyright (c) 2002 Johann Deneux */ /* */ #include <linux/hid.h> #include <linux/input.h> #include <linux/slab.h> #include <linux/module.h> #include "hid-ids.h" #define THRUSTMASTER_DEVICE_ID_2_IN_1_DT 0xb320 static const signed short ff_rumble[] = { FF_RUMBLE, -1 }; static const signed short ff_joystick[] = { FF_CONSTANT, -1 }; #ifdef CONFIG_THRUSTMASTER_FF /* Usages for thrustmaster devices I know about */ #define THRUSTMASTER_USAGE_FF (HID_UP_GENDESK | 0xbb) struct tmff_device { struct hid_report *report; struct hid_field *ff_field; }; /* Changes values from 0 to 0xffff into values from minimum to maximum */ static inline int tmff_scale_u16(unsigned int in, int minimum, int maximum) { int ret; ret = (in * (maximum - minimum) / 0xffff) + minimum; if (ret < minimum) return minimum; if (ret > maximum) return maximum; return ret; } /* Changes values from -0x80 to 0x7f into values from minimum to maximum */ static inline int tmff_scale_s8(int in, int minimum, int maximum) { int ret; ret = (((in + 0x80) * (maximum - minimum)) / 0xff) + minimum; if (ret < minimum) return minimum; if (ret > maximum) return maximum; return ret; } static int tmff_play(struct input_dev *dev, void *data, struct ff_effect *effect) { struct hid_device *hid = input_get_drvdata(dev); struct tmff_device *tmff = data; struct hid_field *ff_field = tmff->ff_field; int x, y; int left, right; /* Rumbling */ switch (effect->type) { case FF_CONSTANT: x = tmff_scale_s8(effect->u.ramp.start_level, ff_field->logical_minimum, ff_field->logical_maximum); y = tmff_scale_s8(effect->u.ramp.end_level, ff_field->logical_minimum, ff_field->logical_maximum); dbg_hid("(x, y)=(%04x, %04x)\n", x, y); ff_field->value[0] = x; ff_field->value[1] = y; hid_hw_request(hid, tmff->report, HID_REQ_SET_REPORT); break; case FF_RUMBLE: left = tmff_scale_u16(effect->u.rumble.weak_magnitude, ff_field->logical_minimum, ff_field->logical_maximum); right = tmff_scale_u16(effect->u.rumble.strong_magnitude, ff_field->logical_minimum, ff_field->logical_maximum); /* 2-in-1 strong motor is left */ if (hid->product == THRUSTMASTER_DEVICE_ID_2_IN_1_DT) swap(left, right); dbg_hid("(left,right)=(%08x, %08x)\n", left, right); ff_field->value[0] = left; ff_field->value[1] = right; hid_hw_request(hid, tmff->report, HID_REQ_SET_REPORT); break; } return 0; } static int tmff_init(struct hid_device *hid, const signed short *ff_bits) { struct tmff_device *tmff; struct hid_report *report; struct list_head *report_list; struct hid_input *hidinput; struct input_dev *input_dev; int error; int i; if (list_empty(&hid->inputs)) { hid_err(hid, "no inputs found\n"); return -ENODEV; } hidinput = list_entry(hid->inputs.next, struct hid_input, list); input_dev = hidinput->input; tmff = kzalloc(sizeof(struct tmff_device), GFP_KERNEL); if (!tmff) return -ENOMEM; /* Find the report to use */ report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; list_for_each_entry(report, report_list, list) { int fieldnum; for (fieldnum = 0; fieldnum < report->maxfield; ++fieldnum) { struct hid_field *field = report->field[fieldnum]; if (field->maxusage <= 0) continue; switch (field->usage[0].hid) { case THRUSTMASTER_USAGE_FF: if (field->report_count < 2) { hid_warn(hid, "ignoring FF field with report_count < 2\n"); continue; } if (field->logical_maximum == field->logical_minimum) { hid_warn(hid, "ignoring FF field with logical_maximum == logical_minimum\n"); continue; } if (tmff->report && tmff->report != report) { hid_warn(hid, "ignoring FF field in other report\n"); continue; } if (tmff->ff_field && tmff->ff_field != field) { hid_warn(hid, "ignoring duplicate FF field\n"); continue; } tmff->report = report; tmff->ff_field = field; for (i = 0; ff_bits[i] >= 0; i++) set_bit(ff_bits[i], input_dev->ffbit); break; default: hid_warn(hid, "ignoring unknown output usage %08x\n", field->usage[0].hid); continue; } } } if (!tmff->report) { hid_err(hid, "can't find FF field in output reports\n"); error = -ENODEV; goto fail; } error = input_ff_create_memless(input_dev, tmff, tmff_play); if (error) goto fail; hid_info(hid, "force feedback for ThrustMaster devices by Zinx Verituse <zinx@epicsol.org>\n"); return 0; fail: kfree(tmff); return error; } #else static inline int tmff_init(struct hid_device *hid, const signed short *ff_bits) { return 0; } #endif static int tm_probe(struct hid_device *hdev, const struct hid_device_id *id) { int ret; ret = hid_parse(hdev); if (ret) { hid_err(hdev, "parse failed\n"); goto err; } ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT & ~HID_CONNECT_FF); if (ret) { hid_err(hdev, "hw start failed\n"); goto err; } tmff_init(hdev, (void *)id->driver_data); return 0; err: return ret; } static const struct hid_device_id tm_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb300), .driver_data = (unsigned long)ff_rumble }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb304), /* FireStorm Dual Power 2 (and 3) */ .driver_data = (unsigned long)ff_rumble }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, THRUSTMASTER_DEVICE_ID_2_IN_1_DT), /* Dual Trigger 2-in-1 */ .driver_data = (unsigned long)ff_rumble }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb323), /* Dual Trigger 3-in-1 (PC Mode) */ .driver_data = (unsigned long)ff_rumble }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb324), /* Dual Trigger 3-in-1 (PS3 Mode) */ .driver_data = (unsigned long)ff_rumble }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb605), /* NASCAR PRO FF2 Wheel */ .driver_data = (unsigned long)ff_joystick }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb651), /* FGT Rumble Force Wheel */ .driver_data = (unsigned long)ff_rumble }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb653), /* RGT Force Feedback CLUTCH Raging Wheel */ .driver_data = (unsigned long)ff_joystick }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb654), /* FGT Force Feedback Wheel */ .driver_data = (unsigned long)ff_joystick }, { HID_USB_DEVICE(USB_VENDOR_ID_THRUSTMASTER, 0xb65a), /* F430 Force Feedback Wheel */ .driver_data = (unsigned long)ff_joystick }, { } }; MODULE_DEVICE_TABLE(hid, tm_devices); static struct hid_driver tm_driver = { .name = "thrustmaster", .id_table = tm_devices, .probe = tm_probe, }; module_hid_driver(tm_driver); MODULE_LICENSE("GPL"); |
4 4 3 7 4 6 7 5 3 4 4 4 1 1 1 7 4 3 4 27 9 2 16 3 3 17 20 2 3 1 3 14 3 1 14 1 1 1 1 4 4 16 4 4 2 15 1 2 3 9 7 4 3 2 6 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 | // SPDX-License-Identifier: GPL-2.0-or-later #include <linux/plist.h> #include <linux/sched/signal.h> #include "futex.h" #include "../locking/rtmutex_common.h" /* * On PREEMPT_RT, the hash bucket lock is a 'sleeping' spinlock with an * underlying rtmutex. The task which is about to be requeued could have * just woken up (timeout, signal). After the wake up the task has to * acquire hash bucket lock, which is held by the requeue code. As a task * can only be blocked on _ONE_ rtmutex at a time, the proxy lock blocking * and the hash bucket lock blocking would collide and corrupt state. * * On !PREEMPT_RT this is not a problem and everything could be serialized * on hash bucket lock, but aside of having the benefit of common code, * this allows to avoid doing the requeue when the task is already on the * way out and taking the hash bucket lock of the original uaddr1 when the * requeue has been completed. * * The following state transitions are valid: * * On the waiter side: * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IGNORE * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_WAIT * * On the requeue side: * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_INPROGRESS * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_DONE/LOCKED * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_NONE (requeue failed) * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_DONE/LOCKED * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_IGNORE (requeue failed) * * The requeue side ignores a waiter with state Q_REQUEUE_PI_IGNORE as this * signals that the waiter is already on the way out. It also means that * the waiter is still on the 'wait' futex, i.e. uaddr1. * * The waiter side signals early wakeup to the requeue side either through * setting state to Q_REQUEUE_PI_IGNORE or to Q_REQUEUE_PI_WAIT depending * on the current state. In case of Q_REQUEUE_PI_IGNORE it can immediately * proceed to take the hash bucket lock of uaddr1. If it set state to WAIT, * which means the wakeup is interleaving with a requeue in progress it has * to wait for the requeue side to change the state. Either to DONE/LOCKED * or to IGNORE. DONE/LOCKED means the waiter q is now on the uaddr2 futex * and either blocked (DONE) or has acquired it (LOCKED). IGNORE is set by * the requeue side when the requeue attempt failed via deadlock detection * and therefore the waiter q is still on the uaddr1 futex. */ enum { Q_REQUEUE_PI_NONE = 0, Q_REQUEUE_PI_IGNORE, Q_REQUEUE_PI_IN_PROGRESS, Q_REQUEUE_PI_WAIT, Q_REQUEUE_PI_DONE, Q_REQUEUE_PI_LOCKED, }; const struct futex_q futex_q_init = { /* list gets initialized in futex_queue()*/ .wake = futex_wake_mark, .key = FUTEX_KEY_INIT, .bitset = FUTEX_BITSET_MATCH_ANY, .requeue_state = ATOMIC_INIT(Q_REQUEUE_PI_NONE), }; /** * requeue_futex() - Requeue a futex_q from one hb to another * @q: the futex_q to requeue * @hb1: the source hash_bucket * @hb2: the target hash_bucket * @key2: the new key for the requeued futex_q */ static inline void requeue_futex(struct futex_q *q, struct futex_hash_bucket *hb1, struct futex_hash_bucket *hb2, union futex_key *key2) { /* * If key1 and key2 hash to the same bucket, no need to * requeue. */ if (likely(&hb1->chain != &hb2->chain)) { plist_del(&q->list, &hb1->chain); futex_hb_waiters_dec(hb1); futex_hb_waiters_inc(hb2); plist_add(&q->list, &hb2->chain); q->lock_ptr = &hb2->lock; } q->key = *key2; } static inline bool futex_requeue_pi_prepare(struct futex_q *q, struct futex_pi_state *pi_state) { int old, new; /* * Set state to Q_REQUEUE_PI_IN_PROGRESS unless an early wakeup has * already set Q_REQUEUE_PI_IGNORE to signal that requeue should * ignore the waiter. */ old = atomic_read_acquire(&q->requeue_state); do { if (old == Q_REQUEUE_PI_IGNORE) return false; /* * futex_proxy_trylock_atomic() might have set it to * IN_PROGRESS and a interleaved early wake to WAIT. * * It was considered to have an extra state for that * trylock, but that would just add more conditionals * all over the place for a dubious value. */ if (old != Q_REQUEUE_PI_NONE) break; new = Q_REQUEUE_PI_IN_PROGRESS; } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new)); q->pi_state = pi_state; return true; } static inline void futex_requeue_pi_complete(struct futex_q *q, int locked) { int old, new; old = atomic_read_acquire(&q->requeue_state); do { if (old == Q_REQUEUE_PI_IGNORE) return; if (locked >= 0) { /* Requeue succeeded. Set DONE or LOCKED */ WARN_ON_ONCE(old != Q_REQUEUE_PI_IN_PROGRESS && old != Q_REQUEUE_PI_WAIT); new = Q_REQUEUE_PI_DONE + locked; } else if (old == Q_REQUEUE_PI_IN_PROGRESS) { /* Deadlock, no early wakeup interleave */ new = Q_REQUEUE_PI_NONE; } else { /* Deadlock, early wakeup interleave. */ WARN_ON_ONCE(old != Q_REQUEUE_PI_WAIT); new = Q_REQUEUE_PI_IGNORE; } } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new)); #ifdef CONFIG_PREEMPT_RT /* If the waiter interleaved with the requeue let it know */ if (unlikely(old == Q_REQUEUE_PI_WAIT)) rcuwait_wake_up(&q->requeue_wait); #endif } static inline int futex_requeue_pi_wakeup_sync(struct futex_q *q) { int old, new; old = atomic_read_acquire(&q->requeue_state); do { /* Is requeue done already? */ if (old >= Q_REQUEUE_PI_DONE) return old; /* * If not done, then tell the requeue code to either ignore * the waiter or to wake it up once the requeue is done. */ new = Q_REQUEUE_PI_WAIT; if (old == Q_REQUEUE_PI_NONE) new = Q_REQUEUE_PI_IGNORE; } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new)); /* If the requeue was in progress, wait for it to complete */ if (old == Q_REQUEUE_PI_IN_PROGRESS) { #ifdef CONFIG_PREEMPT_RT rcuwait_wait_event(&q->requeue_wait, atomic_read(&q->requeue_state) != Q_REQUEUE_PI_WAIT, TASK_UNINTERRUPTIBLE); #else (void)atomic_cond_read_relaxed(&q->requeue_state, VAL != Q_REQUEUE_PI_WAIT); #endif } /* * Requeue is now either prohibited or complete. Reread state * because during the wait above it might have changed. Nothing * will modify q->requeue_state after this point. */ return atomic_read(&q->requeue_state); } /** * requeue_pi_wake_futex() - Wake a task that acquired the lock during requeue * @q: the futex_q * @key: the key of the requeue target futex * @hb: the hash_bucket of the requeue target futex * * During futex_requeue, with requeue_pi=1, it is possible to acquire the * target futex if it is uncontended or via a lock steal. * * 1) Set @q::key to the requeue target futex key so the waiter can detect * the wakeup on the right futex. * * 2) Dequeue @q from the hash bucket. * * 3) Set @q::rt_waiter to NULL so the woken up task can detect atomic lock * acquisition. * * 4) Set the q->lock_ptr to the requeue target hb->lock for the case that * the waiter has to fixup the pi state. * * 5) Complete the requeue state so the waiter can make progress. After * this point the waiter task can return from the syscall immediately in * case that the pi state does not have to be fixed up. * * 6) Wake the waiter task. * * Must be called with both q->lock_ptr and hb->lock held. */ static inline void requeue_pi_wake_futex(struct futex_q *q, union futex_key *key, struct futex_hash_bucket *hb) { q->key = *key; __futex_unqueue(q); WARN_ON(!q->rt_waiter); q->rt_waiter = NULL; q->lock_ptr = &hb->lock; /* Signal locked state to the waiter */ futex_requeue_pi_complete(q, 1); wake_up_state(q->task, TASK_NORMAL); } /** * futex_proxy_trylock_atomic() - Attempt an atomic lock for the top waiter * @pifutex: the user address of the to futex * @hb1: the from futex hash bucket, must be locked by the caller * @hb2: the to futex hash bucket, must be locked by the caller * @key1: the from futex key * @key2: the to futex key * @ps: address to store the pi_state pointer * @exiting: Pointer to store the task pointer of the owner task * which is in the middle of exiting * @set_waiters: force setting the FUTEX_WAITERS bit (1) or not (0) * * Try and get the lock on behalf of the top waiter if we can do it atomically. * Wake the top waiter if we succeed. If the caller specified set_waiters, * then direct futex_lock_pi_atomic() to force setting the FUTEX_WAITERS bit. * hb1 and hb2 must be held by the caller. * * @exiting is only set when the return value is -EBUSY. If so, this holds * a refcount on the exiting task on return and the caller needs to drop it * after waiting for the exit to complete. * * Return: * - 0 - failed to acquire the lock atomically; * - >0 - acquired the lock, return value is vpid of the top_waiter * - <0 - error */ static int futex_proxy_trylock_atomic(u32 __user *pifutex, struct futex_hash_bucket *hb1, struct futex_hash_bucket *hb2, union futex_key *key1, union futex_key *key2, struct futex_pi_state **ps, struct task_struct **exiting, int set_waiters) { struct futex_q *top_waiter; u32 curval; int ret; if (futex_get_value_locked(&curval, pifutex)) return -EFAULT; if (unlikely(should_fail_futex(true))) return -EFAULT; /* * Find the top_waiter and determine if there are additional waiters. * If the caller intends to requeue more than 1 waiter to pifutex, * force futex_lock_pi_atomic() to set the FUTEX_WAITERS bit now, * as we have means to handle the possible fault. If not, don't set * the bit unnecessarily as it will force the subsequent unlock to enter * the kernel. */ top_waiter = futex_top_waiter(hb1, key1); /* There are no waiters, nothing for us to do. */ if (!top_waiter) return 0; /* * Ensure that this is a waiter sitting in futex_wait_requeue_pi() * and waiting on the 'waitqueue' futex which is always !PI. */ if (!top_waiter->rt_waiter || top_waiter->pi_state) return -EINVAL; /* Ensure we requeue to the expected futex. */ if (!futex_match(top_waiter->requeue_pi_key, key2)) return -EINVAL; /* Ensure that this does not race against an early wakeup */ if (!futex_requeue_pi_prepare(top_waiter, NULL)) return -EAGAIN; /* * Try to take the lock for top_waiter and set the FUTEX_WAITERS bit * in the contended case or if @set_waiters is true. * * In the contended case PI state is attached to the lock owner. If * the user space lock can be acquired then PI state is attached to * the new owner (@top_waiter->task) when @set_waiters is true. */ ret = futex_lock_pi_atomic(pifutex, hb2, key2, ps, top_waiter->task, exiting, set_waiters); if (ret == 1) { /* * Lock was acquired in user space and PI state was * attached to @top_waiter->task. That means state is fully * consistent and the waiter can return to user space * immediately after the wakeup. */ requeue_pi_wake_futex(top_waiter, key2, hb2); } else if (ret < 0) { /* Rewind top_waiter::requeue_state */ futex_requeue_pi_complete(top_waiter, ret); } else { /* * futex_lock_pi_atomic() did not acquire the user space * futex, but managed to establish the proxy lock and pi * state. top_waiter::requeue_state cannot be fixed up here * because the waiter is not enqueued on the rtmutex * yet. This is handled at the callsite depending on the * result of rt_mutex_start_proxy_lock() which is * guaranteed to be reached with this function returning 0. */ } return ret; } /** * futex_requeue() - Requeue waiters from uaddr1 to uaddr2 * @uaddr1: source futex user address * @flags1: futex flags (FLAGS_SHARED, etc.) * @uaddr2: target futex user address * @flags2: futex flags (FLAGS_SHARED, etc.) * @nr_wake: number of waiters to wake (must be 1 for requeue_pi) * @nr_requeue: number of waiters to requeue (0-INT_MAX) * @cmpval: @uaddr1 expected value (or %NULL) * @requeue_pi: if we are attempting to requeue from a non-pi futex to a * pi futex (pi to pi requeue is not supported) * * Requeue waiters on uaddr1 to uaddr2. In the requeue_pi case, try to acquire * uaddr2 atomically on behalf of the top waiter. * * Return: * - >=0 - on success, the number of tasks requeued or woken; * - <0 - on error */ int futex_requeue(u32 __user *uaddr1, unsigned int flags1, u32 __user *uaddr2, unsigned int flags2, int nr_wake, int nr_requeue, u32 *cmpval, int requeue_pi) { union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT; int task_count = 0, ret; struct futex_pi_state *pi_state = NULL; struct futex_hash_bucket *hb1, *hb2; struct futex_q *this, *next; DEFINE_WAKE_Q(wake_q); if (nr_wake < 0 || nr_requeue < 0) return -EINVAL; /* * When PI not supported: return -ENOSYS if requeue_pi is true, * consequently the compiler knows requeue_pi is always false past * this point which will optimize away all the conditional code * further down. */ if (!IS_ENABLED(CONFIG_FUTEX_PI) && requeue_pi) return -ENOSYS; if (requeue_pi) { /* * Requeue PI only works on two distinct uaddrs. This * check is only valid for private futexes. See below. */ if (uaddr1 == uaddr2) return -EINVAL; /* * futex_requeue() allows the caller to define the number * of waiters to wake up via the @nr_wake argument. With * REQUEUE_PI, waking up more than one waiter is creating * more problems than it solves. Waking up a waiter makes * only sense if the PI futex @uaddr2 is uncontended as * this allows the requeue code to acquire the futex * @uaddr2 before waking the waiter. The waiter can then * return to user space without further action. A secondary * wakeup would just make the futex_wait_requeue_pi() * handling more complex, because that code would have to * look up pi_state and do more or less all the handling * which the requeue code has to do for the to be requeued * waiters. So restrict the number of waiters to wake to * one, and only wake it up when the PI futex is * uncontended. Otherwise requeue it and let the unlock of * the PI futex handle the wakeup. * * All REQUEUE_PI users, e.g. pthread_cond_signal() and * pthread_cond_broadcast() must use nr_wake=1. */ if (nr_wake != 1) return -EINVAL; /* * requeue_pi requires a pi_state, try to allocate it now * without any locks in case it fails. */ if (refill_pi_state_cache()) return -ENOMEM; } retry: ret = get_futex_key(uaddr1, flags1, &key1, FUTEX_READ); if (unlikely(ret != 0)) return ret; ret = get_futex_key(uaddr2, flags2, &key2, requeue_pi ? FUTEX_WRITE : FUTEX_READ); if (unlikely(ret != 0)) return ret; /* * The check above which compares uaddrs is not sufficient for * shared futexes. We need to compare the keys: */ if (requeue_pi && futex_match(&key1, &key2)) return -EINVAL; hb1 = futex_hash(&key1); hb2 = futex_hash(&key2); retry_private: futex_hb_waiters_inc(hb2); double_lock_hb(hb1, hb2); if (likely(cmpval != NULL)) { u32 curval; ret = futex_get_value_locked(&curval, uaddr1); if (unlikely(ret)) { double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); ret = get_user(curval, uaddr1); if (ret) return ret; if (!(flags1 & FLAGS_SHARED)) goto retry_private; goto retry; } if (curval != *cmpval) { ret = -EAGAIN; goto out_unlock; } } if (requeue_pi) { struct task_struct *exiting = NULL; /* * Attempt to acquire uaddr2 and wake the top waiter. If we * intend to requeue waiters, force setting the FUTEX_WAITERS * bit. We force this here where we are able to easily handle * faults rather in the requeue loop below. * * Updates topwaiter::requeue_state if a top waiter exists. */ ret = futex_proxy_trylock_atomic(uaddr2, hb1, hb2, &key1, &key2, &pi_state, &exiting, nr_requeue); /* * At this point the top_waiter has either taken uaddr2 or * is waiting on it. In both cases pi_state has been * established and an initial refcount on it. In case of an * error there's nothing. * * The top waiter's requeue_state is up to date: * * - If the lock was acquired atomically (ret == 1), then * the state is Q_REQUEUE_PI_LOCKED. * * The top waiter has been dequeued and woken up and can * return to user space immediately. The kernel/user * space state is consistent. In case that there must be * more waiters requeued the WAITERS bit in the user * space futex is set so the top waiter task has to go * into the syscall slowpath to unlock the futex. This * will block until this requeue operation has been * completed and the hash bucket locks have been * dropped. * * - If the trylock failed with an error (ret < 0) then * the state is either Q_REQUEUE_PI_NONE, i.e. "nothing * happened", or Q_REQUEUE_PI_IGNORE when there was an * interleaved early wakeup. * * - If the trylock did not succeed (ret == 0) then the * state is either Q_REQUEUE_PI_IN_PROGRESS or * Q_REQUEUE_PI_WAIT if an early wakeup interleaved. * This will be cleaned up in the loop below, which * cannot fail because futex_proxy_trylock_atomic() did * the same sanity checks for requeue_pi as the loop * below does. */ switch (ret) { case 0: /* We hold a reference on the pi state. */ break; case 1: /* * futex_proxy_trylock_atomic() acquired the user space * futex. Adjust task_count. */ task_count++; ret = 0; break; /* * If the above failed, then pi_state is NULL and * waiter::requeue_state is correct. */ case -EFAULT: double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); ret = fault_in_user_writeable(uaddr2); if (!ret) goto retry; return ret; case -EBUSY: case -EAGAIN: /* * Two reasons for this: * - EBUSY: Owner is exiting and we just wait for the * exit to complete. * - EAGAIN: The user space value changed. */ double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); /* * Handle the case where the owner is in the middle of * exiting. Wait for the exit to complete otherwise * this task might loop forever, aka. live lock. */ wait_for_owner_exiting(ret, exiting); cond_resched(); goto retry; default: goto out_unlock; } } plist_for_each_entry_safe(this, next, &hb1->chain, list) { if (task_count - nr_wake >= nr_requeue) break; if (!futex_match(&this->key, &key1)) continue; /* * FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI should always * be paired with each other and no other futex ops. * * We should never be requeueing a futex_q with a pi_state, * which is awaiting a futex_unlock_pi(). */ if ((requeue_pi && !this->rt_waiter) || (!requeue_pi && this->rt_waiter) || this->pi_state) { ret = -EINVAL; break; } /* Plain futexes just wake or requeue and are done */ if (!requeue_pi) { if (++task_count <= nr_wake) this->wake(&wake_q, this); else requeue_futex(this, hb1, hb2, &key2); continue; } /* Ensure we requeue to the expected futex for requeue_pi. */ if (!futex_match(this->requeue_pi_key, &key2)) { ret = -EINVAL; break; } /* * Requeue nr_requeue waiters and possibly one more in the case * of requeue_pi if we couldn't acquire the lock atomically. * * Prepare the waiter to take the rt_mutex. Take a refcount * on the pi_state and store the pointer in the futex_q * object of the waiter. */ get_pi_state(pi_state); /* Don't requeue when the waiter is already on the way out. */ if (!futex_requeue_pi_prepare(this, pi_state)) { /* * Early woken waiter signaled that it is on the * way out. Drop the pi_state reference and try the * next waiter. @this->pi_state is still NULL. */ put_pi_state(pi_state); continue; } ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex, this->rt_waiter, this->task); if (ret == 1) { /* * We got the lock. We do neither drop the refcount * on pi_state nor clear this->pi_state because the * waiter needs the pi_state for cleaning up the * user space value. It will drop the refcount * after doing so. this::requeue_state is updated * in the wakeup as well. */ requeue_pi_wake_futex(this, &key2, hb2); task_count++; } else if (!ret) { /* Waiter is queued, move it to hb2 */ requeue_futex(this, hb1, hb2, &key2); futex_requeue_pi_complete(this, 0); task_count++; } else { /* * rt_mutex_start_proxy_lock() detected a potential * deadlock when we tried to queue that waiter. * Drop the pi_state reference which we took above * and remove the pointer to the state from the * waiters futex_q object. */ this->pi_state = NULL; put_pi_state(pi_state); futex_requeue_pi_complete(this, ret); /* * We stop queueing more waiters and let user space * deal with the mess. */ break; } } /* * We took an extra initial reference to the pi_state in * futex_proxy_trylock_atomic(). We need to drop it here again. */ put_pi_state(pi_state); out_unlock: double_unlock_hb(hb1, hb2); wake_up_q(&wake_q); futex_hb_waiters_dec(hb2); return ret ? ret : task_count; } /** * handle_early_requeue_pi_wakeup() - Handle early wakeup on the initial futex * @hb: the hash_bucket futex_q was original enqueued on * @q: the futex_q woken while waiting to be requeued * @timeout: the timeout associated with the wait (NULL if none) * * Determine the cause for the early wakeup. * * Return: * -EWOULDBLOCK or -ETIMEDOUT or -ERESTARTNOINTR */ static inline int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb, struct futex_q *q, struct hrtimer_sleeper *timeout) { int ret; /* * With the hb lock held, we avoid races while we process the wakeup. * We only need to hold hb (and not hb2) to ensure atomicity as the * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb. * It can't be requeued from uaddr2 to something else since we don't * support a PI aware source futex for requeue. */ WARN_ON_ONCE(&hb->lock != q->lock_ptr); /* * We were woken prior to requeue by a timeout or a signal. * Unqueue the futex_q and determine which it was. */ plist_del(&q->list, &hb->chain); futex_hb_waiters_dec(hb); /* Handle spurious wakeups gracefully */ ret = -EWOULDBLOCK; if (timeout && !timeout->task) ret = -ETIMEDOUT; else if (signal_pending(current)) ret = -ERESTARTNOINTR; return ret; } /** * futex_wait_requeue_pi() - Wait on uaddr and take uaddr2 * @uaddr: the futex we initially wait on (non-pi) * @flags: futex flags (FLAGS_SHARED, FLAGS_CLOCKRT, etc.), they must be * the same type, no requeueing from private to shared, etc. * @val: the expected value of uaddr * @abs_time: absolute timeout * @bitset: 32 bit wakeup bitset set by userspace, defaults to all * @uaddr2: the pi futex we will take prior to returning to user-space * * The caller will wait on uaddr and will be requeued by futex_requeue() to * uaddr2 which must be PI aware and unique from uaddr. Normal wakeup will wake * on uaddr2 and complete the acquisition of the rt_mutex prior to returning to * userspace. This ensures the rt_mutex maintains an owner when it has waiters; * without one, the pi logic would not know which task to boost/deboost, if * there was a need to. * * We call schedule in futex_wait_queue() when we enqueue and return there * via the following-- * 1) wakeup on uaddr2 after an atomic lock acquisition by futex_requeue() * 2) wakeup on uaddr2 after a requeue * 3) signal * 4) timeout * * If 3, cleanup and return -ERESTARTNOINTR. * * If 2, we may then block on trying to take the rt_mutex and return via: * 5) successful lock * 6) signal * 7) timeout * 8) other lock acquisition failure * * If 6, return -EWOULDBLOCK (restarting the syscall would do the same). * * If 4 or 7, we cleanup and return with -ETIMEDOUT. * * Return: * - 0 - On success; * - <0 - On error */ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, u32 val, ktime_t *abs_time, u32 bitset, u32 __user *uaddr2) { struct hrtimer_sleeper timeout, *to; struct rt_mutex_waiter rt_waiter; struct futex_hash_bucket *hb; union futex_key key2 = FUTEX_KEY_INIT; struct futex_q q = futex_q_init; struct rt_mutex_base *pi_mutex; int res, ret; if (!IS_ENABLED(CONFIG_FUTEX_PI)) return -ENOSYS; if (uaddr == uaddr2) return -EINVAL; if (!bitset) return -EINVAL; to = futex_setup_timer(abs_time, &timeout, flags, current->timer_slack_ns); /* * The waiter is allocated on our stack, manipulated by the requeue * code while we sleep on uaddr. */ rt_mutex_init_waiter(&rt_waiter); ret = get_futex_key(uaddr2, flags, &key2, FUTEX_WRITE); if (unlikely(ret != 0)) goto out; q.bitset = bitset; q.rt_waiter = &rt_waiter; q.requeue_pi_key = &key2; /* * Prepare to wait on uaddr. On success, it holds hb->lock and q * is initialized. */ ret = futex_wait_setup(uaddr, val, flags, &q, &hb); if (ret) goto out; /* * The check above which compares uaddrs is not sufficient for * shared futexes. We need to compare the keys: */ if (futex_match(&q.key, &key2)) { futex_q_unlock(hb); ret = -EINVAL; goto out; } /* Queue the futex_q, drop the hb lock, wait for wakeup. */ futex_wait_queue(hb, &q, to); switch (futex_requeue_pi_wakeup_sync(&q)) { case Q_REQUEUE_PI_IGNORE: /* The waiter is still on uaddr1 */ spin_lock(&hb->lock); ret = handle_early_requeue_pi_wakeup(hb, &q, to); spin_unlock(&hb->lock); break; case Q_REQUEUE_PI_LOCKED: /* The requeue acquired the lock */ if (q.pi_state && (q.pi_state->owner != current)) { spin_lock(q.lock_ptr); ret = fixup_pi_owner(uaddr2, &q, true); /* * Drop the reference to the pi state which the * requeue_pi() code acquired for us. */ put_pi_state(q.pi_state); spin_unlock(q.lock_ptr); /* * Adjust the return value. It's either -EFAULT or * success (1) but the caller expects 0 for success. */ ret = ret < 0 ? ret : 0; } break; case Q_REQUEUE_PI_DONE: /* Requeue completed. Current is 'pi_blocked_on' the rtmutex */ pi_mutex = &q.pi_state->pi_mutex; ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter); /* * See futex_unlock_pi()'s cleanup: comment. */ if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter)) ret = 0; spin_lock(q.lock_ptr); debug_rt_mutex_free_waiter(&rt_waiter); /* * Fixup the pi_state owner and possibly acquire the lock if we * haven't already. */ res = fixup_pi_owner(uaddr2, &q, !ret); /* * If fixup_pi_owner() returned an error, propagate that. If it * acquired the lock, clear -ETIMEDOUT or -EINTR. */ if (res) ret = (res < 0) ? res : 0; futex_unqueue_pi(&q); spin_unlock(q.lock_ptr); if (ret == -EINTR) { /* * We've already been requeued, but cannot restart * by calling futex_lock_pi() directly. We could * restart this syscall, but it would detect that * the user space "val" changed and return * -EWOULDBLOCK. Save the overhead of the restart * and return -EWOULDBLOCK directly. */ ret = -EWOULDBLOCK; } break; default: BUG(); } out: if (to) { hrtimer_cancel(&to->timer); destroy_hrtimer_on_stack(&to->timer); } return ret; } |
51 11 51 51 1 1 1 3 5 5 1 3 3 5 50 13 10 51 50 51 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 | // SPDX-License-Identifier: GPL-2.0 #include <linux/proc_fs.h> #include <linux/nsproxy.h> #include <linux/ptrace.h> #include <linux/namei.h> #include <linux/file.h> #include <linux/utsname.h> #include <net/net_namespace.h> #include <linux/ipc_namespace.h> #include <linux/pid_namespace.h> #include <linux/user_namespace.h> #include "internal.h" static const struct proc_ns_operations *ns_entries[] = { #ifdef CONFIG_NET_NS &netns_operations, #endif #ifdef CONFIG_UTS_NS &utsns_operations, #endif #ifdef CONFIG_IPC_NS &ipcns_operations, #endif #ifdef CONFIG_PID_NS &pidns_operations, &pidns_for_children_operations, #endif #ifdef CONFIG_USER_NS &userns_operations, #endif &mntns_operations, #ifdef CONFIG_CGROUPS &cgroupns_operations, #endif #ifdef CONFIG_TIME_NS &timens_operations, &timens_for_children_operations, #endif }; static const char *proc_ns_get_link(struct dentry *dentry, struct inode *inode, struct delayed_call *done) { const struct proc_ns_operations *ns_ops = PROC_I(inode)->ns_ops; struct task_struct *task; struct path ns_path; int error = -EACCES; if (!dentry) return ERR_PTR(-ECHILD); task = get_proc_task(inode); if (!task) return ERR_PTR(-EACCES); if (!ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)) goto out; error = ns_get_path(&ns_path, task, ns_ops); if (error) goto out; error = nd_jump_link(&ns_path); out: put_task_struct(task); return ERR_PTR(error); } static int proc_ns_readlink(struct dentry *dentry, char __user *buffer, int buflen) { struct inode *inode = d_inode(dentry); const struct proc_ns_operations *ns_ops = PROC_I(inode)->ns_ops; struct task_struct *task; char name[50]; int res = -EACCES; task = get_proc_task(inode); if (!task) return res; if (ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS)) { res = ns_get_name(name, sizeof(name), task, ns_ops); if (res >= 0) res = readlink_copy(buffer, buflen, name); } put_task_struct(task); return res; } static const struct inode_operations proc_ns_link_inode_operations = { .readlink = proc_ns_readlink, .get_link = proc_ns_get_link, .setattr = proc_setattr, }; static struct dentry *proc_ns_instantiate(struct dentry *dentry, struct task_struct *task, const void *ptr) { const struct proc_ns_operations *ns_ops = ptr; struct inode *inode; struct proc_inode *ei; inode = proc_pid_make_inode(dentry->d_sb, task, S_IFLNK | S_IRWXUGO); if (!inode) return ERR_PTR(-ENOENT); ei = PROC_I(inode); inode->i_op = &proc_ns_link_inode_operations; ei->ns_ops = ns_ops; pid_update_inode(task, inode); d_set_d_op(dentry, &pid_dentry_operations); return d_splice_alias(inode, dentry); } static int proc_ns_dir_readdir(struct file *file, struct dir_context *ctx) { struct task_struct *task = get_proc_task(file_inode(file)); const struct proc_ns_operations **entry, **last; if (!task) return -ENOENT; if (!dir_emit_dots(file, ctx)) goto out; if (ctx->pos >= 2 + ARRAY_SIZE(ns_entries)) goto out; entry = ns_entries + (ctx->pos - 2); last = &ns_entries[ARRAY_SIZE(ns_entries) - 1]; while (entry <= last) { const struct proc_ns_operations *ops = *entry; if (!proc_fill_cache(file, ctx, ops->name, strlen(ops->name), proc_ns_instantiate, task, ops)) break; ctx->pos++; entry++; } out: put_task_struct(task); return 0; } const struct file_operations proc_ns_dir_operations = { .read = generic_read_dir, .iterate_shared = proc_ns_dir_readdir, .llseek = generic_file_llseek, }; static struct dentry *proc_ns_dir_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { struct task_struct *task = get_proc_task(dir); const struct proc_ns_operations **entry, **last; unsigned int len = dentry->d_name.len; struct dentry *res = ERR_PTR(-ENOENT); if (!task) goto out_no_task; last = &ns_entries[ARRAY_SIZE(ns_entries)]; for (entry = ns_entries; entry < last; entry++) { if (strlen((*entry)->name) != len) continue; if (!memcmp(dentry->d_name.name, (*entry)->name, len)) break; } if (entry == last) goto out; res = proc_ns_instantiate(dentry, task, *entry); out: put_task_struct(task); out_no_task: return res; } const struct inode_operations proc_ns_dir_inode_operations = { .lookup = proc_ns_dir_lookup, .getattr = pid_getattr, .setattr = proc_setattr, }; |
3338 3329 3326 3332 3327 3344 3113 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 | // SPDX-License-Identifier: GPL-2.0 /* * SHA1 routine optimized to do word accesses rather than byte accesses, * and to avoid unnecessary copies into the context array. * * This was based on the git SHA1 implementation. */ #include <linux/kernel.h> #include <linux/export.h> #include <linux/module.h> #include <linux/bitops.h> #include <linux/string.h> #include <crypto/sha1.h> #include <asm/unaligned.h> /* * If you have 32 registers or more, the compiler can (and should) * try to change the array[] accesses into registers. However, on * machines with less than ~25 registers, that won't really work, * and at least gcc will make an unholy mess of it. * * So to avoid that mess which just slows things down, we force * the stores to memory to actually happen (we might be better off * with a 'W(t)=(val);asm("":"+m" (W(t))' there instead, as * suggested by Artur Skawina - that will also make gcc unable to * try to do the silly "optimize away loads" part because it won't * see what the value will be). * * Ben Herrenschmidt reports that on PPC, the C version comes close * to the optimized asm with this (ie on PPC you don't want that * 'volatile', since there are lots of registers). * * On ARM we get the best code generation by forcing a full memory barrier * between each SHA_ROUND, otherwise gcc happily get wild with spilling and * the stack frame size simply explode and performance goes down the drain. */ #ifdef CONFIG_X86 #define setW(x, val) (*(volatile __u32 *)&W(x) = (val)) #elif defined(CONFIG_ARM) #define setW(x, val) do { W(x) = (val); __asm__("":::"memory"); } while (0) #else #define setW(x, val) (W(x) = (val)) #endif /* This "rolls" over the 512-bit array */ #define W(x) (array[(x)&15]) /* * Where do we get the source from? The first 16 iterations get it from * the input data, the next mix it from the 512-bit array. */ #define SHA_SRC(t) get_unaligned_be32((__u32 *)data + t) #define SHA_MIX(t) rol32(W(t+13) ^ W(t+8) ^ W(t+2) ^ W(t), 1) #define SHA_ROUND(t, input, fn, constant, A, B, C, D, E) do { \ __u32 TEMP = input(t); setW(t, TEMP); \ E += TEMP + rol32(A,5) + (fn) + (constant); \ B = ror32(B, 2); \ TEMP = E; E = D; D = C; C = B; B = A; A = TEMP; } while (0) #define T_0_15(t, A, B, C, D, E) SHA_ROUND(t, SHA_SRC, (((C^D)&B)^D) , 0x5a827999, A, B, C, D, E ) #define T_16_19(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, (((C^D)&B)^D) , 0x5a827999, A, B, C, D, E ) #define T_20_39(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, (B^C^D) , 0x6ed9eba1, A, B, C, D, E ) #define T_40_59(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, ((B&C)+(D&(B^C))) , 0x8f1bbcdc, A, B, C, D, E ) #define T_60_79(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, (B^C^D) , 0xca62c1d6, A, B, C, D, E ) /** * sha1_transform - single block SHA1 transform (deprecated) * * @digest: 160 bit digest to update * @data: 512 bits of data to hash * @array: 16 words of workspace (see note) * * This function executes SHA-1's internal compression function. It updates the * 160-bit internal state (@digest) with a single 512-bit data block (@data). * * Don't use this function. SHA-1 is no longer considered secure. And even if * you do have to use SHA-1, this isn't the correct way to hash something with * SHA-1 as this doesn't handle padding and finalization. * * Note: If the hash is security sensitive, the caller should be sure * to clear the workspace. This is left to the caller to avoid * unnecessary clears between chained hashing operations. */ void sha1_transform(__u32 *digest, const char *data, __u32 *array) { __u32 A, B, C, D, E; unsigned int i = 0; A = digest[0]; B = digest[1]; C = digest[2]; D = digest[3]; E = digest[4]; /* Round 1 - iterations 0-16 take their input from 'data' */ for (; i < 16; ++i) T_0_15(i, A, B, C, D, E); /* Round 1 - tail. Input from 512-bit mixing array */ for (; i < 20; ++i) T_16_19(i, A, B, C, D, E); /* Round 2 */ for (; i < 40; ++i) T_20_39(i, A, B, C, D, E); /* Round 3 */ for (; i < 60; ++i) T_40_59(i, A, B, C, D, E); /* Round 4 */ for (; i < 80; ++i) T_60_79(i, A, B, C, D, E); digest[0] += A; digest[1] += B; digest[2] += C; digest[3] += D; digest[4] += E; } EXPORT_SYMBOL(sha1_transform); /** * sha1_init - initialize the vectors for a SHA1 digest * @buf: vector to initialize */ void sha1_init(__u32 *buf) { buf[0] = 0x67452301; buf[1] = 0xefcdab89; buf[2] = 0x98badcfe; buf[3] = 0x10325476; buf[4] = 0xc3d2e1f0; } EXPORT_SYMBOL(sha1_init); MODULE_LICENSE("GPL"); |
1 1 2 2 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Virtual NCI device simulation driver * * Copyright (C) 2020 Samsung Electrnoics * Bongsu Jeon <bongsu.jeon@samsung.com> */ #include <linux/kernel.h> #include <linux/module.h> #include <linux/miscdevice.h> #include <linux/mutex.h> #include <linux/wait.h> #include <net/nfc/nci_core.h> #define IOCTL_GET_NCIDEV_IDX 0 #define VIRTUAL_NFC_PROTOCOLS (NFC_PROTO_JEWEL_MASK | \ NFC_PROTO_MIFARE_MASK | \ NFC_PROTO_FELICA_MASK | \ NFC_PROTO_ISO14443_MASK | \ NFC_PROTO_ISO14443_B_MASK | \ NFC_PROTO_ISO15693_MASK) struct virtual_nci_dev { struct nci_dev *ndev; struct mutex mtx; struct sk_buff *send_buff; struct wait_queue_head wq; bool running; }; static int virtual_nci_open(struct nci_dev *ndev) { struct virtual_nci_dev *vdev = nci_get_drvdata(ndev); vdev->running = true; return 0; } static int virtual_nci_close(struct nci_dev *ndev) { struct virtual_nci_dev *vdev = nci_get_drvdata(ndev); mutex_lock(&vdev->mtx); kfree_skb(vdev->send_buff); vdev->send_buff = NULL; vdev->running = false; mutex_unlock(&vdev->mtx); return 0; } static int virtual_nci_send(struct nci_dev *ndev, struct sk_buff *skb) { struct virtual_nci_dev *vdev = nci_get_drvdata(ndev); mutex_lock(&vdev->mtx); if (vdev->send_buff || !vdev->running) { mutex_unlock(&vdev->mtx); kfree_skb(skb); return -1; } vdev->send_buff = skb_copy(skb, GFP_KERNEL); if (!vdev->send_buff) { mutex_unlock(&vdev->mtx); kfree_skb(skb); return -1; } mutex_unlock(&vdev->mtx); wake_up_interruptible(&vdev->wq); consume_skb(skb); return 0; } static const struct nci_ops virtual_nci_ops = { .open = virtual_nci_open, .close = virtual_nci_close, .send = virtual_nci_send }; static ssize_t virtual_ncidev_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { struct virtual_nci_dev *vdev = file->private_data; size_t actual_len; mutex_lock(&vdev->mtx); while (!vdev->send_buff) { mutex_unlock(&vdev->mtx); if (wait_event_interruptible(vdev->wq, vdev->send_buff)) return -EFAULT; mutex_lock(&vdev->mtx); } actual_len = min_t(size_t, count, vdev->send_buff->len); if (copy_to_user(buf, vdev->send_buff->data, actual_len)) { mutex_unlock(&vdev->mtx); return -EFAULT; } skb_pull(vdev->send_buff, actual_len); if (vdev->send_buff->len == 0) { consume_skb(vdev->send_buff); vdev->send_buff = NULL; } mutex_unlock(&vdev->mtx); return actual_len; } static ssize_t virtual_ncidev_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { struct virtual_nci_dev *vdev = file->private_data; struct sk_buff *skb; skb = alloc_skb(count, GFP_KERNEL); if (!skb) return -ENOMEM; if (copy_from_user(skb_put(skb, count), buf, count)) { kfree_skb(skb); return -EFAULT; } nci_recv_frame(vdev->ndev, skb); return count; } static int virtual_ncidev_open(struct inode *inode, struct file *file) { int ret = 0; struct virtual_nci_dev *vdev; vdev = kzalloc(sizeof(*vdev), GFP_KERNEL); if (!vdev) return -ENOMEM; vdev->ndev = nci_allocate_device(&virtual_nci_ops, VIRTUAL_NFC_PROTOCOLS, 0, 0); if (!vdev->ndev) { kfree(vdev); return -ENOMEM; } mutex_init(&vdev->mtx); init_waitqueue_head(&vdev->wq); file->private_data = vdev; nci_set_drvdata(vdev->ndev, vdev); ret = nci_register_device(vdev->ndev); if (ret < 0) { nci_free_device(vdev->ndev); mutex_destroy(&vdev->mtx); kfree(vdev); return ret; } return 0; } static int virtual_ncidev_close(struct inode *inode, struct file *file) { struct virtual_nci_dev *vdev = file->private_data; nci_unregister_device(vdev->ndev); nci_free_device(vdev->ndev); mutex_destroy(&vdev->mtx); kfree(vdev); return 0; } static long virtual_ncidev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct virtual_nci_dev *vdev = file->private_data; const struct nfc_dev *nfc_dev = vdev->ndev->nfc_dev; void __user *p = (void __user *)arg; if (cmd != IOCTL_GET_NCIDEV_IDX) return -ENOTTY; if (copy_to_user(p, &nfc_dev->idx, sizeof(nfc_dev->idx))) return -EFAULT; return 0; } static const struct file_operations virtual_ncidev_fops = { .owner = THIS_MODULE, .read = virtual_ncidev_read, .write = virtual_ncidev_write, .open = virtual_ncidev_open, .release = virtual_ncidev_close, .unlocked_ioctl = virtual_ncidev_ioctl }; static struct miscdevice miscdev = { .minor = MISC_DYNAMIC_MINOR, .name = "virtual_nci", .fops = &virtual_ncidev_fops, .mode = 0600, }; module_misc_device(miscdev); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("Virtual NCI device simulation driver"); MODULE_AUTHOR("Bongsu Jeon <bongsu.jeon@samsung.com>"); |
11 12 12 12 12 11 12 12 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 | // SPDX-License-Identifier: GPL-2.0-or-later /* Request key authorisation token key definition. * * Copyright (C) 2005 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) * * See Documentation/security/keys/request-key.rst */ #include <linux/sched.h> #include <linux/err.h> #include <linux/seq_file.h> #include <linux/slab.h> #include <linux/uaccess.h> #include "internal.h" #include <keys/request_key_auth-type.h> static int request_key_auth_preparse(struct key_preparsed_payload *); static void request_key_auth_free_preparse(struct key_preparsed_payload *); static int request_key_auth_instantiate(struct key *, struct key_preparsed_payload *); static void request_key_auth_describe(const struct key *, struct seq_file *); static void request_key_auth_revoke(struct key *); static void request_key_auth_destroy(struct key *); static long request_key_auth_read(const struct key *, char *, size_t); /* * The request-key authorisation key type definition. */ struct key_type key_type_request_key_auth = { .name = ".request_key_auth", .def_datalen = sizeof(struct request_key_auth), .preparse = request_key_auth_preparse, .free_preparse = request_key_auth_free_preparse, .instantiate = request_key_auth_instantiate, .describe = request_key_auth_describe, .revoke = request_key_auth_revoke, .destroy = request_key_auth_destroy, .read = request_key_auth_read, }; static int request_key_auth_preparse(struct key_preparsed_payload *prep) { return 0; } static void request_key_auth_free_preparse(struct key_preparsed_payload *prep) { } /* * Instantiate a request-key authorisation key. */ static int request_key_auth_instantiate(struct key *key, struct key_preparsed_payload *prep) { rcu_assign_keypointer(key, (struct request_key_auth *)prep->data); return 0; } /* * Describe an authorisation token. */ static void request_key_auth_describe(const struct key *key, struct seq_file *m) { struct request_key_auth *rka = dereference_key_rcu(key); if (!rka) return; seq_puts(m, "key:"); seq_puts(m, key->description); if (key_is_positive(key)) seq_printf(m, " pid:%d ci:%zu", rka->pid, rka->callout_len); } /* * Read the callout_info data (retrieves the callout information). * - the key's semaphore is read-locked */ static long request_key_auth_read(const struct key *key, char *buffer, size_t buflen) { struct request_key_auth *rka = dereference_key_locked(key); size_t datalen; long ret; if (!rka) return -EKEYREVOKED; datalen = rka->callout_len; ret = datalen; /* we can return the data as is */ if (buffer && buflen > 0) { if (buflen > datalen) buflen = datalen; memcpy(buffer, rka->callout_info, buflen); } return ret; } static void free_request_key_auth(struct request_key_auth *rka) { if (!rka) return; key_put(rka->target_key); key_put(rka->dest_keyring); if (rka->cred) put_cred(rka->cred); kfree(rka->callout_info); kfree(rka); } /* * Dispose of the request_key_auth record under RCU conditions */ static void request_key_auth_rcu_disposal(struct rcu_head *rcu) { struct request_key_auth *rka = container_of(rcu, struct request_key_auth, rcu); free_request_key_auth(rka); } /* * Handle revocation of an authorisation token key. * * Called with the key sem write-locked. */ static void request_key_auth_revoke(struct key *key) { struct request_key_auth *rka = dereference_key_locked(key); kenter("{%d}", key->serial); rcu_assign_keypointer(key, NULL); call_rcu(&rka->rcu, request_key_auth_rcu_disposal); } /* * Destroy an instantiation authorisation token key. */ static void request_key_auth_destroy(struct key *key) { struct request_key_auth *rka = rcu_access_pointer(key->payload.rcu_data0); kenter("{%d}", key->serial); if (rka) { rcu_assign_keypointer(key, NULL); call_rcu(&rka->rcu, request_key_auth_rcu_disposal); } } /* * Create an authorisation token for /sbin/request-key or whoever to gain * access to the caller's security data. */ struct key *request_key_auth_new(struct key *target, const char *op, const void *callout_info, size_t callout_len, struct key *dest_keyring) { struct request_key_auth *rka, *irka; const struct cred *cred = current_cred(); struct key *authkey = NULL; char desc[20]; int ret = -ENOMEM; kenter("%d,", target->serial); /* allocate a auth record */ rka = kzalloc(sizeof(*rka), GFP_KERNEL); if (!rka) goto error; rka->callout_info = kmemdup(callout_info, callout_len, GFP_KERNEL); if (!rka->callout_info) goto error_free_rka; rka->callout_len = callout_len; strscpy(rka->op, op, sizeof(rka->op)); /* see if the calling process is already servicing the key request of * another process */ if (cred->request_key_auth) { /* it is - use that instantiation context here too */ down_read(&cred->request_key_auth->sem); /* if the auth key has been revoked, then the key we're * servicing is already instantiated */ if (test_bit(KEY_FLAG_REVOKED, &cred->request_key_auth->flags)) { up_read(&cred->request_key_auth->sem); ret = -EKEYREVOKED; goto error_free_rka; } irka = cred->request_key_auth->payload.data[0]; rka->cred = get_cred(irka->cred); rka->pid = irka->pid; up_read(&cred->request_key_auth->sem); } else { /* it isn't - use this process as the context */ rka->cred = get_cred(cred); rka->pid = current->pid; } rka->target_key = key_get(target); rka->dest_keyring = key_get(dest_keyring); /* allocate the auth key */ sprintf(desc, "%x", target->serial); authkey = key_alloc(&key_type_request_key_auth, desc, cred->fsuid, cred->fsgid, cred, KEY_POS_VIEW | KEY_POS_READ | KEY_POS_SEARCH | KEY_POS_LINK | KEY_USR_VIEW, KEY_ALLOC_NOT_IN_QUOTA, NULL); if (IS_ERR(authkey)) { ret = PTR_ERR(authkey); goto error_free_rka; } /* construct the auth key */ ret = key_instantiate_and_link(authkey, rka, 0, NULL, NULL); if (ret < 0) goto error_put_authkey; kleave(" = {%d,%d}", authkey->serial, refcount_read(&authkey->usage)); return authkey; error_put_authkey: key_put(authkey); error_free_rka: free_request_key_auth(rka); error: kleave("= %d", ret); return ERR_PTR(ret); } /* * Search the current process's keyrings for the authorisation key for * instantiation of a key. */ struct key *key_get_instantiation_authkey(key_serial_t target_id) { char description[16]; struct keyring_search_context ctx = { .index_key.type = &key_type_request_key_auth, .index_key.description = description, .cred = current_cred(), .match_data.cmp = key_default_cmp, .match_data.raw_data = description, .match_data.lookup_type = KEYRING_SEARCH_LOOKUP_DIRECT, .flags = (KEYRING_SEARCH_DO_STATE_CHECK | KEYRING_SEARCH_RECURSE), }; struct key *authkey; key_ref_t authkey_ref; ctx.index_key.desc_len = sprintf(description, "%x", target_id); rcu_read_lock(); authkey_ref = search_process_keyrings_rcu(&ctx); rcu_read_unlock(); if (IS_ERR(authkey_ref)) { authkey = ERR_CAST(authkey_ref); if (authkey == ERR_PTR(-EAGAIN)) authkey = ERR_PTR(-ENOKEY); goto error; } authkey = key_ref_to_ptr(authkey_ref); if (test_bit(KEY_FLAG_REVOKED, &authkey->flags)) { key_put(authkey); authkey = ERR_PTR(-EKEYREVOKED); } error: return authkey; } |
8 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 | // SPDX-License-Identifier: GPL-2.0-only /* * OF helpers for network devices. * * Initially copied out of arch/powerpc/kernel/prom_parse.c */ #include <linux/etherdevice.h> #include <linux/kernel.h> #include <linux/of_net.h> #include <linux/of_platform.h> #include <linux/platform_device.h> #include <linux/phy.h> #include <linux/export.h> #include <linux/device.h> #include <linux/nvmem-consumer.h> /** * of_get_phy_mode - Get phy mode for given device_node * @np: Pointer to the given device_node * @interface: Pointer to the result * * The function gets phy interface string from property 'phy-mode' or * 'phy-connection-type'. The index in phy_modes table is set in * interface and 0 returned. In case of error interface is set to * PHY_INTERFACE_MODE_NA and an errno is returned, e.g. -ENODEV. */ int of_get_phy_mode(struct device_node *np, phy_interface_t *interface) { const char *pm; int err, i; *interface = PHY_INTERFACE_MODE_NA; err = of_property_read_string(np, "phy-mode", &pm); if (err < 0) err = of_property_read_string(np, "phy-connection-type", &pm); if (err < 0) return err; for (i = 0; i < PHY_INTERFACE_MODE_MAX; i++) if (!strcasecmp(pm, phy_modes(i))) { *interface = i; return 0; } return -ENODEV; } EXPORT_SYMBOL_GPL(of_get_phy_mode); static int of_get_mac_addr(struct device_node *np, const char *name, u8 *addr) { struct property *pp = of_find_property(np, name, NULL); if (pp && pp->length == ETH_ALEN && is_valid_ether_addr(pp->value)) { memcpy(addr, pp->value, ETH_ALEN); return 0; } return -ENODEV; } int of_get_mac_address_nvmem(struct device_node *np, u8 *addr) { struct platform_device *pdev = of_find_device_by_node(np); struct nvmem_cell *cell; const void *mac; size_t len; int ret; /* Try lookup by device first, there might be a nvmem_cell_lookup * associated with a given device. */ if (pdev) { ret = nvmem_get_mac_address(&pdev->dev, addr); put_device(&pdev->dev); return ret; } cell = of_nvmem_cell_get(np, "mac-address"); if (IS_ERR(cell)) return PTR_ERR(cell); mac = nvmem_cell_read(cell, &len); nvmem_cell_put(cell); if (IS_ERR(mac)) return PTR_ERR(mac); if (len != ETH_ALEN || !is_valid_ether_addr(mac)) { kfree(mac); return -EINVAL; } memcpy(addr, mac, ETH_ALEN); kfree(mac); return 0; } EXPORT_SYMBOL(of_get_mac_address_nvmem); /** * of_get_mac_address() * @np: Caller's Device Node * @addr: Pointer to a six-byte array for the result * * Search the device tree for the best MAC address to use. 'mac-address' is * checked first, because that is supposed to contain to "most recent" MAC * address. If that isn't set, then 'local-mac-address' is checked next, * because that is the default address. If that isn't set, then the obsolete * 'address' is checked, just in case we're using an old device tree. If any * of the above isn't set, then try to get MAC address from nvmem cell named * 'mac-address'. * * Note that the 'address' property is supposed to contain a virtual address of * the register set, but some DTS files have redefined that property to be the * MAC address. * * All-zero MAC addresses are rejected, because those could be properties that * exist in the device tree, but were not set by U-Boot. For example, the * DTS could define 'mac-address' and 'local-mac-address', with zero MAC * addresses. Some older U-Boots only initialized 'local-mac-address'. In * this case, the real MAC is in 'local-mac-address', and 'mac-address' exists * but is all zeros. * * Return: 0 on success and errno in case of error. */ int of_get_mac_address(struct device_node *np, u8 *addr) { int ret; if (!np) return -ENODEV; ret = of_get_mac_addr(np, "mac-address", addr); if (!ret) return 0; ret = of_get_mac_addr(np, "local-mac-address", addr); if (!ret) return 0; ret = of_get_mac_addr(np, "address", addr); if (!ret) return 0; return of_get_mac_address_nvmem(np, addr); } EXPORT_SYMBOL(of_get_mac_address); /** * of_get_ethdev_address() * @np: Caller's Device Node * @dev: Pointer to netdevice which address will be updated * * Search the device tree for the best MAC address to use. * If found set @dev->dev_addr to that address. * * See documentation of of_get_mac_address() for more information on how * the best address is determined. * * Return: 0 on success and errno in case of error. */ int of_get_ethdev_address(struct device_node *np, struct net_device *dev) { u8 addr[ETH_ALEN]; int ret; ret = of_get_mac_address(np, addr); if (!ret) eth_hw_addr_set(dev, addr); return ret; } EXPORT_SYMBOL(of_get_ethdev_address); |
2 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Clock domain and sample rate management functions */ #include <linux/bitops.h> #include <linux/init.h> #include <linux/string.h> #include <linux/usb.h> #include <linux/usb/audio.h> #include <linux/usb/audio-v2.h> #include <linux/usb/audio-v3.h> #include <sound/core.h> #include <sound/info.h> #include <sound/pcm.h> #include "usbaudio.h" #include "card.h" #include "helper.h" #include "clock.h" #include "quirks.h" union uac23_clock_source_desc { struct uac_clock_source_descriptor v2; struct uac3_clock_source_descriptor v3; }; union uac23_clock_selector_desc { struct uac_clock_selector_descriptor v2; struct uac3_clock_selector_descriptor v3; }; union uac23_clock_multiplier_desc { struct uac_clock_multiplier_descriptor v2; struct uac_clock_multiplier_descriptor v3; }; #define GET_VAL(p, proto, field) \ ((proto) == UAC_VERSION_3 ? (p)->v3.field : (p)->v2.field) static void *find_uac_clock_desc(struct usb_host_interface *iface, int id, bool (*validator)(void *, int, int), u8 type, int proto) { void *cs = NULL; while ((cs = snd_usb_find_csint_desc(iface->extra, iface->extralen, cs, type))) { if (validator(cs, id, proto)) return cs; } return NULL; } static bool validate_clock_source(void *p, int id, int proto) { union uac23_clock_source_desc *cs = p; return GET_VAL(cs, proto, bClockID) == id; } static bool validate_clock_selector(void *p, int id, int proto) { union uac23_clock_selector_desc *cs = p; return GET_VAL(cs, proto, bClockID) == id; } static bool validate_clock_multiplier(void *p, int id, int proto) { union uac23_clock_multiplier_desc *cs = p; return GET_VAL(cs, proto, bClockID) == id; } #define DEFINE_FIND_HELPER(name, obj, validator, type2, type3) \ static obj *name(struct snd_usb_audio *chip, int id, int proto) \ { \ return find_uac_clock_desc(chip->ctrl_intf, id, validator, \ proto == UAC_VERSION_3 ? (type3) : (type2), \ proto); \ } DEFINE_FIND_HELPER(snd_usb_find_clock_source, union uac23_clock_source_desc, validate_clock_source, UAC2_CLOCK_SOURCE, UAC3_CLOCK_SOURCE); DEFINE_FIND_HELPER(snd_usb_find_clock_selector, union uac23_clock_selector_desc, validate_clock_selector, UAC2_CLOCK_SELECTOR, UAC3_CLOCK_SELECTOR); DEFINE_FIND_HELPER(snd_usb_find_clock_multiplier, union uac23_clock_multiplier_desc, validate_clock_multiplier, UAC2_CLOCK_MULTIPLIER, UAC3_CLOCK_MULTIPLIER); static int uac_clock_selector_get_val(struct snd_usb_audio *chip, int selector_id) { unsigned char buf; int ret; ret = snd_usb_ctl_msg(chip->dev, usb_rcvctrlpipe(chip->dev, 0), UAC2_CS_CUR, USB_RECIP_INTERFACE | USB_TYPE_CLASS | USB_DIR_IN, UAC2_CX_CLOCK_SELECTOR << 8, snd_usb_ctrl_intf(chip) | (selector_id << 8), &buf, sizeof(buf)); if (ret < 0) return ret; return buf; } static int uac_clock_selector_set_val(struct snd_usb_audio *chip, int selector_id, unsigned char pin) { int ret; ret = snd_usb_ctl_msg(chip->dev, usb_sndctrlpipe(chip->dev, 0), UAC2_CS_CUR, USB_RECIP_INTERFACE | USB_TYPE_CLASS | USB_DIR_OUT, UAC2_CX_CLOCK_SELECTOR << 8, snd_usb_ctrl_intf(chip) | (selector_id << 8), &pin, sizeof(pin)); if (ret < 0) return ret; if (ret != sizeof(pin)) { usb_audio_err(chip, "setting selector (id %d) unexpected length %d\n", selector_id, ret); return -EINVAL; } ret = uac_clock_selector_get_val(chip, selector_id); if (ret < 0) return ret; if (ret != pin) { usb_audio_err(chip, "setting selector (id %d) to %x failed (current: %d)\n", selector_id, pin, ret); return -EINVAL; } return ret; } static bool uac_clock_source_is_valid_quirk(struct snd_usb_audio *chip, const struct audioformat *fmt, int source_id) { bool ret = false; int count; unsigned char data; struct usb_device *dev = chip->dev; union uac23_clock_source_desc *cs_desc; cs_desc = snd_usb_find_clock_source(chip, source_id, fmt->protocol); if (!cs_desc) return false; if (fmt->protocol == UAC_VERSION_2) { /* * Assume the clock is valid if clock source supports only one * single sample rate, the terminal is connected directly to it * (there is no clock selector) and clock type is internal. * This is to deal with some Denon DJ controllers that always * reports that clock is invalid. */ if (fmt->nr_rates == 1 && (fmt->clock & 0xff) == cs_desc->v2.bClockID && (cs_desc->v2.bmAttributes & 0x3) != UAC_CLOCK_SOURCE_TYPE_EXT) return true; } /* * MOTU MicroBook IIc * Sample rate changes takes more than 2 seconds for this device. Clock * validity request returns false during that period. */ if (chip->usb_id == USB_ID(0x07fd, 0x0004)) { count = 0; while ((!ret) && (count < 50)) { int err; msleep(100); err = snd_usb_ctl_msg(dev, usb_rcvctrlpipe(dev, 0), UAC2_CS_CUR, USB_TYPE_CLASS | USB_RECIP_INTERFACE | USB_DIR_IN, UAC2_CS_CONTROL_CLOCK_VALID << 8, snd_usb_ctrl_intf(chip) | (source_id << 8), &data, sizeof(data)); if (err < 0) { dev_warn(&dev->dev, "%s(): cannot get clock validity for id %d\n", __func__, source_id); return false; } ret = !!data; count++; } } return ret; } static bool uac_clock_source_is_valid(struct snd_usb_audio *chip, const struct audioformat *fmt, int source_id) { int err; unsigned char data; struct usb_device *dev = chip->dev; u32 bmControls; union uac23_clock_source_desc *cs_desc; cs_desc = snd_usb_find_clock_source(chip, source_id, fmt->protocol); if (!cs_desc) return false; if (fmt->protocol == UAC_VERSION_3) bmControls = le32_to_cpu(cs_desc->v3.bmControls); else bmControls = cs_desc->v2.bmControls; /* If a clock source can't tell us whether it's valid, we assume it is */ if (!uac_v2v3_control_is_readable(bmControls, UAC2_CS_CONTROL_CLOCK_VALID)) return true; err = snd_usb_ctl_msg(dev, usb_rcvctrlpipe(dev, 0), UAC2_CS_CUR, USB_TYPE_CLASS | USB_RECIP_INTERFACE | USB_DIR_IN, UAC2_CS_CONTROL_CLOCK_VALID << 8, snd_usb_ctrl_intf(chip) | (source_id << 8), &data, sizeof(data)); if (err < 0) { dev_warn(&dev->dev, "%s(): cannot get clock validity for id %d\n", __func__, source_id); return false; } if (data) return true; else return uac_clock_source_is_valid_quirk(chip, fmt, source_id); } static int __uac_clock_find_source(struct snd_usb_audio *chip, const struct audioformat *fmt, int entity_id, unsigned long *visited, bool validate) { union uac23_clock_source_desc *source; union uac23_clock_selector_desc *selector; union uac23_clock_multiplier_desc *multiplier; int ret, i, cur, err, pins, clock_id; const u8 *sources; int proto = fmt->protocol; bool readable, writeable; u32 bmControls; entity_id &= 0xff; if (test_and_set_bit(entity_id, visited)) { usb_audio_warn(chip, "%s(): recursive clock topology detected, id %d.\n", __func__, entity_id); return -EINVAL; } /* first, see if the ID we're looking at is a clock source already */ source = snd_usb_find_clock_source(chip, entity_id, proto); if (source) { entity_id = GET_VAL(source, proto, bClockID); if (validate && !uac_clock_source_is_valid(chip, fmt, entity_id)) { usb_audio_err(chip, "clock source %d is not valid, cannot use\n", entity_id); return -ENXIO; } return entity_id; } selector = snd_usb_find_clock_selector(chip, entity_id, proto); if (selector) { pins = GET_VAL(selector, proto, bNrInPins); clock_id = GET_VAL(selector, proto, bClockID); sources = GET_VAL(selector, proto, baCSourceID); cur = 0; if (proto == UAC_VERSION_3) bmControls = le32_to_cpu(*(__le32 *)(&selector->v3.baCSourceID[0] + pins)); else bmControls = *(__u8 *)(&selector->v2.baCSourceID[0] + pins); readable = uac_v2v3_control_is_readable(bmControls, UAC2_CX_CLOCK_SELECTOR); writeable = uac_v2v3_control_is_writeable(bmControls, UAC2_CX_CLOCK_SELECTOR); if (pins == 1) { ret = 1; goto find_source; } /* for now just warn about buggy device */ if (!readable) usb_audio_warn(chip, "%s(): clock selector control is not readable, id %d\n", __func__, clock_id); /* the entity ID we are looking at is a selector. * find out what it currently selects */ ret = uac_clock_selector_get_val(chip, clock_id); if (ret < 0) { if (!chip->autoclock) return ret; goto find_others; } /* Selector values are one-based */ if (ret > pins || ret < 1) { usb_audio_err(chip, "%s(): selector reported illegal value, id %d, ret %d\n", __func__, clock_id, ret); if (!chip->autoclock) return -EINVAL; goto find_others; } find_source: cur = ret; ret = __uac_clock_find_source(chip, fmt, sources[ret - 1], visited, validate); if (ret > 0) { /* Skip setting clock selector again for some devices */ if (chip->quirk_flags & QUIRK_FLAG_SKIP_CLOCK_SELECTOR || !writeable) return ret; err = uac_clock_selector_set_val(chip, entity_id, cur); if (err < 0) { if (pins == 1) { usb_audio_dbg(chip, "%s(): selector returned an error, " "assuming a firmware bug, id %d, ret %d\n", __func__, clock_id, err); return ret; } return err; } } if (!validate || ret > 0 || !chip->autoclock) return ret; find_others: if (!writeable) return -ENXIO; /* The current clock source is invalid, try others. */ for (i = 1; i <= pins; i++) { if (i == cur) continue; ret = __uac_clock_find_source(chip, fmt, sources[i - 1], visited, true); if (ret < 0) continue; err = uac_clock_selector_set_val(chip, entity_id, i); if (err < 0) continue; usb_audio_info(chip, "found and selected valid clock source %d\n", ret); return ret; } return -ENXIO; } /* FIXME: multipliers only act as pass-thru element for now */ multiplier = snd_usb_find_clock_multiplier(chip, entity_id, proto); if (multiplier) return __uac_clock_find_source(chip, fmt, GET_VAL(multiplier, proto, bCSourceID), visited, validate); return -EINVAL; } /* * For all kinds of sample rate settings and other device queries, * the clock source (end-leaf) must be used. However, clock selectors, * clock multipliers and sample rate converters may be specified as * clock source input to terminal. This functions walks the clock path * to its end and tries to find the source. * * The 'visited' bitfield is used internally to detect recursive loops. * * Returns the clock source UnitID (>=0) on success, or an error. */ int snd_usb_clock_find_source(struct snd_usb_audio *chip, const struct audioformat *fmt, bool validate) { DECLARE_BITMAP(visited, 256); memset(visited, 0, sizeof(visited)); switch (fmt->protocol) { case UAC_VERSION_2: case UAC_VERSION_3: return __uac_clock_find_source(chip, fmt, fmt->clock, visited, validate); default: return -EINVAL; } } static int set_sample_rate_v1(struct snd_usb_audio *chip, const struct audioformat *fmt, int rate) { struct usb_device *dev = chip->dev; unsigned char data[3]; int err, crate; /* if endpoint doesn't have sampling rate control, bail out */ if (!(fmt->attributes & UAC_EP_CS_ATTR_SAMPLE_RATE)) return 0; data[0] = rate; data[1] = rate >> 8; data[2] = rate >> 16; err = snd_usb_ctl_msg(dev, usb_sndctrlpipe(dev, 0), UAC_SET_CUR, USB_TYPE_CLASS | USB_RECIP_ENDPOINT | USB_DIR_OUT, UAC_EP_CS_ATTR_SAMPLE_RATE << 8, fmt->endpoint, data, sizeof(data)); if (err < 0) { dev_err(&dev->dev, "%d:%d: cannot set freq %d to ep %#x\n", fmt->iface, fmt->altsetting, rate, fmt->endpoint); return err; } /* Don't check the sample rate for devices which we know don't * support reading */ if (chip->quirk_flags & QUIRK_FLAG_GET_SAMPLE_RATE) return 0; /* the firmware is likely buggy, don't repeat to fail too many times */ if (chip->sample_rate_read_error > 2) return 0; err = snd_usb_ctl_msg(dev, usb_rcvctrlpipe(dev, 0), UAC_GET_CUR, USB_TYPE_CLASS | USB_RECIP_ENDPOINT | USB_DIR_IN, UAC_EP_CS_ATTR_SAMPLE_RATE << 8, fmt->endpoint, data, sizeof(data)); if (err < 0) { dev_err(&dev->dev, "%d:%d: cannot get freq at ep %#x\n", fmt->iface, fmt->altsetting, fmt->endpoint); chip->sample_rate_read_error++; return 0; /* some devices don't support reading */ } crate = data[0] | (data[1] << 8) | (data[2] << 16); if (!crate) { dev_info(&dev->dev, "failed to read current rate; disabling the check\n"); chip->sample_rate_read_error = 3; /* three strikes, see above */ return 0; } if (crate != rate) { dev_warn(&dev->dev, "current rate %d is different from the runtime rate %d\n", crate, rate); // runtime->rate = crate; } return 0; } static int get_sample_rate_v2v3(struct snd_usb_audio *chip, int iface, int altsetting, int clock) { struct usb_device *dev = chip->dev; __le32 data; int err; err = snd_usb_ctl_msg(dev, usb_rcvctrlpipe(dev, 0), UAC2_CS_CUR, USB_TYPE_CLASS | USB_RECIP_INTERFACE | USB_DIR_IN, UAC2_CS_CONTROL_SAM_FREQ << 8, snd_usb_ctrl_intf(chip) | (clock << 8), &data, sizeof(data)); if (err < 0) { dev_warn(&dev->dev, "%d:%d: cannot get freq (v2/v3): err %d\n", iface, altsetting, err); return 0; } return le32_to_cpu(data); } /* * Try to set the given sample rate: * * Return 0 if the clock source is read-only, the actual rate on success, * or a negative error code. * * This function gets called from format.c to validate each sample rate, too. * Hence no message is shown upon error */ int snd_usb_set_sample_rate_v2v3(struct snd_usb_audio *chip, const struct audioformat *fmt, int clock, int rate) { bool writeable; u32 bmControls; __le32 data; int err; union uac23_clock_source_desc *cs_desc; cs_desc = snd_usb_find_clock_source(chip, clock, fmt->protocol); if (!cs_desc) return 0; if (fmt->protocol == UAC_VERSION_3) bmControls = le32_to_cpu(cs_desc->v3.bmControls); else bmControls = cs_desc->v2.bmControls; writeable = uac_v2v3_control_is_writeable(bmControls, UAC2_CS_CONTROL_SAM_FREQ); if (!writeable) return 0; data = cpu_to_le32(rate); err = snd_usb_ctl_msg(chip->dev, usb_sndctrlpipe(chip->dev, 0), UAC2_CS_CUR, USB_TYPE_CLASS | USB_RECIP_INTERFACE | USB_DIR_OUT, UAC2_CS_CONTROL_SAM_FREQ << 8, snd_usb_ctrl_intf(chip) | (clock << 8), &data, sizeof(data)); if (err < 0) return err; return get_sample_rate_v2v3(chip, fmt->iface, fmt->altsetting, clock); } static int set_sample_rate_v2v3(struct snd_usb_audio *chip, const struct audioformat *fmt, int rate) { int cur_rate, prev_rate; int clock; /* First, try to find a valid clock. This may trigger * automatic clock selection if the current clock is not * valid. */ clock = snd_usb_clock_find_source(chip, fmt, true); if (clock < 0) { /* We did not find a valid clock, but that might be * because the current sample rate does not match an * external clock source. Try again without validation * and we will do another validation after setting the * rate. */ clock = snd_usb_clock_find_source(chip, fmt, false); /* Hardcoded sample rates */ if (chip->quirk_flags & QUIRK_FLAG_IGNORE_CLOCK_SOURCE) return 0; if (clock < 0) return clock; } prev_rate = get_sample_rate_v2v3(chip, fmt->iface, fmt->altsetting, clock); if (prev_rate == rate) goto validation; cur_rate = snd_usb_set_sample_rate_v2v3(chip, fmt, clock, rate); if (cur_rate < 0) { usb_audio_err(chip, "%d:%d: cannot set freq %d (v2/v3): err %d\n", fmt->iface, fmt->altsetting, rate, cur_rate); return cur_rate; } if (!cur_rate) cur_rate = prev_rate; if (cur_rate != rate) { usb_audio_dbg(chip, "%d:%d: freq mismatch: req %d, clock runs @%d\n", fmt->iface, fmt->altsetting, rate, cur_rate); /* continue processing */ } /* FIXME - TEAC devices require the immediate interface setup */ if (USB_ID_VENDOR(chip->usb_id) == 0x0644) { bool cur_base_48k = (rate % 48000 == 0); bool prev_base_48k = (prev_rate % 48000 == 0); if (cur_base_48k != prev_base_48k) { usb_set_interface(chip->dev, fmt->iface, fmt->altsetting); if (chip->quirk_flags & QUIRK_FLAG_IFACE_DELAY) msleep(50); } } validation: /* validate clock after rate change */ if (!uac_clock_source_is_valid(chip, fmt, clock)) return -ENXIO; return 0; } int snd_usb_init_sample_rate(struct snd_usb_audio *chip, const struct audioformat *fmt, int rate) { usb_audio_dbg(chip, "%d:%d Set sample rate %d, clock %d\n", fmt->iface, fmt->altsetting, rate, fmt->clock); switch (fmt->protocol) { case UAC_VERSION_1: default: return set_sample_rate_v1(chip, fmt, rate); case UAC_VERSION_3: if (chip->badd_profile >= UAC3_FUNCTION_SUBCLASS_GENERIC_IO) { if (rate != UAC3_BADD_SAMPLING_RATE) return -ENXIO; else return 0; } fallthrough; case UAC_VERSION_2: return set_sample_rate_v2v3(chip, fmt, rate); } } |
1 2 2 2 1 1 1 1 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 | // SPDX-License-Identifier: GPL-2.0-or-later #include <linux/mrp_bridge.h> #include "br_private_mrp.h" static const u8 mrp_test_dmac[ETH_ALEN] = { 0x1, 0x15, 0x4e, 0x0, 0x0, 0x1 }; static const u8 mrp_in_test_dmac[ETH_ALEN] = { 0x1, 0x15, 0x4e, 0x0, 0x0, 0x3 }; static int br_mrp_process(struct net_bridge_port *p, struct sk_buff *skb); static struct br_frame_type mrp_frame_type __read_mostly = { .type = cpu_to_be16(ETH_P_MRP), .frame_handler = br_mrp_process, }; static bool br_mrp_is_ring_port(struct net_bridge_port *p_port, struct net_bridge_port *s_port, struct net_bridge_port *port) { if (port == p_port || port == s_port) return true; return false; } static bool br_mrp_is_in_port(struct net_bridge_port *i_port, struct net_bridge_port *port) { if (port == i_port) return true; return false; } static struct net_bridge_port *br_mrp_get_port(struct net_bridge *br, u32 ifindex) { struct net_bridge_port *res = NULL; struct net_bridge_port *port; list_for_each_entry(port, &br->port_list, list) { if (port->dev->ifindex == ifindex) { res = port; break; } } return res; } static struct br_mrp *br_mrp_find_id(struct net_bridge *br, u32 ring_id) { struct br_mrp *res = NULL; struct br_mrp *mrp; hlist_for_each_entry_rcu(mrp, &br->mrp_list, list, lockdep_rtnl_is_held()) { if (mrp->ring_id == ring_id) { res = mrp; break; } } return res; } static struct br_mrp *br_mrp_find_in_id(struct net_bridge *br, u32 in_id) { struct br_mrp *res = NULL; struct br_mrp *mrp; hlist_for_each_entry_rcu(mrp, &br->mrp_list, list, lockdep_rtnl_is_held()) { if (mrp->in_id == in_id) { res = mrp; break; } } return res; } static bool br_mrp_unique_ifindex(struct net_bridge *br, u32 ifindex) { struct br_mrp *mrp; hlist_for_each_entry_rcu(mrp, &br->mrp_list, list, lockdep_rtnl_is_held()) { struct net_bridge_port *p; p = rtnl_dereference(mrp->p_port); if (p && p->dev->ifindex == ifindex) return false; p = rtnl_dereference(mrp->s_port); if (p && p->dev->ifindex == ifindex) return false; p = rtnl_dereference(mrp->i_port); if (p && p->dev->ifindex == ifindex) return false; } return true; } static struct br_mrp *br_mrp_find_port(struct net_bridge *br, struct net_bridge_port *p) { struct br_mrp *res = NULL; struct br_mrp *mrp; hlist_for_each_entry_rcu(mrp, &br->mrp_list, list, lockdep_rtnl_is_held()) { if (rcu_access_pointer(mrp->p_port) == p || rcu_access_pointer(mrp->s_port) == p || rcu_access_pointer(mrp->i_port) == p) { res = mrp; break; } } return res; } static int br_mrp_next_seq(struct br_mrp *mrp) { mrp->seq_id++; return mrp->seq_id; } static struct sk_buff *br_mrp_skb_alloc(struct net_bridge_port *p, const u8 *src, const u8 *dst) { struct ethhdr *eth_hdr; struct sk_buff *skb; __be16 *version; skb = dev_alloc_skb(MRP_MAX_FRAME_LENGTH); if (!skb) return NULL; skb->dev = p->dev; skb->protocol = htons(ETH_P_MRP); skb->priority = MRP_FRAME_PRIO; skb_reserve(skb, sizeof(*eth_hdr)); eth_hdr = skb_push(skb, sizeof(*eth_hdr)); ether_addr_copy(eth_hdr->h_dest, dst); ether_addr_copy(eth_hdr->h_source, src); eth_hdr->h_proto = htons(ETH_P_MRP); version = skb_put(skb, sizeof(*version)); *version = cpu_to_be16(MRP_VERSION); return skb; } static void br_mrp_skb_tlv(struct sk_buff *skb, enum br_mrp_tlv_header_type type, u8 length) { struct br_mrp_tlv_hdr *hdr; hdr = skb_put(skb, sizeof(*hdr)); hdr->type = type; hdr->length = length; } static void br_mrp_skb_common(struct sk_buff *skb, struct br_mrp *mrp) { struct br_mrp_common_hdr *hdr; br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_COMMON, sizeof(*hdr)); hdr = skb_put(skb, sizeof(*hdr)); hdr->seq_id = cpu_to_be16(br_mrp_next_seq(mrp)); memset(hdr->domain, 0xff, MRP_DOMAIN_UUID_LENGTH); } static struct sk_buff *br_mrp_alloc_test_skb(struct br_mrp *mrp, struct net_bridge_port *p, enum br_mrp_port_role_type port_role) { struct br_mrp_ring_test_hdr *hdr = NULL; struct sk_buff *skb = NULL; if (!p) return NULL; skb = br_mrp_skb_alloc(p, p->dev->dev_addr, mrp_test_dmac); if (!skb) return NULL; br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_RING_TEST, sizeof(*hdr)); hdr = skb_put(skb, sizeof(*hdr)); hdr->prio = cpu_to_be16(mrp->prio); ether_addr_copy(hdr->sa, p->br->dev->dev_addr); hdr->port_role = cpu_to_be16(port_role); hdr->state = cpu_to_be16(mrp->ring_state); hdr->transitions = cpu_to_be16(mrp->ring_transitions); hdr->timestamp = cpu_to_be32(jiffies_to_msecs(jiffies)); br_mrp_skb_common(skb, mrp); /* In case the node behaves as MRA then the Test frame needs to have * an Option TLV which includes eventually a sub-option TLV that has * the type AUTO_MGR */ if (mrp->ring_role == BR_MRP_RING_ROLE_MRA) { struct br_mrp_sub_option1_hdr *sub_opt = NULL; struct br_mrp_tlv_hdr *sub_tlv = NULL; struct br_mrp_oui_hdr *oui = NULL; u8 length; length = sizeof(*sub_opt) + sizeof(*sub_tlv) + sizeof(oui) + MRP_OPT_PADDING; br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_OPTION, length); oui = skb_put(skb, sizeof(*oui)); memset(oui, 0x0, sizeof(*oui)); sub_opt = skb_put(skb, sizeof(*sub_opt)); memset(sub_opt, 0x0, sizeof(*sub_opt)); sub_tlv = skb_put(skb, sizeof(*sub_tlv)); sub_tlv->type = BR_MRP_SUB_TLV_HEADER_TEST_AUTO_MGR; /* 32 bit alligment shall be ensured therefore add 2 bytes */ skb_put(skb, MRP_OPT_PADDING); } br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_END, 0x0); return skb; } static struct sk_buff *br_mrp_alloc_in_test_skb(struct br_mrp *mrp, struct net_bridge_port *p, enum br_mrp_port_role_type port_role) { struct br_mrp_in_test_hdr *hdr = NULL; struct sk_buff *skb = NULL; if (!p) return NULL; skb = br_mrp_skb_alloc(p, p->dev->dev_addr, mrp_in_test_dmac); if (!skb) return NULL; br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_IN_TEST, sizeof(*hdr)); hdr = skb_put(skb, sizeof(*hdr)); hdr->id = cpu_to_be16(mrp->in_id); ether_addr_copy(hdr->sa, p->br->dev->dev_addr); hdr->port_role = cpu_to_be16(port_role); hdr->state = cpu_to_be16(mrp->in_state); hdr->transitions = cpu_to_be16(mrp->in_transitions); hdr->timestamp = cpu_to_be32(jiffies_to_msecs(jiffies)); br_mrp_skb_common(skb, mrp); br_mrp_skb_tlv(skb, BR_MRP_TLV_HEADER_END, 0x0); return skb; } /* This function is continuously called in the following cases: * - when node role is MRM, in this case test_monitor is always set to false * because it needs to notify the userspace that the ring is open and needs to * send MRP_Test frames * - when node role is MRA, there are 2 subcases: * - when MRA behaves as MRM, in this case is similar with MRM role * - when MRA behaves as MRC, in this case test_monitor is set to true, * because it needs to detect when it stops seeing MRP_Test frames * from MRM node but it doesn't need to send MRP_Test frames. */ static void br_mrp_test_work_expired(struct work_struct *work) { struct delayed_work *del_work = to_delayed_work(work); struct br_mrp *mrp = container_of(del_work, struct br_mrp, test_work); struct net_bridge_port *p; bool notify_open = false; struct sk_buff *skb; if (time_before_eq(mrp->test_end, jiffies)) return; if (mrp->test_count_miss < mrp->test_max_miss) { mrp->test_count_miss++; } else { /* Notify that the ring is open only if the ring state is * closed, otherwise it would continue to notify at every * interval. * Also notify that the ring is open when the node has the * role MRA and behaves as MRC. The reason is that the * userspace needs to know when the MRM stopped sending * MRP_Test frames so that the current node to try to take * the role of a MRM. */ if (mrp->ring_state == BR_MRP_RING_STATE_CLOSED || mrp->test_monitor) notify_open = true; } rcu_read_lock(); p = rcu_dereference(mrp->p_port); if (p) { if (!mrp->test_monitor) { skb = br_mrp_alloc_test_skb(mrp, p, BR_MRP_PORT_ROLE_PRIMARY); if (!skb) goto out; skb_reset_network_header(skb); dev_queue_xmit(skb); } if (notify_open && !mrp->ring_role_offloaded) br_mrp_ring_port_open(p->dev, true); } p = rcu_dereference(mrp->s_port); if (p) { if (!mrp->test_monitor) { skb = br_mrp_alloc_test_skb(mrp, p, BR_MRP_PORT_ROLE_SECONDARY); if (!skb) goto out; skb_reset_network_header(skb); dev_queue_xmit(skb); } if (notify_open && !mrp->ring_role_offloaded) br_mrp_ring_port_open(p->dev, true); } out: rcu_read_unlock(); queue_delayed_work(system_wq, &mrp->test_work, usecs_to_jiffies(mrp->test_interval)); } /* This function is continuously called when the node has the interconnect role * MIM. It would generate interconnect test frames and will send them on all 3 * ports. But will also check if it stop receiving interconnect test frames. */ static void br_mrp_in_test_work_expired(struct work_struct *work) { struct delayed_work *del_work = to_delayed_work(work); struct br_mrp *mrp = container_of(del_work, struct br_mrp, in_test_work); struct net_bridge_port *p; bool notify_open = false; struct sk_buff *skb; if (time_before_eq(mrp->in_test_end, jiffies)) return; if (mrp->in_test_count_miss < mrp->in_test_max_miss) { mrp->in_test_count_miss++; } else { /* Notify that the interconnect ring is open only if the * interconnect ring state is closed, otherwise it would * continue to notify at every interval. */ if (mrp->in_state == BR_MRP_IN_STATE_CLOSED) notify_open = true; } rcu_read_lock(); p = rcu_dereference(mrp->p_port); if (p) { skb = br_mrp_alloc_in_test_skb(mrp, p, BR_MRP_PORT_ROLE_PRIMARY); if (!skb) goto out; skb_reset_network_header(skb); dev_queue_xmit(skb); if (notify_open && !mrp->in_role_offloaded) br_mrp_in_port_open(p->dev, true); } p = rcu_dereference(mrp->s_port); if (p) { skb = br_mrp_alloc_in_test_skb(mrp, p, BR_MRP_PORT_ROLE_SECONDARY); if (!skb) goto out; skb_reset_network_header(skb); dev_queue_xmit(skb); if (notify_open && !mrp->in_role_offloaded) br_mrp_in_port_open(p->dev, true); } p = rcu_dereference(mrp->i_port); if (p) { skb = br_mrp_alloc_in_test_skb(mrp, p, BR_MRP_PORT_ROLE_INTER); if (!skb) goto out; skb_reset_network_header(skb); dev_queue_xmit(skb); if (notify_open && !mrp->in_role_offloaded) br_mrp_in_port_open(p->dev, true); } out: rcu_read_unlock(); queue_delayed_work(system_wq, &mrp->in_test_work, usecs_to_jiffies(mrp->in_test_interval)); } /* Deletes the MRP instance. * note: called under rtnl_lock */ static void br_mrp_del_impl(struct net_bridge *br, struct br_mrp *mrp) { struct net_bridge_port *p; u8 state; /* Stop sending MRP_Test frames */ cancel_delayed_work_sync(&mrp->test_work); br_mrp_switchdev_send_ring_test(br, mrp, 0, 0, 0, 0); /* Stop sending MRP_InTest frames if has an interconnect role */ cancel_delayed_work_sync(&mrp->in_test_work); br_mrp_switchdev_send_in_test(br, mrp, 0, 0, 0); /* Disable the roles */ br_mrp_switchdev_set_ring_role(br, mrp, BR_MRP_RING_ROLE_DISABLED); p = rtnl_dereference(mrp->i_port); if (p) br_mrp_switchdev_set_in_role(br, mrp, mrp->in_id, mrp->ring_id, BR_MRP_IN_ROLE_DISABLED); br_mrp_switchdev_del(br, mrp); /* Reset the ports */ p = rtnl_dereference(mrp->p_port); if (p) { spin_lock_bh(&br->lock); state = netif_running(br->dev) ? BR_STATE_FORWARDING : BR_STATE_DISABLED; p->state = state; p->flags &= ~BR_MRP_AWARE; spin_unlock_bh(&br->lock); br_mrp_port_switchdev_set_state(p, state); rcu_assign_pointer(mrp->p_port, NULL); } p = rtnl_dereference(mrp->s_port); if (p) { spin_lock_bh(&br->lock); state = netif_running(br->dev) ? BR_STATE_FORWARDING : BR_STATE_DISABLED; p->state = state; p->flags &= ~BR_MRP_AWARE; spin_unlock_bh(&br->lock); br_mrp_port_switchdev_set_state(p, state); rcu_assign_pointer(mrp->s_port, NULL); } p = rtnl_dereference(mrp->i_port); if (p) { spin_lock_bh(&br->lock); state = netif_running(br->dev) ? BR_STATE_FORWARDING : BR_STATE_DISABLED; p->state = state; p->flags &= ~BR_MRP_AWARE; spin_unlock_bh(&br->lock); br_mrp_port_switchdev_set_state(p, state); rcu_assign_pointer(mrp->i_port, NULL); } hlist_del_rcu(&mrp->list); kfree_rcu(mrp, rcu); if (hlist_empty(&br->mrp_list)) br_del_frame(br, &mrp_frame_type); } /* Adds a new MRP instance. * note: called under rtnl_lock */ int br_mrp_add(struct net_bridge *br, struct br_mrp_instance *instance) { struct net_bridge_port *p; struct br_mrp *mrp; int err; /* If the ring exists, it is not possible to create another one with the * same ring_id */ mrp = br_mrp_find_id(br, instance->ring_id); if (mrp) return -EINVAL; if (!br_mrp_get_port(br, instance->p_ifindex) || !br_mrp_get_port(br, instance->s_ifindex)) return -EINVAL; /* It is not possible to have the same port part of multiple rings */ if (!br_mrp_unique_ifindex(br, instance->p_ifindex) || !br_mrp_unique_ifindex(br, instance->s_ifindex)) return -EINVAL; mrp = kzalloc(sizeof(*mrp), GFP_KERNEL); if (!mrp) return -ENOMEM; mrp->ring_id = instance->ring_id; mrp->prio = instance->prio; p = br_mrp_get_port(br, instance->p_ifindex); spin_lock_bh(&br->lock); p->state = BR_STATE_FORWARDING; p->flags |= BR_MRP_AWARE; spin_unlock_bh(&br->lock); rcu_assign_pointer(mrp->p_port, p); p = br_mrp_get_port(br, instance->s_ifindex); spin_lock_bh(&br->lock); p->state = BR_STATE_FORWARDING; p->flags |= BR_MRP_AWARE; spin_unlock_bh(&br->lock); rcu_assign_pointer(mrp->s_port, p); if (hlist_empty(&br->mrp_list)) br_add_frame(br, &mrp_frame_type); INIT_DELAYED_WORK(&mrp->test_work, br_mrp_test_work_expired); INIT_DELAYED_WORK(&mrp->in_test_work, br_mrp_in_test_work_expired); hlist_add_tail_rcu(&mrp->list, &br->mrp_list); err = br_mrp_switchdev_add(br, mrp); if (err) goto delete_mrp; return 0; delete_mrp: br_mrp_del_impl(br, mrp); return err; } /* Deletes the MRP instance from which the port is part of * note: called under rtnl_lock */ void br_mrp_port_del(struct net_bridge *br, struct net_bridge_port *p) { struct br_mrp *mrp = br_mrp_find_port(br, p); /* If the port is not part of a MRP instance just bail out */ if (!mrp) return; br_mrp_del_impl(br, mrp); } /* Deletes existing MRP instance based on ring_id * note: called under rtnl_lock */ int br_mrp_del(struct net_bridge *br, struct br_mrp_instance *instance) { struct br_mrp *mrp = br_mrp_find_id(br, instance->ring_id); if (!mrp) return -EINVAL; br_mrp_del_impl(br, mrp); return 0; } /* Set port state, port state can be forwarding, blocked or disabled * note: already called with rtnl_lock */ int br_mrp_set_port_state(struct net_bridge_port *p, enum br_mrp_port_state_type state) { u32 port_state; if (!p || !(p->flags & BR_MRP_AWARE)) return -EINVAL; spin_lock_bh(&p->br->lock); if (state == BR_MRP_PORT_STATE_FORWARDING) port_state = BR_STATE_FORWARDING; else port_state = BR_STATE_BLOCKING; p->state = port_state; spin_unlock_bh(&p->br->lock); br_mrp_port_switchdev_set_state(p, port_state); return 0; } /* Set port role, port role can be primary or secondary * note: already called with rtnl_lock */ int br_mrp_set_port_role(struct net_bridge_port *p, enum br_mrp_port_role_type role) { struct br_mrp *mrp; if (!p || !(p->flags & BR_MRP_AWARE)) return -EINVAL; mrp = br_mrp_find_port(p->br, p); if (!mrp) return -EINVAL; switch (role) { case BR_MRP_PORT_ROLE_PRIMARY: rcu_assign_pointer(mrp->p_port, p); break; case BR_MRP_PORT_ROLE_SECONDARY: rcu_assign_pointer(mrp->s_port, p); break; default: return -EINVAL; } br_mrp_port_switchdev_set_role(p, role); return 0; } /* Set ring state, ring state can be only Open or Closed * note: already called with rtnl_lock */ int br_mrp_set_ring_state(struct net_bridge *br, struct br_mrp_ring_state *state) { struct br_mrp *mrp = br_mrp_find_id(br, state->ring_id); if (!mrp) return -EINVAL; if (mrp->ring_state != state->ring_state) mrp->ring_transitions++; mrp->ring_state = state->ring_state; br_mrp_switchdev_set_ring_state(br, mrp, state->ring_state); return 0; } /* Set ring role, ring role can be only MRM(Media Redundancy Manager) or * MRC(Media Redundancy Client). * note: already called with rtnl_lock */ int br_mrp_set_ring_role(struct net_bridge *br, struct br_mrp_ring_role *role) { struct br_mrp *mrp = br_mrp_find_id(br, role->ring_id); enum br_mrp_hw_support support; if (!mrp) return -EINVAL; mrp->ring_role = role->ring_role; /* If there is an error just bailed out */ support = br_mrp_switchdev_set_ring_role(br, mrp, role->ring_role); if (support == BR_MRP_NONE) return -EOPNOTSUPP; /* Now detect if the HW actually applied the role or not. If the HW * applied the role it means that the SW will not to do those operations * anymore. For example if the role ir MRM then the HW will notify the * SW when ring is open, but if the is not pushed to the HW the SW will * need to detect when the ring is open */ mrp->ring_role_offloaded = support == BR_MRP_SW ? 0 : 1; return 0; } /* Start to generate or monitor MRP test frames, the frames are generated by * HW and if it fails, they are generated by the SW. * note: already called with rtnl_lock */ int br_mrp_start_test(struct net_bridge *br, struct br_mrp_start_test *test) { struct br_mrp *mrp = br_mrp_find_id(br, test->ring_id); enum br_mrp_hw_support support; if (!mrp) return -EINVAL; /* Try to push it to the HW and if it fails then continue with SW * implementation and if that also fails then return error. */ support = br_mrp_switchdev_send_ring_test(br, mrp, test->interval, test->max_miss, test->period, test->monitor); if (support == BR_MRP_NONE) return -EOPNOTSUPP; if (support == BR_MRP_HW) return 0; mrp->test_interval = test->interval; mrp->test_end = jiffies + usecs_to_jiffies(test->period); mrp->test_max_miss = test->max_miss; mrp->test_monitor = test->monitor; mrp->test_count_miss = 0; queue_delayed_work(system_wq, &mrp->test_work, usecs_to_jiffies(test->interval)); return 0; } /* Set in state, int state can be only Open or Closed * note: already called with rtnl_lock */ int br_mrp_set_in_state(struct net_bridge *br, struct br_mrp_in_state *state) { struct br_mrp *mrp = br_mrp_find_in_id(br, state->in_id); if (!mrp) return -EINVAL; if (mrp->in_state != state->in_state) mrp->in_transitions++; mrp->in_state = state->in_state; br_mrp_switchdev_set_in_state(br, mrp, state->in_state); return 0; } /* Set in role, in role can be only MIM(Media Interconnection Manager) or * MIC(Media Interconnection Client). * note: already called with rtnl_lock */ int br_mrp_set_in_role(struct net_bridge *br, struct br_mrp_in_role *role) { struct br_mrp *mrp = br_mrp_find_id(br, role->ring_id); enum br_mrp_hw_support support; struct net_bridge_port *p; if (!mrp) return -EINVAL; if (!br_mrp_get_port(br, role->i_ifindex)) return -EINVAL; if (role->in_role == BR_MRP_IN_ROLE_DISABLED) { u8 state; /* It is not allowed to disable a port that doesn't exist */ p = rtnl_dereference(mrp->i_port); if (!p) return -EINVAL; /* Stop the generating MRP_InTest frames */ cancel_delayed_work_sync(&mrp->in_test_work); br_mrp_switchdev_send_in_test(br, mrp, 0, 0, 0); /* Remove the port */ spin_lock_bh(&br->lock); state = netif_running(br->dev) ? BR_STATE_FORWARDING : BR_STATE_DISABLED; p->state = state; p->flags &= ~BR_MRP_AWARE; spin_unlock_bh(&br->lock); br_mrp_port_switchdev_set_state(p, state); rcu_assign_pointer(mrp->i_port, NULL); mrp->in_role = role->in_role; mrp->in_id = 0; return 0; } /* It is not possible to have the same port part of multiple rings */ if (!br_mrp_unique_ifindex(br, role->i_ifindex)) return -EINVAL; /* It is not allowed to set a different interconnect port if the mrp * instance has already one. First it needs to be disabled and after * that set the new port */ if (rcu_access_pointer(mrp->i_port)) return -EINVAL; p = br_mrp_get_port(br, role->i_ifindex); spin_lock_bh(&br->lock); p->state = BR_STATE_FORWARDING; p->flags |= BR_MRP_AWARE; spin_unlock_bh(&br->lock); rcu_assign_pointer(mrp->i_port, p); mrp->in_role = role->in_role; mrp->in_id = role->in_id; /* If there is an error just bailed out */ support = br_mrp_switchdev_set_in_role(br, mrp, role->in_id, role->ring_id, role->in_role); if (support == BR_MRP_NONE) return -EOPNOTSUPP; /* Now detect if the HW actually applied the role or not. If the HW * applied the role it means that the SW will not to do those operations * anymore. For example if the role is MIM then the HW will notify the * SW when interconnect ring is open, but if the is not pushed to the HW * the SW will need to detect when the interconnect ring is open. */ mrp->in_role_offloaded = support == BR_MRP_SW ? 0 : 1; return 0; } /* Start to generate MRP_InTest frames, the frames are generated by * HW and if it fails, they are generated by the SW. * note: already called with rtnl_lock */ int br_mrp_start_in_test(struct net_bridge *br, struct br_mrp_start_in_test *in_test) { struct br_mrp *mrp = br_mrp_find_in_id(br, in_test->in_id); enum br_mrp_hw_support support; if (!mrp) return -EINVAL; if (mrp->in_role != BR_MRP_IN_ROLE_MIM) return -EINVAL; /* Try to push it to the HW and if it fails then continue with SW * implementation and if that also fails then return error. */ support = br_mrp_switchdev_send_in_test(br, mrp, in_test->interval, in_test->max_miss, in_test->period); if (support == BR_MRP_NONE) return -EOPNOTSUPP; if (support == BR_MRP_HW) return 0; mrp->in_test_interval = in_test->interval; mrp->in_test_end = jiffies + usecs_to_jiffies(in_test->period); mrp->in_test_max_miss = in_test->max_miss; mrp->in_test_count_miss = 0; queue_delayed_work(system_wq, &mrp->in_test_work, usecs_to_jiffies(in_test->interval)); return 0; } /* Determine if the frame type is a ring frame */ static bool br_mrp_ring_frame(struct sk_buff *skb) { const struct br_mrp_tlv_hdr *hdr; struct br_mrp_tlv_hdr _hdr; hdr = skb_header_pointer(skb, sizeof(uint16_t), sizeof(_hdr), &_hdr); if (!hdr) return false; if (hdr->type == BR_MRP_TLV_HEADER_RING_TEST || hdr->type == BR_MRP_TLV_HEADER_RING_TOPO || hdr->type == BR_MRP_TLV_HEADER_RING_LINK_DOWN || hdr->type == BR_MRP_TLV_HEADER_RING_LINK_UP || hdr->type == BR_MRP_TLV_HEADER_OPTION) return true; return false; } /* Determine if the frame type is an interconnect frame */ static bool br_mrp_in_frame(struct sk_buff *skb) { const struct br_mrp_tlv_hdr *hdr; struct br_mrp_tlv_hdr _hdr; hdr = skb_header_pointer(skb, sizeof(uint16_t), sizeof(_hdr), &_hdr); if (!hdr) return false; if (hdr->type == BR_MRP_TLV_HEADER_IN_TEST || hdr->type == BR_MRP_TLV_HEADER_IN_TOPO || hdr->type == BR_MRP_TLV_HEADER_IN_LINK_DOWN || hdr->type == BR_MRP_TLV_HEADER_IN_LINK_UP || hdr->type == BR_MRP_TLV_HEADER_IN_LINK_STATUS) return true; return false; } /* Process only MRP Test frame. All the other MRP frames are processed by * userspace application * note: already called with rcu_read_lock */ static void br_mrp_mrm_process(struct br_mrp *mrp, struct net_bridge_port *port, struct sk_buff *skb) { const struct br_mrp_tlv_hdr *hdr; struct br_mrp_tlv_hdr _hdr; /* Each MRP header starts with a version field which is 16 bits. * Therefore skip the version and get directly the TLV header. */ hdr = skb_header_pointer(skb, sizeof(uint16_t), sizeof(_hdr), &_hdr); if (!hdr) return; if (hdr->type != BR_MRP_TLV_HEADER_RING_TEST) return; mrp->test_count_miss = 0; /* Notify the userspace that the ring is closed only when the ring is * not closed */ if (mrp->ring_state != BR_MRP_RING_STATE_CLOSED) br_mrp_ring_port_open(port->dev, false); } /* Determine if the test hdr has a better priority than the node */ static bool br_mrp_test_better_than_own(struct br_mrp *mrp, struct net_bridge *br, const struct br_mrp_ring_test_hdr *hdr) { u16 prio = be16_to_cpu(hdr->prio); if (prio < mrp->prio || (prio == mrp->prio && ether_addr_to_u64(hdr->sa) < ether_addr_to_u64(br->dev->dev_addr))) return true; return false; } /* Process only MRP Test frame. All the other MRP frames are processed by * userspace application * note: already called with rcu_read_lock */ static void br_mrp_mra_process(struct br_mrp *mrp, struct net_bridge *br, struct net_bridge_port *port, struct sk_buff *skb) { const struct br_mrp_ring_test_hdr *test_hdr; struct br_mrp_ring_test_hdr _test_hdr; const struct br_mrp_tlv_hdr *hdr; struct br_mrp_tlv_hdr _hdr; /* Each MRP header starts with a version field which is 16 bits. * Therefore skip the version and get directly the TLV header. */ hdr = skb_header_pointer(skb, sizeof(uint16_t), sizeof(_hdr), &_hdr); if (!hdr) return; if (hdr->type != BR_MRP_TLV_HEADER_RING_TEST) return; test_hdr = skb_header_pointer(skb, sizeof(uint16_t) + sizeof(_hdr), sizeof(_test_hdr), &_test_hdr); if (!test_hdr) return; /* Only frames that have a better priority than the node will * clear the miss counter because otherwise the node will need to behave * as MRM. */ if (br_mrp_test_better_than_own(mrp, br, test_hdr)) mrp->test_count_miss = 0; } /* Process only MRP InTest frame. All the other MRP frames are processed by * userspace application * note: already called with rcu_read_lock */ static bool br_mrp_mim_process(struct br_mrp *mrp, struct net_bridge_port *port, struct sk_buff *skb) { const struct br_mrp_in_test_hdr *in_hdr; struct br_mrp_in_test_hdr _in_hdr; const struct br_mrp_tlv_hdr *hdr; struct br_mrp_tlv_hdr _hdr; /* Each MRP header starts with a version field which is 16 bits. * Therefore skip the version and get directly the TLV header. */ hdr = skb_header_pointer(skb, sizeof(uint16_t), sizeof(_hdr), &_hdr); if (!hdr) return false; /* The check for InTest frame type was already done */ in_hdr = skb_header_pointer(skb, sizeof(uint16_t) + sizeof(_hdr), sizeof(_in_hdr), &_in_hdr); if (!in_hdr) return false; /* It needs to process only it's own InTest frames. */ if (mrp->in_id != ntohs(in_hdr->id)) return false; mrp->in_test_count_miss = 0; /* Notify the userspace that the ring is closed only when the ring is * not closed */ if (mrp->in_state != BR_MRP_IN_STATE_CLOSED) br_mrp_in_port_open(port->dev, false); return true; } /* Get the MRP frame type * note: already called with rcu_read_lock */ static u8 br_mrp_get_frame_type(struct sk_buff *skb) { const struct br_mrp_tlv_hdr *hdr; struct br_mrp_tlv_hdr _hdr; /* Each MRP header starts with a version field which is 16 bits. * Therefore skip the version and get directly the TLV header. */ hdr = skb_header_pointer(skb, sizeof(uint16_t), sizeof(_hdr), &_hdr); if (!hdr) return 0xff; return hdr->type; } static bool br_mrp_mrm_behaviour(struct br_mrp *mrp) { if (mrp->ring_role == BR_MRP_RING_ROLE_MRM || (mrp->ring_role == BR_MRP_RING_ROLE_MRA && !mrp->test_monitor)) return true; return false; } static bool br_mrp_mrc_behaviour(struct br_mrp *mrp) { if (mrp->ring_role == BR_MRP_RING_ROLE_MRC || (mrp->ring_role == BR_MRP_RING_ROLE_MRA && mrp->test_monitor)) return true; return false; } /* This will just forward the frame to the other mrp ring ports, depending on * the frame type, ring role and interconnect role * note: already called with rcu_read_lock */ static int br_mrp_rcv(struct net_bridge_port *p, struct sk_buff *skb, struct net_device *dev) { struct net_bridge_port *p_port, *s_port, *i_port = NULL; struct net_bridge_port *p_dst, *s_dst, *i_dst = NULL; struct net_bridge *br; struct br_mrp *mrp; /* If port is disabled don't accept any frames */ if (p->state == BR_STATE_DISABLED) return 0; br = p->br; mrp = br_mrp_find_port(br, p); if (unlikely(!mrp)) return 0; p_port = rcu_dereference(mrp->p_port); if (!p_port) return 0; p_dst = p_port; s_port = rcu_dereference(mrp->s_port); if (!s_port) return 0; s_dst = s_port; /* If the frame is a ring frame then it is not required to check the * interconnect role and ports to process or forward the frame */ if (br_mrp_ring_frame(skb)) { /* If the role is MRM then don't forward the frames */ if (mrp->ring_role == BR_MRP_RING_ROLE_MRM) { br_mrp_mrm_process(mrp, p, skb); goto no_forward; } /* If the role is MRA then don't forward the frames if it * behaves as MRM node */ if (mrp->ring_role == BR_MRP_RING_ROLE_MRA) { if (!mrp->test_monitor) { br_mrp_mrm_process(mrp, p, skb); goto no_forward; } br_mrp_mra_process(mrp, br, p, skb); } goto forward; } if (br_mrp_in_frame(skb)) { u8 in_type = br_mrp_get_frame_type(skb); i_port = rcu_dereference(mrp->i_port); i_dst = i_port; /* If the ring port is in block state it should not forward * In_Test frames */ if (br_mrp_is_ring_port(p_port, s_port, p) && p->state == BR_STATE_BLOCKING && in_type == BR_MRP_TLV_HEADER_IN_TEST) goto no_forward; /* Nodes that behaves as MRM needs to stop forwarding the * frames in case the ring is closed, otherwise will be a loop. * In this case the frame is no forward between the ring ports. */ if (br_mrp_mrm_behaviour(mrp) && br_mrp_is_ring_port(p_port, s_port, p) && (s_port->state != BR_STATE_FORWARDING || p_port->state != BR_STATE_FORWARDING)) { p_dst = NULL; s_dst = NULL; } /* A node that behaves as MRC and doesn't have a interconnect * role then it should forward all frames between the ring ports * because it doesn't have an interconnect port */ if (br_mrp_mrc_behaviour(mrp) && mrp->in_role == BR_MRP_IN_ROLE_DISABLED) goto forward; if (mrp->in_role == BR_MRP_IN_ROLE_MIM) { if (in_type == BR_MRP_TLV_HEADER_IN_TEST) { /* MIM should not forward it's own InTest * frames */ if (br_mrp_mim_process(mrp, p, skb)) { goto no_forward; } else { if (br_mrp_is_ring_port(p_port, s_port, p)) i_dst = NULL; if (br_mrp_is_in_port(i_port, p)) goto no_forward; } } else { /* MIM should forward IntLinkChange/Status and * IntTopoChange between ring ports but MIM * should not forward IntLinkChange/Status and * IntTopoChange if the frame was received at * the interconnect port */ if (br_mrp_is_ring_port(p_port, s_port, p)) i_dst = NULL; if (br_mrp_is_in_port(i_port, p)) goto no_forward; } } if (mrp->in_role == BR_MRP_IN_ROLE_MIC) { /* MIC should forward InTest frames on all ports * regardless of the received port */ if (in_type == BR_MRP_TLV_HEADER_IN_TEST) goto forward; /* MIC should forward IntLinkChange frames only if they * are received on ring ports to all the ports */ if (br_mrp_is_ring_port(p_port, s_port, p) && (in_type == BR_MRP_TLV_HEADER_IN_LINK_UP || in_type == BR_MRP_TLV_HEADER_IN_LINK_DOWN)) goto forward; /* MIC should forward IntLinkStatus frames only to * interconnect port if it was received on a ring port. * If it is received on interconnect port then, it * should be forward on both ring ports */ if (br_mrp_is_ring_port(p_port, s_port, p) && in_type == BR_MRP_TLV_HEADER_IN_LINK_STATUS) { p_dst = NULL; s_dst = NULL; } /* Should forward the InTopo frames only between the * ring ports */ if (in_type == BR_MRP_TLV_HEADER_IN_TOPO) { i_dst = NULL; goto forward; } /* In all the other cases don't forward the frames */ goto no_forward; } } forward: if (p_dst) br_forward(p_dst, skb, true, false); if (s_dst) br_forward(s_dst, skb, true, false); if (i_dst) br_forward(i_dst, skb, true, false); no_forward: return 1; } /* Check if the frame was received on a port that is part of MRP ring * and if the frame has MRP eth. In that case process the frame otherwise do * normal forwarding. * note: already called with rcu_read_lock */ static int br_mrp_process(struct net_bridge_port *p, struct sk_buff *skb) { /* If there is no MRP instance do normal forwarding */ if (likely(!(p->flags & BR_MRP_AWARE))) goto out; return br_mrp_rcv(p, skb, p->dev); out: return 0; } bool br_mrp_enabled(struct net_bridge *br) { return !hlist_empty(&br->mrp_list); } |
14 14 14 14 14 14 14 14 14 14 14 14 22 15 6 15 1 15 14 14 14 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 | // SPDX-License-Identifier: GPL-2.0 #include <linux/fs.h> #include <linux/random.h> #include <linux/buffer_head.h> #include <linux/utsname.h> #include <linux/kthread.h> #include "ext4.h" /* Checksumming functions */ static __le32 ext4_mmp_csum(struct super_block *sb, struct mmp_struct *mmp) { struct ext4_sb_info *sbi = EXT4_SB(sb); int offset = offsetof(struct mmp_struct, mmp_checksum); __u32 csum; csum = ext4_chksum(sbi, sbi->s_csum_seed, (char *)mmp, offset); return cpu_to_le32(csum); } static int ext4_mmp_csum_verify(struct super_block *sb, struct mmp_struct *mmp) { if (!ext4_has_metadata_csum(sb)) return 1; return mmp->mmp_checksum == ext4_mmp_csum(sb, mmp); } static void ext4_mmp_csum_set(struct super_block *sb, struct mmp_struct *mmp) { if (!ext4_has_metadata_csum(sb)) return; mmp->mmp_checksum = ext4_mmp_csum(sb, mmp); } /* * Write the MMP block using REQ_SYNC to try to get the block on-disk * faster. */ static int write_mmp_block_thawed(struct super_block *sb, struct buffer_head *bh) { struct mmp_struct *mmp = (struct mmp_struct *)(bh->b_data); ext4_mmp_csum_set(sb, mmp); lock_buffer(bh); bh->b_end_io = end_buffer_write_sync; get_bh(bh); submit_bh(REQ_OP_WRITE | REQ_SYNC | REQ_META | REQ_PRIO, bh); wait_on_buffer(bh); if (unlikely(!buffer_uptodate(bh))) return -EIO; return 0; } static int write_mmp_block(struct super_block *sb, struct buffer_head *bh) { int err; /* * We protect against freezing so that we don't create dirty buffers * on frozen filesystem. */ sb_start_write(sb); err = write_mmp_block_thawed(sb, bh); sb_end_write(sb); return err; } /* * Read the MMP block. It _must_ be read from disk and hence we clear the * uptodate flag on the buffer. */ static int read_mmp_block(struct super_block *sb, struct buffer_head **bh, ext4_fsblk_t mmp_block) { struct mmp_struct *mmp; int ret; if (*bh) clear_buffer_uptodate(*bh); /* This would be sb_bread(sb, mmp_block), except we need to be sure * that the MD RAID device cache has been bypassed, and that the read * is not blocked in the elevator. */ if (!*bh) { *bh = sb_getblk(sb, mmp_block); if (!*bh) { ret = -ENOMEM; goto warn_exit; } } lock_buffer(*bh); ret = ext4_read_bh(*bh, REQ_META | REQ_PRIO, NULL); if (ret) goto warn_exit; mmp = (struct mmp_struct *)((*bh)->b_data); if (le32_to_cpu(mmp->mmp_magic) != EXT4_MMP_MAGIC) { ret = -EFSCORRUPTED; goto warn_exit; } if (!ext4_mmp_csum_verify(sb, mmp)) { ret = -EFSBADCRC; goto warn_exit; } return 0; warn_exit: brelse(*bh); *bh = NULL; ext4_warning(sb, "Error %d while reading MMP block %llu", ret, mmp_block); return ret; } /* * Dump as much information as possible to help the admin. */ void __dump_mmp_msg(struct super_block *sb, struct mmp_struct *mmp, const char *function, unsigned int line, const char *msg) { __ext4_warning(sb, function, line, "%s", msg); __ext4_warning(sb, function, line, "MMP failure info: last update time: %llu, last update node: %.*s, last update device: %.*s", (unsigned long long)le64_to_cpu(mmp->mmp_time), (int)sizeof(mmp->mmp_nodename), mmp->mmp_nodename, (int)sizeof(mmp->mmp_bdevname), mmp->mmp_bdevname); } /* * kmmpd will update the MMP sequence every s_mmp_update_interval seconds */ static int kmmpd(void *data) { struct super_block *sb = data; struct ext4_super_block *es = EXT4_SB(sb)->s_es; struct buffer_head *bh = EXT4_SB(sb)->s_mmp_bh; struct mmp_struct *mmp; ext4_fsblk_t mmp_block; u32 seq = 0; unsigned long failed_writes = 0; int mmp_update_interval = le16_to_cpu(es->s_mmp_update_interval); unsigned mmp_check_interval; unsigned long last_update_time; unsigned long diff; int retval = 0; mmp_block = le64_to_cpu(es->s_mmp_block); mmp = (struct mmp_struct *)(bh->b_data); mmp->mmp_time = cpu_to_le64(ktime_get_real_seconds()); /* * Start with the higher mmp_check_interval and reduce it if * the MMP block is being updated on time. */ mmp_check_interval = max(EXT4_MMP_CHECK_MULT * mmp_update_interval, EXT4_MMP_MIN_CHECK_INTERVAL); mmp->mmp_check_interval = cpu_to_le16(mmp_check_interval); memcpy(mmp->mmp_nodename, init_utsname()->nodename, sizeof(mmp->mmp_nodename)); while (!kthread_should_stop() && !ext4_forced_shutdown(sb)) { if (!ext4_has_feature_mmp(sb)) { ext4_warning(sb, "kmmpd being stopped since MMP feature" " has been disabled."); goto wait_to_exit; } if (++seq > EXT4_MMP_SEQ_MAX) seq = 1; mmp->mmp_seq = cpu_to_le32(seq); mmp->mmp_time = cpu_to_le64(ktime_get_real_seconds()); last_update_time = jiffies; retval = write_mmp_block(sb, bh); /* * Don't spew too many error messages. Print one every * (s_mmp_update_interval * 60) seconds. */ if (retval) { if ((failed_writes % 60) == 0) { ext4_error_err(sb, -retval, "Error writing to MMP block"); } failed_writes++; } diff = jiffies - last_update_time; if (diff < mmp_update_interval * HZ) schedule_timeout_interruptible(mmp_update_interval * HZ - diff); /* * We need to make sure that more than mmp_check_interval * seconds have not passed since writing. If that has happened * we need to check if the MMP block is as we left it. */ diff = jiffies - last_update_time; if (diff > mmp_check_interval * HZ) { struct buffer_head *bh_check = NULL; struct mmp_struct *mmp_check; retval = read_mmp_block(sb, &bh_check, mmp_block); if (retval) { ext4_error_err(sb, -retval, "error reading MMP data: %d", retval); goto wait_to_exit; } mmp_check = (struct mmp_struct *)(bh_check->b_data); if (mmp->mmp_seq != mmp_check->mmp_seq || memcmp(mmp->mmp_nodename, mmp_check->mmp_nodename, sizeof(mmp->mmp_nodename))) { dump_mmp_msg(sb, mmp_check, "Error while updating MMP info. " "The filesystem seems to have been" " multiply mounted."); ext4_error_err(sb, EBUSY, "abort"); put_bh(bh_check); retval = -EBUSY; goto wait_to_exit; } put_bh(bh_check); } /* * Adjust the mmp_check_interval depending on how much time * it took for the MMP block to be written. */ mmp_check_interval = max(min(EXT4_MMP_CHECK_MULT * diff / HZ, EXT4_MMP_MAX_CHECK_INTERVAL), EXT4_MMP_MIN_CHECK_INTERVAL); mmp->mmp_check_interval = cpu_to_le16(mmp_check_interval); } /* * Unmount seems to be clean. */ mmp->mmp_seq = cpu_to_le32(EXT4_MMP_SEQ_CLEAN); mmp->mmp_time = cpu_to_le64(ktime_get_real_seconds()); retval = write_mmp_block(sb, bh); wait_to_exit: while (!kthread_should_stop()) { set_current_state(TASK_INTERRUPTIBLE); if (!kthread_should_stop()) schedule(); } set_current_state(TASK_RUNNING); return retval; } void ext4_stop_mmpd(struct ext4_sb_info *sbi) { if (sbi->s_mmp_tsk) { kthread_stop(sbi->s_mmp_tsk); brelse(sbi->s_mmp_bh); sbi->s_mmp_tsk = NULL; } } /* * Get a random new sequence number but make sure it is not greater than * EXT4_MMP_SEQ_MAX. */ static unsigned int mmp_new_seq(void) { return get_random_u32_below(EXT4_MMP_SEQ_MAX + 1); } /* * Protect the filesystem from being mounted more than once. */ int ext4_multi_mount_protect(struct super_block *sb, ext4_fsblk_t mmp_block) { struct ext4_super_block *es = EXT4_SB(sb)->s_es; struct buffer_head *bh = NULL; struct mmp_struct *mmp = NULL; u32 seq; unsigned int mmp_check_interval = le16_to_cpu(es->s_mmp_update_interval); unsigned int wait_time = 0; int retval; if (mmp_block < le32_to_cpu(es->s_first_data_block) || mmp_block >= ext4_blocks_count(es)) { ext4_warning(sb, "Invalid MMP block in superblock"); retval = -EINVAL; goto failed; } retval = read_mmp_block(sb, &bh, mmp_block); if (retval) goto failed; mmp = (struct mmp_struct *)(bh->b_data); if (mmp_check_interval < EXT4_MMP_MIN_CHECK_INTERVAL) mmp_check_interval = EXT4_MMP_MIN_CHECK_INTERVAL; /* * If check_interval in MMP block is larger, use that instead of * update_interval from the superblock. */ if (le16_to_cpu(mmp->mmp_check_interval) > mmp_check_interval) mmp_check_interval = le16_to_cpu(mmp->mmp_check_interval); seq = le32_to_cpu(mmp->mmp_seq); if (seq == EXT4_MMP_SEQ_CLEAN) goto skip; if (seq == EXT4_MMP_SEQ_FSCK) { dump_mmp_msg(sb, mmp, "fsck is running on the filesystem"); retval = -EBUSY; goto failed; } wait_time = min(mmp_check_interval * 2 + 1, mmp_check_interval + 60); /* Print MMP interval if more than 20 secs. */ if (wait_time > EXT4_MMP_MIN_CHECK_INTERVAL * 4) ext4_warning(sb, "MMP interval %u higher than expected, please" " wait.\n", wait_time * 2); if (schedule_timeout_interruptible(HZ * wait_time) != 0) { ext4_warning(sb, "MMP startup interrupted, failing mount\n"); retval = -ETIMEDOUT; goto failed; } retval = read_mmp_block(sb, &bh, mmp_block); if (retval) goto failed; mmp = (struct mmp_struct *)(bh->b_data); if (seq != le32_to_cpu(mmp->mmp_seq)) { dump_mmp_msg(sb, mmp, "Device is already active on another node."); retval = -EBUSY; goto failed; } skip: /* * write a new random sequence number. */ seq = mmp_new_seq(); mmp->mmp_seq = cpu_to_le32(seq); /* * On mount / remount we are protected against fs freezing (by s_umount * semaphore) and grabbing freeze protection upsets lockdep */ retval = write_mmp_block_thawed(sb, bh); if (retval) goto failed; /* * wait for MMP interval and check mmp_seq. */ if (schedule_timeout_interruptible(HZ * wait_time) != 0) { ext4_warning(sb, "MMP startup interrupted, failing mount"); retval = -ETIMEDOUT; goto failed; } retval = read_mmp_block(sb, &bh, mmp_block); if (retval) goto failed; mmp = (struct mmp_struct *)(bh->b_data); if (seq != le32_to_cpu(mmp->mmp_seq)) { dump_mmp_msg(sb, mmp, "Device is already active on another node."); retval = -EBUSY; goto failed; } EXT4_SB(sb)->s_mmp_bh = bh; BUILD_BUG_ON(sizeof(mmp->mmp_bdevname) < BDEVNAME_SIZE); snprintf(mmp->mmp_bdevname, sizeof(mmp->mmp_bdevname), "%pg", bh->b_bdev); /* * Start a kernel thread to update the MMP block periodically. */ EXT4_SB(sb)->s_mmp_tsk = kthread_run(kmmpd, sb, "kmmpd-%.*s", (int)sizeof(mmp->mmp_bdevname), mmp->mmp_bdevname); if (IS_ERR(EXT4_SB(sb)->s_mmp_tsk)) { EXT4_SB(sb)->s_mmp_tsk = NULL; ext4_warning(sb, "Unable to create kmmpd thread for %s.", sb->s_id); retval = -ENOMEM; goto failed; } return 0; failed: brelse(bh); return retval; } |
3 9 6 3 5 5 5 3 2 2 2 2 5 1 1 1 1 3 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 150 5 4 1 2 3 5 1 10 4 12 6 13 14 14 7 8 8 7 9 4 8 9 2 2 2 2 2 2 134 139 135 5 5 5 4 5 5 2 2 2 2 6 3 4 132 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 | // SPDX-License-Identifier: GPL-2.0 /* * gendisk handling * * Portions Copyright (C) 2020 Christoph Hellwig */ #include <linux/module.h> #include <linux/ctype.h> #include <linux/fs.h> #include <linux/kdev_t.h> #include <linux/kernel.h> #include <linux/blkdev.h> #include <linux/backing-dev.h> #include <linux/init.h> #include <linux/spinlock.h> #include <linux/proc_fs.h> #include <linux/seq_file.h> #include <linux/slab.h> #include <linux/kmod.h> #include <linux/major.h> #include <linux/mutex.h> #include <linux/idr.h> #include <linux/log2.h> #include <linux/pm_runtime.h> #include <linux/badblocks.h> #include <linux/part_stat.h> #include <linux/blktrace_api.h> #include "blk-throttle.h" #include "blk.h" #include "blk-mq-sched.h" #include "blk-rq-qos.h" #include "blk-cgroup.h" static struct kobject *block_depr; /* * Unique, monotonically increasing sequential number associated with block * devices instances (i.e. incremented each time a device is attached). * Associating uevents with block devices in userspace is difficult and racy: * the uevent netlink socket is lossy, and on slow and overloaded systems has * a very high latency. * Block devices do not have exclusive owners in userspace, any process can set * one up (e.g. loop devices). Moreover, device names can be reused (e.g. loop0 * can be reused again and again). * A userspace process setting up a block device and watching for its events * cannot thus reliably tell whether an event relates to the device it just set * up or another earlier instance with the same name. * This sequential number allows userspace processes to solve this problem, and * uniquely associate an uevent to the lifetime to a device. */ static atomic64_t diskseq; /* for extended dynamic devt allocation, currently only one major is used */ #define NR_EXT_DEVT (1 << MINORBITS) static DEFINE_IDA(ext_devt_ida); void set_capacity(struct gendisk *disk, sector_t sectors) { bdev_set_nr_sectors(disk->part0, sectors); } EXPORT_SYMBOL(set_capacity); /* * Set disk capacity and notify if the size is not currently zero and will not * be set to zero. Returns true if a uevent was sent, otherwise false. */ bool set_capacity_and_notify(struct gendisk *disk, sector_t size) { sector_t capacity = get_capacity(disk); char *envp[] = { "RESIZE=1", NULL }; set_capacity(disk, size); /* * Only print a message and send a uevent if the gendisk is user visible * and alive. This avoids spamming the log and udev when setting the * initial capacity during probing. */ if (size == capacity || !disk_live(disk) || (disk->flags & GENHD_FL_HIDDEN)) return false; pr_info("%s: detected capacity change from %lld to %lld\n", disk->disk_name, capacity, size); /* * Historically we did not send a uevent for changes to/from an empty * device. */ if (!capacity || !size) return false; kobject_uevent_env(&disk_to_dev(disk)->kobj, KOBJ_CHANGE, envp); return true; } EXPORT_SYMBOL_GPL(set_capacity_and_notify); static void part_stat_read_all(struct block_device *part, struct disk_stats *stat) { int cpu; memset(stat, 0, sizeof(struct disk_stats)); for_each_possible_cpu(cpu) { struct disk_stats *ptr = per_cpu_ptr(part->bd_stats, cpu); int group; for (group = 0; group < NR_STAT_GROUPS; group++) { stat->nsecs[group] += ptr->nsecs[group]; stat->sectors[group] += ptr->sectors[group]; stat->ios[group] += ptr->ios[group]; stat->merges[group] += ptr->merges[group]; } stat->io_ticks += ptr->io_ticks; } } static unsigned int part_in_flight(struct block_device *part) { unsigned int inflight = 0; int cpu; for_each_possible_cpu(cpu) { inflight += part_stat_local_read_cpu(part, in_flight[0], cpu) + part_stat_local_read_cpu(part, in_flight[1], cpu); } if ((int)inflight < 0) inflight = 0; return inflight; } static void part_in_flight_rw(struct block_device *part, unsigned int inflight[2]) { int cpu; inflight[0] = 0; inflight[1] = 0; for_each_possible_cpu(cpu) { inflight[0] += part_stat_local_read_cpu(part, in_flight[0], cpu); inflight[1] += part_stat_local_read_cpu(part, in_flight[1], cpu); } if ((int)inflight[0] < 0) inflight[0] = 0; if ((int)inflight[1] < 0) inflight[1] = 0; } /* * Can be deleted altogether. Later. * */ #define BLKDEV_MAJOR_HASH_SIZE 255 static struct blk_major_name { struct blk_major_name *next; int major; char name[16]; #ifdef CONFIG_BLOCK_LEGACY_AUTOLOAD void (*probe)(dev_t devt); #endif } *major_names[BLKDEV_MAJOR_HASH_SIZE]; static DEFINE_MUTEX(major_names_lock); static DEFINE_SPINLOCK(major_names_spinlock); /* index in the above - for now: assume no multimajor ranges */ static inline int major_to_index(unsigned major) { return major % BLKDEV_MAJOR_HASH_SIZE; } #ifdef CONFIG_PROC_FS void blkdev_show(struct seq_file *seqf, off_t offset) { struct blk_major_name *dp; spin_lock(&major_names_spinlock); for (dp = major_names[major_to_index(offset)]; dp; dp = dp->next) if (dp->major == offset) seq_printf(seqf, "%3d %s\n", dp->major, dp->name); spin_unlock(&major_names_spinlock); } #endif /* CONFIG_PROC_FS */ /** * __register_blkdev - register a new block device * * @major: the requested major device number [1..BLKDEV_MAJOR_MAX-1]. If * @major = 0, try to allocate any unused major number. * @name: the name of the new block device as a zero terminated string * @probe: pre-devtmpfs / pre-udev callback used to create disks when their * pre-created device node is accessed. When a probe call uses * add_disk() and it fails the driver must cleanup resources. This * interface may soon be removed. * * The @name must be unique within the system. * * The return value depends on the @major input parameter: * * - if a major device number was requested in range [1..BLKDEV_MAJOR_MAX-1] * then the function returns zero on success, or a negative error code * - if any unused major number was requested with @major = 0 parameter * then the return value is the allocated major number in range * [1..BLKDEV_MAJOR_MAX-1] or a negative error code otherwise * * See Documentation/admin-guide/devices.txt for the list of allocated * major numbers. * * Use register_blkdev instead for any new code. */ int __register_blkdev(unsigned int major, const char *name, void (*probe)(dev_t devt)) { struct blk_major_name **n, *p; int index, ret = 0; mutex_lock(&major_names_lock); /* temporary */ if (major == 0) { for (index = ARRAY_SIZE(major_names)-1; index > 0; index--) { if (major_names[index] == NULL) break; } if (index == 0) { printk("%s: failed to get major for %s\n", __func__, name); ret = -EBUSY; goto out; } major = index; ret = major; } if (major >= BLKDEV_MAJOR_MAX) { pr_err("%s: major requested (%u) is greater than the maximum (%u) for %s\n", __func__, major, BLKDEV_MAJOR_MAX-1, name); ret = -EINVAL; goto out; } p = kmalloc(sizeof(struct blk_major_name), GFP_KERNEL); if (p == NULL) { ret = -ENOMEM; goto out; } p->major = major; #ifdef CONFIG_BLOCK_LEGACY_AUTOLOAD p->probe = probe; #endif strscpy(p->name, name, sizeof(p->name)); p->next = NULL; index = major_to_index(major); spin_lock(&major_names_spinlock); for (n = &major_names[index]; *n; n = &(*n)->next) { if ((*n)->major == major) break; } if (!*n) *n = p; else ret = -EBUSY; spin_unlock(&major_names_spinlock); if (ret < 0) { printk("register_blkdev: cannot get major %u for %s\n", major, name); kfree(p); } out: mutex_unlock(&major_names_lock); return ret; } EXPORT_SYMBOL(__register_blkdev); void unregister_blkdev(unsigned int major, const char *name) { struct blk_major_name **n; struct blk_major_name *p = NULL; int index = major_to_index(major); mutex_lock(&major_names_lock); spin_lock(&major_names_spinlock); for (n = &major_names[index]; *n; n = &(*n)->next) if ((*n)->major == major) break; if (!*n || strcmp((*n)->name, name)) { WARN_ON(1); } else { p = *n; *n = p->next; } spin_unlock(&major_names_spinlock); mutex_unlock(&major_names_lock); kfree(p); } EXPORT_SYMBOL(unregister_blkdev); int blk_alloc_ext_minor(void) { int idx; idx = ida_alloc_range(&ext_devt_ida, 0, NR_EXT_DEVT - 1, GFP_KERNEL); if (idx == -ENOSPC) return -EBUSY; return idx; } void blk_free_ext_minor(unsigned int minor) { ida_free(&ext_devt_ida, minor); } void disk_uevent(struct gendisk *disk, enum kobject_action action) { struct block_device *part; unsigned long idx; rcu_read_lock(); xa_for_each(&disk->part_tbl, idx, part) { if (bdev_is_partition(part) && !bdev_nr_sectors(part)) continue; if (!kobject_get_unless_zero(&part->bd_device.kobj)) continue; rcu_read_unlock(); kobject_uevent(bdev_kobj(part), action); put_device(&part->bd_device); rcu_read_lock(); } rcu_read_unlock(); } EXPORT_SYMBOL_GPL(disk_uevent); int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode) { struct file *file; int ret = 0; if (disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN)) return -EINVAL; if (test_bit(GD_SUPPRESS_PART_SCAN, &disk->state)) return -EINVAL; if (disk->open_partitions) return -EBUSY; /* * If the device is opened exclusively by current thread already, it's * safe to scan partitons, otherwise, use bd_prepare_to_claim() to * synchronize with other exclusive openers and other partition * scanners. */ if (!(mode & BLK_OPEN_EXCL)) { ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions, NULL); if (ret) return ret; } set_bit(GD_NEED_PART_SCAN, &disk->state); file = bdev_file_open_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL, NULL, NULL); if (IS_ERR(file)) ret = PTR_ERR(file); else fput(file); /* * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set, * and this will cause that re-assemble partitioned raid device will * creat partition for underlying disk. */ clear_bit(GD_NEED_PART_SCAN, &disk->state); if (!(mode & BLK_OPEN_EXCL)) bd_abort_claiming(disk->part0, disk_scan_partitions); return ret; } /** * device_add_disk - add disk information to kernel list * @parent: parent device for the disk * @disk: per-device partitioning information * @groups: Additional per-device sysfs groups * * This function registers the partitioning information in @disk * with the kernel. */ int __must_check device_add_disk(struct device *parent, struct gendisk *disk, const struct attribute_group **groups) { struct device *ddev = disk_to_dev(disk); int ret; /* Only makes sense for bio-based to set ->poll_bio */ if (queue_is_mq(disk->queue) && disk->fops->poll_bio) return -EINVAL; /* * The disk queue should now be all set with enough information about * the device for the elevator code to pick an adequate default * elevator if one is needed, that is, for devices requesting queue * registration. */ elevator_init_mq(disk->queue); /* Mark bdev as having a submit_bio, if needed */ disk->part0->bd_has_submit_bio = disk->fops->submit_bio != NULL; /* * If the driver provides an explicit major number it also must provide * the number of minors numbers supported, and those will be used to * setup the gendisk. * Otherwise just allocate the device numbers for both the whole device * and all partitions from the extended dev_t space. */ ret = -EINVAL; if (disk->major) { if (WARN_ON(!disk->minors)) goto out_exit_elevator; if (disk->minors > DISK_MAX_PARTS) { pr_err("block: can't allocate more than %d partitions\n", DISK_MAX_PARTS); disk->minors = DISK_MAX_PARTS; } if (disk->first_minor > MINORMASK || disk->minors > MINORMASK + 1 || disk->first_minor + disk->minors > MINORMASK + 1) goto out_exit_elevator; } else { if (WARN_ON(disk->minors)) goto out_exit_elevator; ret = blk_alloc_ext_minor(); if (ret < 0) goto out_exit_elevator; disk->major = BLOCK_EXT_MAJOR; disk->first_minor = ret; } /* delay uevents, until we scanned partition table */ dev_set_uevent_suppress(ddev, 1); ddev->parent = parent; ddev->groups = groups; dev_set_name(ddev, "%s", disk->disk_name); if (!(disk->flags & GENHD_FL_HIDDEN)) ddev->devt = MKDEV(disk->major, disk->first_minor); ret = device_add(ddev); if (ret) goto out_free_ext_minor; ret = disk_alloc_events(disk); if (ret) goto out_device_del; ret = sysfs_create_link(block_depr, &ddev->kobj, kobject_name(&ddev->kobj)); if (ret) goto out_device_del; /* * avoid probable deadlock caused by allocating memory with * GFP_KERNEL in runtime_resume callback of its all ancestor * devices */ pm_runtime_set_memalloc_noio(ddev, true); disk->part0->bd_holder_dir = kobject_create_and_add("holders", &ddev->kobj); if (!disk->part0->bd_holder_dir) { ret = -ENOMEM; goto out_del_block_link; } disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj); if (!disk->slave_dir) { ret = -ENOMEM; goto out_put_holder_dir; } ret = blk_register_queue(disk); if (ret) goto out_put_slave_dir; if (!(disk->flags & GENHD_FL_HIDDEN)) { ret = bdi_register(disk->bdi, "%u:%u", disk->major, disk->first_minor); if (ret) goto out_unregister_queue; bdi_set_owner(disk->bdi, ddev); ret = sysfs_create_link(&ddev->kobj, &disk->bdi->dev->kobj, "bdi"); if (ret) goto out_unregister_bdi; /* Make sure the first partition scan will be proceed */ if (get_capacity(disk) && !(disk->flags & GENHD_FL_NO_PART) && !test_bit(GD_SUPPRESS_PART_SCAN, &disk->state)) set_bit(GD_NEED_PART_SCAN, &disk->state); bdev_add(disk->part0, ddev->devt); if (get_capacity(disk)) disk_scan_partitions(disk, BLK_OPEN_READ); /* * Announce the disk and partitions after all partitions are * created. (for hidden disks uevents remain suppressed forever) */ dev_set_uevent_suppress(ddev, 0); disk_uevent(disk, KOBJ_ADD); } else { /* * Even if the block_device for a hidden gendisk is not * registered, it needs to have a valid bd_dev so that the * freeing of the dynamic major works. */ disk->part0->bd_dev = MKDEV(disk->major, disk->first_minor); } disk_update_readahead(disk); disk_add_events(disk); set_bit(GD_ADDED, &disk->state); return 0; out_unregister_bdi: if (!(disk->flags & GENHD_FL_HIDDEN)) bdi_unregister(disk->bdi); out_unregister_queue: blk_unregister_queue(disk); rq_qos_exit(disk->queue); out_put_slave_dir: kobject_put(disk->slave_dir); disk->slave_dir = NULL; out_put_holder_dir: kobject_put(disk->part0->bd_holder_dir); out_del_block_link: sysfs_remove_link(block_depr, dev_name(ddev)); pm_runtime_set_memalloc_noio(ddev, false); out_device_del: device_del(ddev); out_free_ext_minor: if (disk->major == BLOCK_EXT_MAJOR) blk_free_ext_minor(disk->first_minor); out_exit_elevator: if (disk->queue->elevator) elevator_exit(disk->queue); return ret; } EXPORT_SYMBOL(device_add_disk); static void blk_report_disk_dead(struct gendisk *disk, bool surprise) { struct block_device *bdev; unsigned long idx; /* * On surprise disk removal, bdev_mark_dead() may call into file * systems below. Make it clear that we're expecting to not hold * disk->open_mutex. */ lockdep_assert_not_held(&disk->open_mutex); rcu_read_lock(); xa_for_each(&disk->part_tbl, idx, bdev) { if (!kobject_get_unless_zero(&bdev->bd_device.kobj)) continue; rcu_read_unlock(); bdev_mark_dead(bdev, surprise); put_device(&bdev->bd_device); rcu_read_lock(); } rcu_read_unlock(); } static void __blk_mark_disk_dead(struct gendisk *disk) { /* * Fail any new I/O. */ if (test_and_set_bit(GD_DEAD, &disk->state)) return; if (test_bit(GD_OWNS_QUEUE, &disk->state)) blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue); /* * Stop buffered writers from dirtying pages that can't be written out. */ set_capacity(disk, 0); /* * Prevent new I/O from crossing bio_queue_enter(). */ blk_queue_start_drain(disk->queue); } /** * blk_mark_disk_dead - mark a disk as dead * @disk: disk to mark as dead * * Mark as disk as dead (e.g. surprise removed) and don't accept any new I/O * to this disk. */ void blk_mark_disk_dead(struct gendisk *disk) { __blk_mark_disk_dead(disk); blk_report_disk_dead(disk, true); } EXPORT_SYMBOL_GPL(blk_mark_disk_dead); /** * del_gendisk - remove the gendisk * @disk: the struct gendisk to remove * * Removes the gendisk and all its associated resources. This deletes the * partitions associated with the gendisk, and unregisters the associated * request_queue. * * This is the counter to the respective __device_add_disk() call. * * The final removal of the struct gendisk happens when its refcount reaches 0 * with put_disk(), which should be called after del_gendisk(), if * __device_add_disk() was used. * * Drivers exist which depend on the release of the gendisk to be synchronous, * it should not be deferred. * * Context: can sleep */ void del_gendisk(struct gendisk *disk) { struct request_queue *q = disk->queue; struct block_device *part; unsigned long idx; might_sleep(); if (WARN_ON_ONCE(!disk_live(disk) && !(disk->flags & GENHD_FL_HIDDEN))) return; disk_del_events(disk); /* * Prevent new openers by unlinked the bdev inode. */ mutex_lock(&disk->open_mutex); xa_for_each(&disk->part_tbl, idx, part) remove_inode_hash(part->bd_inode); mutex_unlock(&disk->open_mutex); /* * Tell the file system to write back all dirty data and shut down if * it hasn't been notified earlier. */ if (!test_bit(GD_DEAD, &disk->state)) blk_report_disk_dead(disk, false); __blk_mark_disk_dead(disk); /* * Drop all partitions now that the disk is marked dead. */ mutex_lock(&disk->open_mutex); xa_for_each_start(&disk->part_tbl, idx, part, 1) drop_partition(part); mutex_unlock(&disk->open_mutex); if (!(disk->flags & GENHD_FL_HIDDEN)) { sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi"); /* * Unregister bdi before releasing device numbers (as they can * get reused and we'd get clashes in sysfs). */ bdi_unregister(disk->bdi); } blk_unregister_queue(disk); kobject_put(disk->part0->bd_holder_dir); kobject_put(disk->slave_dir); disk->slave_dir = NULL; part_stat_set_all(disk->part0, 0); disk->part0->bd_stamp = 0; sysfs_remove_link(block_depr, dev_name(disk_to_dev(disk))); pm_runtime_set_memalloc_noio(disk_to_dev(disk), false); device_del(disk_to_dev(disk)); blk_mq_freeze_queue_wait(q); blk_throtl_cancel_bios(disk); blk_sync_queue(q); blk_flush_integrity(); if (queue_is_mq(q)) blk_mq_cancel_work_sync(q); blk_mq_quiesce_queue(q); if (q->elevator) { mutex_lock(&q->sysfs_lock); elevator_exit(q); mutex_unlock(&q->sysfs_lock); } rq_qos_exit(q); blk_mq_unquiesce_queue(q); /* * If the disk does not own the queue, allow using passthrough requests * again. Else leave the queue frozen to fail all I/O. */ if (!test_bit(GD_OWNS_QUEUE, &disk->state)) { blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q); __blk_mq_unfreeze_queue(q, true); } else { if (queue_is_mq(q)) blk_mq_exit_queue(q); } } EXPORT_SYMBOL(del_gendisk); /** * invalidate_disk - invalidate the disk * @disk: the struct gendisk to invalidate * * A helper to invalidates the disk. It will clean the disk's associated * buffer/page caches and reset its internal states so that the disk * can be reused by the drivers. * * Context: can sleep */ void invalidate_disk(struct gendisk *disk) { struct block_device *bdev = disk->part0; invalidate_bdev(bdev); bdev->bd_inode->i_mapping->wb_err = 0; set_capacity(disk, 0); } EXPORT_SYMBOL(invalidate_disk); /* sysfs access to bad-blocks list. */ static ssize_t disk_badblocks_show(struct device *dev, struct device_attribute *attr, char *page) { struct gendisk *disk = dev_to_disk(dev); if (!disk->bb) return sprintf(page, "\n"); return badblocks_show(disk->bb, page, 0); } static ssize_t disk_badblocks_store(struct device *dev, struct device_attribute *attr, const char *page, size_t len) { struct gendisk *disk = dev_to_disk(dev); if (!disk->bb) return -ENXIO; return badblocks_store(disk->bb, page, len, 0); } #ifdef CONFIG_BLOCK_LEGACY_AUTOLOAD void blk_request_module(dev_t devt) { unsigned int major = MAJOR(devt); struct blk_major_name **n; mutex_lock(&major_names_lock); for (n = &major_names[major_to_index(major)]; *n; n = &(*n)->next) { if ((*n)->major == major && (*n)->probe) { (*n)->probe(devt); mutex_unlock(&major_names_lock); return; } } mutex_unlock(&major_names_lock); if (request_module("block-major-%d-%d", MAJOR(devt), MINOR(devt)) > 0) /* Make old-style 2.4 aliases work */ request_module("block-major-%d", MAJOR(devt)); } #endif /* CONFIG_BLOCK_LEGACY_AUTOLOAD */ #ifdef CONFIG_PROC_FS /* iterator */ static void *disk_seqf_start(struct seq_file *seqf, loff_t *pos) { loff_t skip = *pos; struct class_dev_iter *iter; struct device *dev; iter = kmalloc(sizeof(*iter), GFP_KERNEL); if (!iter) return ERR_PTR(-ENOMEM); seqf->private = iter; class_dev_iter_init(iter, &block_class, NULL, &disk_type); do { dev = class_dev_iter_next(iter); if (!dev) return NULL; } while (skip--); return dev_to_disk(dev); } static void *disk_seqf_next(struct seq_file *seqf, void *v, loff_t *pos) { struct device *dev; (*pos)++; dev = class_dev_iter_next(seqf->private); if (dev) return dev_to_disk(dev); return NULL; } static void disk_seqf_stop(struct seq_file *seqf, void *v) { struct class_dev_iter *iter = seqf->private; /* stop is called even after start failed :-( */ if (iter) { class_dev_iter_exit(iter); kfree(iter); seqf->private = NULL; } } static void *show_partition_start(struct seq_file *seqf, loff_t *pos) { void *p; p = disk_seqf_start(seqf, pos); if (!IS_ERR_OR_NULL(p) && !*pos) seq_puts(seqf, "major minor #blocks name\n\n"); return p; } static int show_partition(struct seq_file *seqf, void *v) { struct gendisk *sgp = v; struct block_device *part; unsigned long idx; if (!get_capacity(sgp) || (sgp->flags & GENHD_FL_HIDDEN)) return 0; rcu_read_lock(); xa_for_each(&sgp->part_tbl, idx, part) { if (!bdev_nr_sectors(part)) continue; seq_printf(seqf, "%4d %7d %10llu %pg\n", MAJOR(part->bd_dev), MINOR(part->bd_dev), bdev_nr_sectors(part) >> 1, part); } rcu_read_unlock(); return 0; } static const struct seq_operations partitions_op = { .start = show_partition_start, .next = disk_seqf_next, .stop = disk_seqf_stop, .show = show_partition }; #endif static int __init genhd_device_init(void) { int error; error = class_register(&block_class); if (unlikely(error)) return error; blk_dev_init(); register_blkdev(BLOCK_EXT_MAJOR, "blkext"); /* create top-level block dir */ block_depr = kobject_create_and_add("block", NULL); return 0; } subsys_initcall(genhd_device_init); static ssize_t disk_range_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%d\n", disk->minors); } static ssize_t disk_ext_range_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%d\n", (disk->flags & GENHD_FL_NO_PART) ? 1 : DISK_MAX_PARTS); } static ssize_t disk_removable_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%d\n", (disk->flags & GENHD_FL_REMOVABLE ? 1 : 0)); } static ssize_t disk_hidden_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%d\n", (disk->flags & GENHD_FL_HIDDEN ? 1 : 0)); } static ssize_t disk_ro_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%d\n", get_disk_ro(disk) ? 1 : 0); } ssize_t part_size_show(struct device *dev, struct device_attribute *attr, char *buf) { return sprintf(buf, "%llu\n", bdev_nr_sectors(dev_to_bdev(dev))); } ssize_t part_stat_show(struct device *dev, struct device_attribute *attr, char *buf) { struct block_device *bdev = dev_to_bdev(dev); struct request_queue *q = bdev_get_queue(bdev); struct disk_stats stat; unsigned int inflight; if (queue_is_mq(q)) inflight = blk_mq_in_flight(q, bdev); else inflight = part_in_flight(bdev); if (inflight) { part_stat_lock(); update_io_ticks(bdev, jiffies, true); part_stat_unlock(); } part_stat_read_all(bdev, &stat); return sprintf(buf, "%8lu %8lu %8llu %8u " "%8lu %8lu %8llu %8u " "%8u %8u %8u " "%8lu %8lu %8llu %8u " "%8lu %8u" "\n", stat.ios[STAT_READ], stat.merges[STAT_READ], (unsigned long long)stat.sectors[STAT_READ], (unsigned int)div_u64(stat.nsecs[STAT_READ], NSEC_PER_MSEC), stat.ios[STAT_WRITE], stat.merges[STAT_WRITE], (unsigned long long)stat.sectors[STAT_WRITE], (unsigned int)div_u64(stat.nsecs[STAT_WRITE], NSEC_PER_MSEC), inflight, jiffies_to_msecs(stat.io_ticks), (unsigned int)div_u64(stat.nsecs[STAT_READ] + stat.nsecs[STAT_WRITE] + stat.nsecs[STAT_DISCARD] + stat.nsecs[STAT_FLUSH], NSEC_PER_MSEC), stat.ios[STAT_DISCARD], stat.merges[STAT_DISCARD], (unsigned long long)stat.sectors[STAT_DISCARD], (unsigned int)div_u64(stat.nsecs[STAT_DISCARD], NSEC_PER_MSEC), stat.ios[STAT_FLUSH], (unsigned int)div_u64(stat.nsecs[STAT_FLUSH], NSEC_PER_MSEC)); } ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr, char *buf) { struct block_device *bdev = dev_to_bdev(dev); struct request_queue *q = bdev_get_queue(bdev); unsigned int inflight[2]; if (queue_is_mq(q)) blk_mq_in_flight_rw(q, bdev, inflight); else part_in_flight_rw(bdev, inflight); return sprintf(buf, "%8u %8u\n", inflight[0], inflight[1]); } static ssize_t disk_capability_show(struct device *dev, struct device_attribute *attr, char *buf) { dev_warn_once(dev, "the capability attribute has been deprecated.\n"); return sprintf(buf, "0\n"); } static ssize_t disk_alignment_offset_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%d\n", bdev_alignment_offset(disk->part0)); } static ssize_t disk_discard_alignment_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%d\n", bdev_alignment_offset(disk->part0)); } static ssize_t diskseq_show(struct device *dev, struct device_attribute *attr, char *buf) { struct gendisk *disk = dev_to_disk(dev); return sprintf(buf, "%llu\n", disk->diskseq); } static DEVICE_ATTR(range, 0444, disk_range_show, NULL); static DEVICE_ATTR(ext_range, 0444, disk_ext_range_show, NULL); static DEVICE_ATTR(removable, 0444, disk_removable_show, NULL); static DEVICE_ATTR(hidden, 0444, disk_hidden_show, NULL); static DEVICE_ATTR(ro, 0444, disk_ro_show, NULL); static DEVICE_ATTR(size, 0444, part_size_show, NULL); static DEVICE_ATTR(alignment_offset, 0444, disk_alignment_offset_show, NULL); static DEVICE_ATTR(discard_alignment, 0444, disk_discard_alignment_show, NULL); static DEVICE_ATTR(capability, 0444, disk_capability_show, NULL); static DEVICE_ATTR(stat, 0444, part_stat_show, NULL); static DEVICE_ATTR(inflight, 0444, part_inflight_show, NULL); static DEVICE_ATTR(badblocks, 0644, disk_badblocks_show, disk_badblocks_store); static DEVICE_ATTR(diskseq, 0444, diskseq_show, NULL); #ifdef CONFIG_FAIL_MAKE_REQUEST ssize_t part_fail_show(struct device *dev, struct device_attribute *attr, char *buf) { return sprintf(buf, "%d\n", dev_to_bdev(dev)->bd_make_it_fail); } ssize_t part_fail_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { int i; if (count > 0 && sscanf(buf, "%d", &i) > 0) dev_to_bdev(dev)->bd_make_it_fail = i; return count; } static struct device_attribute dev_attr_fail = __ATTR(make-it-fail, 0644, part_fail_show, part_fail_store); #endif /* CONFIG_FAIL_MAKE_REQUEST */ #ifdef CONFIG_FAIL_IO_TIMEOUT static struct device_attribute dev_attr_fail_timeout = __ATTR(io-timeout-fail, 0644, part_timeout_show, part_timeout_store); #endif static struct attribute *disk_attrs[] = { &dev_attr_range.attr, &dev_attr_ext_range.attr, &dev_attr_removable.attr, &dev_attr_hidden.attr, &dev_attr_ro.attr, &dev_attr_size.attr, &dev_attr_alignment_offset.attr, &dev_attr_discard_alignment.attr, &dev_attr_capability.attr, &dev_attr_stat.attr, &dev_attr_inflight.attr, &dev_attr_badblocks.attr, &dev_attr_events.attr, &dev_attr_events_async.attr, &dev_attr_events_poll_msecs.attr, &dev_attr_diskseq.attr, #ifdef CONFIG_FAIL_MAKE_REQUEST &dev_attr_fail.attr, #endif #ifdef CONFIG_FAIL_IO_TIMEOUT &dev_attr_fail_timeout.attr, #endif NULL }; static umode_t disk_visible(struct kobject *kobj, struct attribute *a, int n) { struct device *dev = container_of(kobj, typeof(*dev), kobj); struct gendisk *disk = dev_to_disk(dev); if (a == &dev_attr_badblocks.attr && !disk->bb) return 0; return a->mode; } static struct attribute_group disk_attr_group = { .attrs = disk_attrs, .is_visible = disk_visible, }; static const struct attribute_group *disk_attr_groups[] = { &disk_attr_group, #ifdef CONFIG_BLK_DEV_IO_TRACE &blk_trace_attr_group, #endif #ifdef CONFIG_BLK_DEV_INTEGRITY &blk_integrity_attr_group, #endif NULL }; /** * disk_release - releases all allocated resources of the gendisk * @dev: the device representing this disk * * This function releases all allocated resources of the gendisk. * * Drivers which used __device_add_disk() have a gendisk with a request_queue * assigned. Since the request_queue sits on top of the gendisk for these * drivers we also call blk_put_queue() for them, and we expect the * request_queue refcount to reach 0 at this point, and so the request_queue * will also be freed prior to the disk. * * Context: can sleep */ static void disk_release(struct device *dev) { struct gendisk *disk = dev_to_disk(dev); might_sleep(); WARN_ON_ONCE(disk_live(disk)); blk_trace_remove(disk->queue); /* * To undo the all initialization from blk_mq_init_allocated_queue in * case of a probe failure where add_disk is never called we have to * call blk_mq_exit_queue here. We can't do this for the more common * teardown case (yet) as the tagset can be gone by the time the disk * is released once it was added. */ if (queue_is_mq(disk->queue) && test_bit(GD_OWNS_QUEUE, &disk->state) && !test_bit(GD_ADDED, &disk->state)) blk_mq_exit_queue(disk->queue); blkcg_exit_disk(disk); bioset_exit(&disk->bio_split); disk_release_events(disk); kfree(disk->random); disk_free_zone_bitmaps(disk); xa_destroy(&disk->part_tbl); disk->queue->disk = NULL; blk_put_queue(disk->queue); if (test_bit(GD_ADDED, &disk->state) && disk->fops->free_disk) disk->fops->free_disk(disk); iput(disk->part0->bd_inode); /* frees the disk */ } static int block_uevent(const struct device *dev, struct kobj_uevent_env *env) { const struct gendisk *disk = dev_to_disk(dev); return add_uevent_var(env, "DISKSEQ=%llu", disk->diskseq); } const struct class block_class = { .name = "block", .dev_uevent = block_uevent, }; static char *block_devnode(const struct device *dev, umode_t *mode, kuid_t *uid, kgid_t *gid) { struct gendisk *disk = dev_to_disk(dev); if (disk->fops->devnode) return disk->fops->devnode(disk, mode); return NULL; } const struct device_type disk_type = { .name = "disk", .groups = disk_attr_groups, .release = disk_release, .devnode = block_devnode, }; #ifdef CONFIG_PROC_FS /* * aggregate disk stat collector. Uses the same stats that the sysfs * entries do, above, but makes them available through one seq_file. * * The output looks suspiciously like /proc/partitions with a bunch of * extra fields. */ static int diskstats_show(struct seq_file *seqf, void *v) { struct gendisk *gp = v; struct block_device *hd; unsigned int inflight; struct disk_stats stat; unsigned long idx; /* if (&disk_to_dev(gp)->kobj.entry == block_class.devices.next) seq_puts(seqf, "major minor name" " rio rmerge rsect ruse wio wmerge " "wsect wuse running use aveq" "\n\n"); */ rcu_read_lock(); xa_for_each(&gp->part_tbl, idx, hd) { if (bdev_is_partition(hd) && !bdev_nr_sectors(hd)) continue; if (queue_is_mq(gp->queue)) inflight = blk_mq_in_flight(gp->queue, hd); else inflight = part_in_flight(hd); if (inflight) { part_stat_lock(); update_io_ticks(hd, jiffies, true); part_stat_unlock(); } part_stat_read_all(hd, &stat); seq_printf(seqf, "%4d %7d %pg " "%lu %lu %lu %u " "%lu %lu %lu %u " "%u %u %u " "%lu %lu %lu %u " "%lu %u" "\n", MAJOR(hd->bd_dev), MINOR(hd->bd_dev), hd, stat.ios[STAT_READ], stat.merges[STAT_READ], stat.sectors[STAT_READ], (unsigned int)div_u64(stat.nsecs[STAT_READ], NSEC_PER_MSEC), stat.ios[STAT_WRITE], stat.merges[STAT_WRITE], stat.sectors[STAT_WRITE], (unsigned int)div_u64(stat.nsecs[STAT_WRITE], NSEC_PER_MSEC), inflight, jiffies_to_msecs(stat.io_ticks), (unsigned int)div_u64(stat.nsecs[STAT_READ] + stat.nsecs[STAT_WRITE] + stat.nsecs[STAT_DISCARD] + stat.nsecs[STAT_FLUSH], NSEC_PER_MSEC), stat.ios[STAT_DISCARD], stat.merges[STAT_DISCARD], stat.sectors[STAT_DISCARD], (unsigned int)div_u64(stat.nsecs[STAT_DISCARD], NSEC_PER_MSEC), stat.ios[STAT_FLUSH], (unsigned int)div_u64(stat.nsecs[STAT_FLUSH], NSEC_PER_MSEC) ); } rcu_read_unlock(); return 0; } static const struct seq_operations diskstats_op = { .start = disk_seqf_start, .next = disk_seqf_next, .stop = disk_seqf_stop, .show = diskstats_show }; static int __init proc_genhd_init(void) { proc_create_seq("diskstats", 0, NULL, &diskstats_op); proc_create_seq("partitions", 0, NULL, &partitions_op); return 0; } module_init(proc_genhd_init); #endif /* CONFIG_PROC_FS */ dev_t part_devt(struct gendisk *disk, u8 partno) { struct block_device *part; dev_t devt = 0; rcu_read_lock(); part = xa_load(&disk->part_tbl, partno); if (part) devt = part->bd_dev; rcu_read_unlock(); return devt; } struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id, struct lock_class_key *lkclass) { struct gendisk *disk; disk = kzalloc_node(sizeof(struct gendisk), GFP_KERNEL, node_id); if (!disk) return NULL; if (bioset_init(&disk->bio_split, BIO_POOL_SIZE, 0, 0)) goto out_free_disk; disk->bdi = bdi_alloc(node_id); if (!disk->bdi) goto out_free_bioset; /* bdev_alloc() might need the queue, set before the first call */ disk->queue = q; disk->part0 = bdev_alloc(disk, 0); if (!disk->part0) goto out_free_bdi; disk->node_id = node_id; mutex_init(&disk->open_mutex); xa_init(&disk->part_tbl); if (xa_insert(&disk->part_tbl, 0, disk->part0, GFP_KERNEL)) goto out_destroy_part_tbl; if (blkcg_init_disk(disk)) goto out_erase_part0; rand_initialize_disk(disk); disk_to_dev(disk)->class = &block_class; disk_to_dev(disk)->type = &disk_type; device_initialize(disk_to_dev(disk)); inc_diskseq(disk); q->disk = disk; lockdep_init_map(&disk->lockdep_map, "(bio completion)", lkclass, 0); #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED INIT_LIST_HEAD(&disk->slave_bdevs); #endif return disk; out_erase_part0: xa_erase(&disk->part_tbl, 0); out_destroy_part_tbl: xa_destroy(&disk->part_tbl); disk->part0->bd_disk = NULL; iput(disk->part0->bd_inode); out_free_bdi: bdi_put(disk->bdi); out_free_bioset: bioset_exit(&disk->bio_split); out_free_disk: kfree(disk); return NULL; } struct gendisk *__blk_alloc_disk(struct queue_limits *lim, int node, struct lock_class_key *lkclass) { struct queue_limits default_lim = { }; struct request_queue *q; struct gendisk *disk; q = blk_alloc_queue(lim ? lim : &default_lim, node); if (IS_ERR(q)) return ERR_CAST(q); disk = __alloc_disk_node(q, node, lkclass); if (!disk) { blk_put_queue(q); return ERR_PTR(-ENOMEM); } set_bit(GD_OWNS_QUEUE, &disk->state); return disk; } EXPORT_SYMBOL(__blk_alloc_disk); /** * put_disk - decrements the gendisk refcount * @disk: the struct gendisk to decrement the refcount for * * This decrements the refcount for the struct gendisk. When this reaches 0 * we'll have disk_release() called. * * Note: for blk-mq disk put_disk must be called before freeing the tag_set * when handling probe errors (that is before add_disk() is called). * * Context: Any context, but the last reference must not be dropped from * atomic context. */ void put_disk(struct gendisk *disk) { if (disk) put_device(disk_to_dev(disk)); } EXPORT_SYMBOL(put_disk); static void set_disk_ro_uevent(struct gendisk *gd, int ro) { char event[] = "DISK_RO=1"; char *envp[] = { event, NULL }; if (!ro) event[8] = '0'; kobject_uevent_env(&disk_to_dev(gd)->kobj, KOBJ_CHANGE, envp); } /** * set_disk_ro - set a gendisk read-only * @disk: gendisk to operate on * @read_only: %true to set the disk read-only, %false set the disk read/write * * This function is used to indicate whether a given disk device should have its * read-only flag set. set_disk_ro() is typically used by device drivers to * indicate whether the underlying physical device is write-protected. */ void set_disk_ro(struct gendisk *disk, bool read_only) { if (read_only) { if (test_and_set_bit(GD_READ_ONLY, &disk->state)) return; } else { if (!test_and_clear_bit(GD_READ_ONLY, &disk->state)) return; } set_disk_ro_uevent(disk, read_only); } EXPORT_SYMBOL(set_disk_ro); void inc_diskseq(struct gendisk *disk) { disk->diskseq = atomic64_inc_return(&diskseq); } |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | // SPDX-License-Identifier: GPL-2.0 #ifndef _LINUX_MM_SLOT_H #define _LINUX_MM_SLOT_H #include <linux/hashtable.h> #include <linux/slab.h> /* * struct mm_slot - hash lookup from mm to mm_slot * @hash: link to the mm_slots hash list * @mm_node: link into the mm_slots list * @mm: the mm that this information is valid for */ struct mm_slot { struct hlist_node hash; struct list_head mm_node; struct mm_struct *mm; }; #define mm_slot_entry(ptr, type, member) \ container_of(ptr, type, member) static inline void *mm_slot_alloc(struct kmem_cache *cache) { if (!cache) /* initialization failed */ return NULL; return kmem_cache_zalloc(cache, GFP_KERNEL); } static inline void mm_slot_free(struct kmem_cache *cache, void *objp) { kmem_cache_free(cache, objp); } #define mm_slot_lookup(_hashtable, _mm) \ ({ \ struct mm_slot *tmp_slot, *mm_slot = NULL; \ \ hash_for_each_possible(_hashtable, tmp_slot, hash, (unsigned long)_mm) \ if (_mm == tmp_slot->mm) { \ mm_slot = tmp_slot; \ break; \ } \ \ mm_slot; \ }) #define mm_slot_insert(_hashtable, _mm, _mm_slot) \ ({ \ _mm_slot->mm = _mm; \ hash_add(_hashtable, &_mm_slot->hash, (unsigned long)_mm); \ }) #endif /* _LINUX_MM_SLOT_H */ |
33 172 114 20 5 3 2 3 132 31 22 33 133 44 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 | /* SPDX-License-Identifier: GPL-2.0 */ #undef TRACE_SYSTEM #define TRACE_SYSTEM io_uring #if !defined(_TRACE_IO_URING_H) || defined(TRACE_HEADER_MULTI_READ) #define _TRACE_IO_URING_H #include <linux/tracepoint.h> #include <uapi/linux/io_uring.h> #include <linux/io_uring_types.h> #include <linux/io_uring.h> struct io_wq_work; /** * io_uring_create - called after a new io_uring context was prepared * * @fd: corresponding file descriptor * @ctx: pointer to a ring context structure * @sq_entries: actual SQ size * @cq_entries: actual CQ size * @flags: SQ ring flags, provided to io_uring_setup(2) * * Allows to trace io_uring creation and provide pointer to a context, that can * be used later to find correlated events. */ TRACE_EVENT(io_uring_create, TP_PROTO(int fd, void *ctx, u32 sq_entries, u32 cq_entries, u32 flags), TP_ARGS(fd, ctx, sq_entries, cq_entries, flags), TP_STRUCT__entry ( __field( int, fd ) __field( void *, ctx ) __field( u32, sq_entries ) __field( u32, cq_entries ) __field( u32, flags ) ), TP_fast_assign( __entry->fd = fd; __entry->ctx = ctx; __entry->sq_entries = sq_entries; __entry->cq_entries = cq_entries; __entry->flags = flags; ), TP_printk("ring %p, fd %d sq size %d, cq size %d, flags 0x%x", __entry->ctx, __entry->fd, __entry->sq_entries, __entry->cq_entries, __entry->flags) ); /** * io_uring_register - called after a buffer/file/eventfd was successfully * registered for a ring * * @ctx: pointer to a ring context structure * @opcode: describes which operation to perform * @nr_user_files: number of registered files * @nr_user_bufs: number of registered buffers * @ret: return code * * Allows to trace fixed files/buffers, that could be registered to * avoid an overhead of getting references to them for every operation. This * event, together with io_uring_file_get, can provide a full picture of how * much overhead one can reduce via fixing. */ TRACE_EVENT(io_uring_register, TP_PROTO(void *ctx, unsigned opcode, unsigned nr_files, unsigned nr_bufs, long ret), TP_ARGS(ctx, opcode, nr_files, nr_bufs, ret), TP_STRUCT__entry ( __field( void *, ctx ) __field( unsigned, opcode ) __field( unsigned, nr_files) __field( unsigned, nr_bufs ) __field( long, ret ) ), TP_fast_assign( __entry->ctx = ctx; __entry->opcode = opcode; __entry->nr_files = nr_files; __entry->nr_bufs = nr_bufs; __entry->ret = ret; ), TP_printk("ring %p, opcode %d, nr_user_files %d, nr_user_bufs %d, " "ret %ld", __entry->ctx, __entry->opcode, __entry->nr_files, __entry->nr_bufs, __entry->ret) ); /** * io_uring_file_get - called before getting references to an SQE file * * @req: pointer to a submitted request * @fd: SQE file descriptor * * Allows to trace out how often an SQE file reference is obtained, which can * help figuring out if it makes sense to use fixed files, or check that fixed * files are used correctly. */ TRACE_EVENT(io_uring_file_get, TP_PROTO(struct io_kiocb *req, int fd), TP_ARGS(req, fd), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( u64, user_data ) __field( int, fd ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->user_data = req->cqe.user_data; __entry->fd = fd; ), TP_printk("ring %p, req %p, user_data 0x%llx, fd %d", __entry->ctx, __entry->req, __entry->user_data, __entry->fd) ); /** * io_uring_queue_async_work - called before submitting a new async work * * @req: pointer to a submitted request * @rw: type of workqueue, hashed or normal * * Allows to trace asynchronous work submission. */ TRACE_EVENT(io_uring_queue_async_work, TP_PROTO(struct io_kiocb *req, int rw), TP_ARGS(req, rw), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( u64, user_data ) __field( u8, opcode ) __field( unsigned long long, flags ) __field( struct io_wq_work *, work ) __field( int, rw ) __string( op_str, io_uring_get_opcode(req->opcode) ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->user_data = req->cqe.user_data; __entry->flags = (__force unsigned long long) req->flags; __entry->opcode = req->opcode; __entry->work = &req->work; __entry->rw = rw; __assign_str(op_str, io_uring_get_opcode(req->opcode)); ), TP_printk("ring %p, request %p, user_data 0x%llx, opcode %s, flags 0x%llx, %s queue, work %p", __entry->ctx, __entry->req, __entry->user_data, __get_str(op_str), __entry->flags, __entry->rw ? "hashed" : "normal", __entry->work) ); /** * io_uring_defer - called when an io_uring request is deferred * * @req: pointer to a deferred request * * Allows to track deferred requests, to get an insight about what requests are * not started immediately. */ TRACE_EVENT(io_uring_defer, TP_PROTO(struct io_kiocb *req), TP_ARGS(req), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( unsigned long long, data ) __field( u8, opcode ) __string( op_str, io_uring_get_opcode(req->opcode) ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->data = req->cqe.user_data; __entry->opcode = req->opcode; __assign_str(op_str, io_uring_get_opcode(req->opcode)); ), TP_printk("ring %p, request %p, user_data 0x%llx, opcode %s", __entry->ctx, __entry->req, __entry->data, __get_str(op_str)) ); /** * io_uring_link - called before the io_uring request added into link_list of * another request * * @req: pointer to a linked request * @target_req: pointer to a previous request, that would contain @req * * Allows to track linked requests, to understand dependencies between requests * and how does it influence their execution flow. */ TRACE_EVENT(io_uring_link, TP_PROTO(struct io_kiocb *req, struct io_kiocb *target_req), TP_ARGS(req, target_req), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( void *, target_req ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->target_req = target_req; ), TP_printk("ring %p, request %p linked after %p", __entry->ctx, __entry->req, __entry->target_req) ); /** * io_uring_cqring_wait - called before start waiting for an available CQE * * @ctx: pointer to a ring context structure * @min_events: minimal number of events to wait for * * Allows to track waiting for CQE, so that we can e.g. troubleshoot * situations, when an application wants to wait for an event, that never * comes. */ TRACE_EVENT(io_uring_cqring_wait, TP_PROTO(void *ctx, int min_events), TP_ARGS(ctx, min_events), TP_STRUCT__entry ( __field( void *, ctx ) __field( int, min_events ) ), TP_fast_assign( __entry->ctx = ctx; __entry->min_events = min_events; ), TP_printk("ring %p, min_events %d", __entry->ctx, __entry->min_events) ); /** * io_uring_fail_link - called before failing a linked request * * @req: request, which links were cancelled * @link: cancelled link * * Allows to track linked requests cancellation, to see not only that some work * was cancelled, but also which request was the reason. */ TRACE_EVENT(io_uring_fail_link, TP_PROTO(struct io_kiocb *req, struct io_kiocb *link), TP_ARGS(req, link), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( unsigned long long, user_data ) __field( u8, opcode ) __field( void *, link ) __string( op_str, io_uring_get_opcode(req->opcode) ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->user_data = req->cqe.user_data; __entry->opcode = req->opcode; __entry->link = link; __assign_str(op_str, io_uring_get_opcode(req->opcode)); ), TP_printk("ring %p, request %p, user_data 0x%llx, opcode %s, link %p", __entry->ctx, __entry->req, __entry->user_data, __get_str(op_str), __entry->link) ); /** * io_uring_complete - called when completing an SQE * * @ctx: pointer to a ring context structure * @req: pointer to a submitted request * @user_data: user data associated with the request * @res: result of the request * @cflags: completion flags * @extra1: extra 64-bit data for CQE32 * @extra2: extra 64-bit data for CQE32 * */ TRACE_EVENT(io_uring_complete, TP_PROTO(void *ctx, void *req, u64 user_data, int res, unsigned cflags, u64 extra1, u64 extra2), TP_ARGS(ctx, req, user_data, res, cflags, extra1, extra2), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( u64, user_data ) __field( int, res ) __field( unsigned, cflags ) __field( u64, extra1 ) __field( u64, extra2 ) ), TP_fast_assign( __entry->ctx = ctx; __entry->req = req; __entry->user_data = user_data; __entry->res = res; __entry->cflags = cflags; __entry->extra1 = extra1; __entry->extra2 = extra2; ), TP_printk("ring %p, req %p, user_data 0x%llx, result %d, cflags 0x%x " "extra1 %llu extra2 %llu ", __entry->ctx, __entry->req, __entry->user_data, __entry->res, __entry->cflags, (unsigned long long) __entry->extra1, (unsigned long long) __entry->extra2) ); /** * io_uring_submit_req - called before submitting a request * * @req: pointer to a submitted request * * Allows to track SQE submitting, to understand what was the source of it, SQ * thread or io_uring_enter call. */ TRACE_EVENT(io_uring_submit_req, TP_PROTO(struct io_kiocb *req), TP_ARGS(req), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( unsigned long long, user_data ) __field( u8, opcode ) __field( unsigned long long, flags ) __field( bool, sq_thread ) __string( op_str, io_uring_get_opcode(req->opcode) ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->user_data = req->cqe.user_data; __entry->opcode = req->opcode; __entry->flags = (__force unsigned long long) req->flags; __entry->sq_thread = req->ctx->flags & IORING_SETUP_SQPOLL; __assign_str(op_str, io_uring_get_opcode(req->opcode)); ), TP_printk("ring %p, req %p, user_data 0x%llx, opcode %s, flags 0x%llx, " "sq_thread %d", __entry->ctx, __entry->req, __entry->user_data, __get_str(op_str), __entry->flags, __entry->sq_thread) ); /* * io_uring_poll_arm - called after arming a poll wait if successful * * @req: pointer to the armed request * @mask: request poll events mask * @events: registered events of interest * * Allows to track which fds are waiting for and what are the events of * interest. */ TRACE_EVENT(io_uring_poll_arm, TP_PROTO(struct io_kiocb *req, int mask, int events), TP_ARGS(req, mask, events), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( unsigned long long, user_data ) __field( u8, opcode ) __field( int, mask ) __field( int, events ) __string( op_str, io_uring_get_opcode(req->opcode) ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->user_data = req->cqe.user_data; __entry->opcode = req->opcode; __entry->mask = mask; __entry->events = events; __assign_str(op_str, io_uring_get_opcode(req->opcode)); ), TP_printk("ring %p, req %p, user_data 0x%llx, opcode %s, mask 0x%x, events 0x%x", __entry->ctx, __entry->req, __entry->user_data, __get_str(op_str), __entry->mask, __entry->events) ); /* * io_uring_task_add - called after adding a task * * @req: pointer to request * @mask: request poll events mask * */ TRACE_EVENT(io_uring_task_add, TP_PROTO(struct io_kiocb *req, int mask), TP_ARGS(req, mask), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( unsigned long long, user_data ) __field( u8, opcode ) __field( int, mask ) __string( op_str, io_uring_get_opcode(req->opcode) ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->user_data = req->cqe.user_data; __entry->opcode = req->opcode; __entry->mask = mask; __assign_str(op_str, io_uring_get_opcode(req->opcode)); ), TP_printk("ring %p, req %p, user_data 0x%llx, opcode %s, mask %x", __entry->ctx, __entry->req, __entry->user_data, __get_str(op_str), __entry->mask) ); /* * io_uring_req_failed - called when an sqe is errored dring submission * * @sqe: pointer to the io_uring_sqe that failed * @req: pointer to request * @error: error it failed with * * Allows easier diagnosing of malformed requests in production systems. */ TRACE_EVENT(io_uring_req_failed, TP_PROTO(const struct io_uring_sqe *sqe, struct io_kiocb *req, int error), TP_ARGS(sqe, req, error), TP_STRUCT__entry ( __field( void *, ctx ) __field( void *, req ) __field( unsigned long long, user_data ) __field( u8, opcode ) __field( u8, flags ) __field( u8, ioprio ) __field( u64, off ) __field( u64, addr ) __field( u32, len ) __field( u32, op_flags ) __field( u16, buf_index ) __field( u16, personality ) __field( u32, file_index ) __field( u64, pad1 ) __field( u64, addr3 ) __field( int, error ) __string( op_str, io_uring_get_opcode(sqe->opcode) ) ), TP_fast_assign( __entry->ctx = req->ctx; __entry->req = req; __entry->user_data = sqe->user_data; __entry->opcode = sqe->opcode; __entry->flags = sqe->flags; __entry->ioprio = sqe->ioprio; __entry->off = sqe->off; __entry->addr = sqe->addr; __entry->len = sqe->len; __entry->op_flags = sqe->poll32_events; __entry->buf_index = sqe->buf_index; __entry->personality = sqe->personality; __entry->file_index = sqe->file_index; __entry->pad1 = sqe->__pad2[0]; __entry->addr3 = sqe->addr3; __entry->error = error; __assign_str(op_str, io_uring_get_opcode(sqe->opcode)); ), TP_printk("ring %p, req %p, user_data 0x%llx, " "opcode %s, flags 0x%x, prio=%d, off=%llu, addr=%llu, " "len=%u, rw_flags=0x%x, buf_index=%d, " "personality=%d, file_index=%d, pad=0x%llx, addr3=%llx, " "error=%d", __entry->ctx, __entry->req, __entry->user_data, __get_str(op_str), __entry->flags, __entry->ioprio, (unsigned long long)__entry->off, (unsigned long long) __entry->addr, __entry->len, __entry->op_flags, __entry->buf_index, __entry->personality, __entry->file_index, (unsigned long long) __entry->pad1, (unsigned long long) __entry->addr3, __entry->error) ); /* * io_uring_cqe_overflow - a CQE overflowed * * @ctx: pointer to a ring context structure * @user_data: user data associated with the request * @res: CQE result * @cflags: CQE flags * @ocqe: pointer to the overflow cqe (if available) * */ TRACE_EVENT(io_uring_cqe_overflow, TP_PROTO(void *ctx, unsigned long long user_data, s32 res, u32 cflags, void *ocqe), TP_ARGS(ctx, user_data, res, cflags, ocqe), TP_STRUCT__entry ( __field( void *, ctx ) __field( unsigned long long, user_data ) __field( s32, res ) __field( u32, cflags ) __field( void *, ocqe ) ), TP_fast_assign( __entry->ctx = ctx; __entry->user_data = user_data; __entry->res = res; __entry->cflags = cflags; __entry->ocqe = ocqe; ), TP_printk("ring %p, user_data 0x%llx, res %d, cflags 0x%x, " "overflow_cqe %p", __entry->ctx, __entry->user_data, __entry->res, __entry->cflags, __entry->ocqe) ); /* * io_uring_task_work_run - ran task work * * @tctx: pointer to a io_uring_task * @count: how many functions it ran * */ TRACE_EVENT(io_uring_task_work_run, TP_PROTO(void *tctx, unsigned int count), TP_ARGS(tctx, count), TP_STRUCT__entry ( __field( void *, tctx ) __field( unsigned int, count ) ), TP_fast_assign( __entry->tctx = tctx; __entry->count = count; ), TP_printk("tctx %p, count %u", __entry->tctx, __entry->count) ); TRACE_EVENT(io_uring_short_write, TP_PROTO(void *ctx, u64 fpos, u64 wanted, u64 got), TP_ARGS(ctx, fpos, wanted, got), TP_STRUCT__entry( __field(void *, ctx) __field(u64, fpos) __field(u64, wanted) __field(u64, got) ), TP_fast_assign( __entry->ctx = ctx; __entry->fpos = fpos; __entry->wanted = wanted; __entry->got = got; ), TP_printk("ring %p, fpos %lld, wanted %lld, got %lld", __entry->ctx, __entry->fpos, __entry->wanted, __entry->got) ); /* * io_uring_local_work_run - ran ring local task work * * @tctx: pointer to a io_uring_ctx * @count: how many functions it ran * @loops: how many loops it ran * */ TRACE_EVENT(io_uring_local_work_run, TP_PROTO(void *ctx, int count, unsigned int loops), TP_ARGS(ctx, count, loops), TP_STRUCT__entry ( __field(void *, ctx ) __field(int, count ) __field(unsigned int, loops ) ), TP_fast_assign( __entry->ctx = ctx; __entry->count = count; __entry->loops = loops; ), TP_printk("ring %p, count %d, loops %u", __entry->ctx, __entry->count, __entry->loops) ); #endif /* _TRACE_IO_URING_H */ /* This part must be outside protection */ #include <trace/define_trace.h> |
2 8 11 11 6 6 13 13 2 1 1 1 10 1 9 2 9 9 1 1 2 2 2 2 2 14 1 1 13 12 2 1 13 8 9 6 5 4 9 10 10 9 8 7 8 8 8 1 2 3 3 3 18 17 18 32 34 2 2 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 | // SPDX-License-Identifier: GPL-2.0-or-later /* * ALSA sequencer Timer * Copyright (c) 1998-1999 by Frank van de Pol <fvdpol@coil.demon.nl> * Jaroslav Kysela <perex@perex.cz> */ #include <sound/core.h> #include <linux/slab.h> #include "seq_timer.h" #include "seq_queue.h" #include "seq_info.h" /* allowed sequencer timer frequencies, in Hz */ #define MIN_FREQUENCY 10 #define MAX_FREQUENCY 6250 #define DEFAULT_FREQUENCY 1000 #define SKEW_BASE 0x10000 /* 16bit shift */ static void snd_seq_timer_set_tick_resolution(struct snd_seq_timer *tmr) { if (tmr->tempo < 1000000) tmr->tick.resolution = (tmr->tempo * 1000) / tmr->ppq; else { /* might overflow.. */ unsigned int s; s = tmr->tempo % tmr->ppq; s = (s * 1000) / tmr->ppq; tmr->tick.resolution = (tmr->tempo / tmr->ppq) * 1000; tmr->tick.resolution += s; } if (tmr->tick.resolution <= 0) tmr->tick.resolution = 1; snd_seq_timer_update_tick(&tmr->tick, 0); } /* create new timer (constructor) */ struct snd_seq_timer *snd_seq_timer_new(void) { struct snd_seq_timer *tmr; tmr = kzalloc(sizeof(*tmr), GFP_KERNEL); if (!tmr) return NULL; spin_lock_init(&tmr->lock); /* reset setup to defaults */ snd_seq_timer_defaults(tmr); /* reset time */ snd_seq_timer_reset(tmr); return tmr; } /* delete timer (destructor) */ void snd_seq_timer_delete(struct snd_seq_timer **tmr) { struct snd_seq_timer *t = *tmr; *tmr = NULL; if (t == NULL) { pr_debug("ALSA: seq: snd_seq_timer_delete() called with NULL timer\n"); return; } t->running = 0; /* reset time */ snd_seq_timer_stop(t); snd_seq_timer_reset(t); kfree(t); } void snd_seq_timer_defaults(struct snd_seq_timer * tmr) { guard(spinlock_irqsave)(&tmr->lock); /* setup defaults */ tmr->ppq = 96; /* 96 PPQ */ tmr->tempo = 500000; /* 120 BPM */ snd_seq_timer_set_tick_resolution(tmr); tmr->running = 0; tmr->type = SNDRV_SEQ_TIMER_ALSA; tmr->alsa_id.dev_class = seq_default_timer_class; tmr->alsa_id.dev_sclass = seq_default_timer_sclass; tmr->alsa_id.card = seq_default_timer_card; tmr->alsa_id.device = seq_default_timer_device; tmr->alsa_id.subdevice = seq_default_timer_subdevice; tmr->preferred_resolution = seq_default_timer_resolution; tmr->skew = tmr->skew_base = SKEW_BASE; } static void seq_timer_reset(struct snd_seq_timer *tmr) { /* reset time & songposition */ tmr->cur_time.tv_sec = 0; tmr->cur_time.tv_nsec = 0; tmr->tick.cur_tick = 0; tmr->tick.fraction = 0; } void snd_seq_timer_reset(struct snd_seq_timer *tmr) { guard(spinlock_irqsave)(&tmr->lock); seq_timer_reset(tmr); } /* called by timer interrupt routine. the period time since previous invocation is passed */ static void snd_seq_timer_interrupt(struct snd_timer_instance *timeri, unsigned long resolution, unsigned long ticks) { struct snd_seq_queue *q = timeri->callback_data; struct snd_seq_timer *tmr; if (q == NULL) return; tmr = q->timer; if (tmr == NULL) return; scoped_guard(spinlock_irqsave, &tmr->lock) { if (!tmr->running) return; resolution *= ticks; if (tmr->skew != tmr->skew_base) { /* FIXME: assuming skew_base = 0x10000 */ resolution = (resolution >> 16) * tmr->skew + (((resolution & 0xffff) * tmr->skew) >> 16); } /* update timer */ snd_seq_inc_time_nsec(&tmr->cur_time, resolution); /* calculate current tick */ snd_seq_timer_update_tick(&tmr->tick, resolution); /* register actual time of this timer update */ ktime_get_ts64(&tmr->last_update); } /* check queues and dispatch events */ snd_seq_check_queue(q, 1, 0); } /* set current tempo */ int snd_seq_timer_set_tempo(struct snd_seq_timer * tmr, int tempo) { if (snd_BUG_ON(!tmr)) return -EINVAL; if (tempo <= 0) return -EINVAL; guard(spinlock_irqsave)(&tmr->lock); if ((unsigned int)tempo != tmr->tempo) { tmr->tempo = tempo; snd_seq_timer_set_tick_resolution(tmr); } return 0; } /* set current tempo and ppq in a shot */ int snd_seq_timer_set_tempo_ppq(struct snd_seq_timer *tmr, int tempo, int ppq) { int changed; if (snd_BUG_ON(!tmr)) return -EINVAL; if (tempo <= 0 || ppq <= 0) return -EINVAL; guard(spinlock_irqsave)(&tmr->lock); if (tmr->running && (ppq != tmr->ppq)) { /* refuse to change ppq on running timers */ /* because it will upset the song position (ticks) */ pr_debug("ALSA: seq: cannot change ppq of a running timer\n"); return -EBUSY; } changed = (tempo != tmr->tempo) || (ppq != tmr->ppq); tmr->tempo = tempo; tmr->ppq = ppq; if (changed) snd_seq_timer_set_tick_resolution(tmr); return 0; } /* set current tick position */ int snd_seq_timer_set_position_tick(struct snd_seq_timer *tmr, snd_seq_tick_time_t position) { if (snd_BUG_ON(!tmr)) return -EINVAL; guard(spinlock_irqsave)(&tmr->lock); tmr->tick.cur_tick = position; tmr->tick.fraction = 0; return 0; } /* set current real-time position */ int snd_seq_timer_set_position_time(struct snd_seq_timer *tmr, snd_seq_real_time_t position) { if (snd_BUG_ON(!tmr)) return -EINVAL; snd_seq_sanity_real_time(&position); guard(spinlock_irqsave)(&tmr->lock); tmr->cur_time = position; return 0; } /* set timer skew */ int snd_seq_timer_set_skew(struct snd_seq_timer *tmr, unsigned int skew, unsigned int base) { if (snd_BUG_ON(!tmr)) return -EINVAL; /* FIXME */ if (base != SKEW_BASE) { pr_debug("ALSA: seq: invalid skew base 0x%x\n", base); return -EINVAL; } guard(spinlock_irqsave)(&tmr->lock); tmr->skew = skew; return 0; } int snd_seq_timer_open(struct snd_seq_queue *q) { struct snd_timer_instance *t; struct snd_seq_timer *tmr; char str[32]; int err; tmr = q->timer; if (snd_BUG_ON(!tmr)) return -EINVAL; if (tmr->timeri) return -EBUSY; sprintf(str, "sequencer queue %i", q->queue); if (tmr->type != SNDRV_SEQ_TIMER_ALSA) /* standard ALSA timer */ return -EINVAL; if (tmr->alsa_id.dev_class != SNDRV_TIMER_CLASS_SLAVE) tmr->alsa_id.dev_sclass = SNDRV_TIMER_SCLASS_SEQUENCER; t = snd_timer_instance_new(str); if (!t) return -ENOMEM; t->callback = snd_seq_timer_interrupt; t->callback_data = q; t->flags |= SNDRV_TIMER_IFLG_AUTO; err = snd_timer_open(t, &tmr->alsa_id, q->queue); if (err < 0 && tmr->alsa_id.dev_class != SNDRV_TIMER_CLASS_SLAVE) { if (tmr->alsa_id.dev_class != SNDRV_TIMER_CLASS_GLOBAL || tmr->alsa_id.device != SNDRV_TIMER_GLOBAL_SYSTEM) { struct snd_timer_id tid; memset(&tid, 0, sizeof(tid)); tid.dev_class = SNDRV_TIMER_CLASS_GLOBAL; tid.dev_sclass = SNDRV_TIMER_SCLASS_SEQUENCER; tid.card = -1; tid.device = SNDRV_TIMER_GLOBAL_SYSTEM; err = snd_timer_open(t, &tid, q->queue); } } if (err < 0) { pr_err("ALSA: seq fatal error: cannot create timer (%i)\n", err); snd_timer_instance_free(t); return err; } scoped_guard(spinlock_irq, &tmr->lock) { if (tmr->timeri) err = -EBUSY; else tmr->timeri = t; } if (err < 0) { snd_timer_close(t); snd_timer_instance_free(t); return err; } return 0; } int snd_seq_timer_close(struct snd_seq_queue *q) { struct snd_seq_timer *tmr; struct snd_timer_instance *t; tmr = q->timer; if (snd_BUG_ON(!tmr)) return -EINVAL; scoped_guard(spinlock_irq, &tmr->lock) { t = tmr->timeri; tmr->timeri = NULL; } if (t) { snd_timer_close(t); snd_timer_instance_free(t); } return 0; } static int seq_timer_stop(struct snd_seq_timer *tmr) { if (! tmr->timeri) return -EINVAL; if (!tmr->running) return 0; tmr->running = 0; snd_timer_pause(tmr->timeri); return 0; } int snd_seq_timer_stop(struct snd_seq_timer *tmr) { guard(spinlock_irqsave)(&tmr->lock); return seq_timer_stop(tmr); } static int initialize_timer(struct snd_seq_timer *tmr) { struct snd_timer *t; unsigned long freq; t = tmr->timeri->timer; if (!t) return -EINVAL; freq = tmr->preferred_resolution; if (!freq) freq = DEFAULT_FREQUENCY; else if (freq < MIN_FREQUENCY) freq = MIN_FREQUENCY; else if (freq > MAX_FREQUENCY) freq = MAX_FREQUENCY; tmr->ticks = 1; if (!(t->hw.flags & SNDRV_TIMER_HW_SLAVE)) { unsigned long r = snd_timer_resolution(tmr->timeri); if (r) { tmr->ticks = (unsigned int)(1000000000uL / (r * freq)); if (! tmr->ticks) tmr->ticks = 1; } } tmr->initialized = 1; return 0; } static int seq_timer_start(struct snd_seq_timer *tmr) { if (! tmr->timeri) return -EINVAL; if (tmr->running) seq_timer_stop(tmr); seq_timer_reset(tmr); if (initialize_timer(tmr) < 0) return -EINVAL; snd_timer_start(tmr->timeri, tmr->ticks); tmr->running = 1; ktime_get_ts64(&tmr->last_update); return 0; } int snd_seq_timer_start(struct snd_seq_timer *tmr) { guard(spinlock_irqsave)(&tmr->lock); return seq_timer_start(tmr); } static int seq_timer_continue(struct snd_seq_timer *tmr) { if (! tmr->timeri) return -EINVAL; if (tmr->running) return -EBUSY; if (! tmr->initialized) { seq_timer_reset(tmr); if (initialize_timer(tmr) < 0) return -EINVAL; } snd_timer_start(tmr->timeri, tmr->ticks); tmr->running = 1; ktime_get_ts64(&tmr->last_update); return 0; } int snd_seq_timer_continue(struct snd_seq_timer *tmr) { guard(spinlock_irqsave)(&tmr->lock); return seq_timer_continue(tmr); } /* return current 'real' time. use timeofday() to get better granularity. */ snd_seq_real_time_t snd_seq_timer_get_cur_time(struct snd_seq_timer *tmr, bool adjust_ktime) { snd_seq_real_time_t cur_time; guard(spinlock_irqsave)(&tmr->lock); cur_time = tmr->cur_time; if (adjust_ktime && tmr->running) { struct timespec64 tm; ktime_get_ts64(&tm); tm = timespec64_sub(tm, tmr->last_update); cur_time.tv_nsec += tm.tv_nsec; cur_time.tv_sec += tm.tv_sec; snd_seq_sanity_real_time(&cur_time); } return cur_time; } /* TODO: use interpolation on tick queue (will only be useful for very high PPQ values) */ snd_seq_tick_time_t snd_seq_timer_get_cur_tick(struct snd_seq_timer *tmr) { guard(spinlock_irqsave)(&tmr->lock); return tmr->tick.cur_tick; } #ifdef CONFIG_SND_PROC_FS /* exported to seq_info.c */ void snd_seq_info_timer_read(struct snd_info_entry *entry, struct snd_info_buffer *buffer) { int idx; struct snd_seq_queue *q; struct snd_seq_timer *tmr; struct snd_timer_instance *ti; unsigned long resolution; for (idx = 0; idx < SNDRV_SEQ_MAX_QUEUES; idx++) { q = queueptr(idx); if (q == NULL) continue; scoped_guard(mutex, &q->timer_mutex) { tmr = q->timer; if (!tmr) break; ti = tmr->timeri; if (!ti) break; snd_iprintf(buffer, "Timer for queue %i : %s\n", q->queue, ti->timer->name); resolution = snd_timer_resolution(ti) * tmr->ticks; snd_iprintf(buffer, " Period time : %lu.%09lu\n", resolution / 1000000000, resolution % 1000000000); snd_iprintf(buffer, " Skew : %u / %u\n", tmr->skew, tmr->skew_base); } queuefree(q); } } #endif /* CONFIG_SND_PROC_FS */ |
2 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 | // SPDX-License-Identifier: GPL-2.0-or-later /* DataCenter TCP (DCTCP) congestion control. * * http://simula.stanford.edu/~alizade/Site/DCTCP.html * * This is an implementation of DCTCP over Reno, an enhancement to the * TCP congestion control algorithm designed for data centers. DCTCP * leverages Explicit Congestion Notification (ECN) in the network to * provide multi-bit feedback to the end hosts. DCTCP's goal is to meet * the following three data center transport requirements: * * - High burst tolerance (incast due to partition/aggregate) * - Low latency (short flows, queries) * - High throughput (continuous data updates, large file transfers) * with commodity shallow buffered switches * * The algorithm is described in detail in the following two papers: * * 1) Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye, * Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, and Murari Sridharan: * "Data Center TCP (DCTCP)", Data Center Networks session * Proc. ACM SIGCOMM, New Delhi, 2010. * http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp-final.pdf * * 2) Mohammad Alizadeh, Adel Javanmard, and Balaji Prabhakar: * "Analysis of DCTCP: Stability, Convergence, and Fairness" * Proc. ACM SIGMETRICS, San Jose, 2011. * http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp_analysis-full.pdf * * Initial prototype from Abdul Kabbani, Masato Yasuda and Mohammad Alizadeh. * * Authors: * * Daniel Borkmann <dborkman@redhat.com> * Florian Westphal <fw@strlen.de> * Glenn Judd <glenn.judd@morganstanley.com> */ #include <linux/btf.h> #include <linux/btf_ids.h> #include <linux/module.h> #include <linux/mm.h> #include <net/tcp.h> #include <linux/inet_diag.h> #include "tcp_dctcp.h" #define DCTCP_MAX_ALPHA 1024U struct dctcp { u32 old_delivered; u32 old_delivered_ce; u32 prior_rcv_nxt; u32 dctcp_alpha; u32 next_seq; u32 ce_state; u32 loss_cwnd; struct tcp_plb_state plb; }; static unsigned int dctcp_shift_g __read_mostly = 4; /* g = 1/2^4 */ module_param(dctcp_shift_g, uint, 0644); MODULE_PARM_DESC(dctcp_shift_g, "parameter g for updating dctcp_alpha"); static unsigned int dctcp_alpha_on_init __read_mostly = DCTCP_MAX_ALPHA; module_param(dctcp_alpha_on_init, uint, 0644); MODULE_PARM_DESC(dctcp_alpha_on_init, "parameter for initial alpha value"); static struct tcp_congestion_ops dctcp_reno; static void dctcp_reset(const struct tcp_sock *tp, struct dctcp *ca) { ca->next_seq = tp->snd_nxt; ca->old_delivered = tp->delivered; ca->old_delivered_ce = tp->delivered_ce; } __bpf_kfunc static void dctcp_init(struct sock *sk) { const struct tcp_sock *tp = tcp_sk(sk); if ((tp->ecn_flags & TCP_ECN_OK) || (sk->sk_state == TCP_LISTEN || sk->sk_state == TCP_CLOSE)) { struct dctcp *ca = inet_csk_ca(sk); ca->prior_rcv_nxt = tp->rcv_nxt; ca->dctcp_alpha = min(dctcp_alpha_on_init, DCTCP_MAX_ALPHA); ca->loss_cwnd = 0; ca->ce_state = 0; dctcp_reset(tp, ca); tcp_plb_init(sk, &ca->plb); return; } /* No ECN support? Fall back to Reno. Also need to clear * ECT from sk since it is set during 3WHS for DCTCP. */ inet_csk(sk)->icsk_ca_ops = &dctcp_reno; INET_ECN_dontxmit(sk); } __bpf_kfunc static u32 dctcp_ssthresh(struct sock *sk) { struct dctcp *ca = inet_csk_ca(sk); struct tcp_sock *tp = tcp_sk(sk); ca->loss_cwnd = tcp_snd_cwnd(tp); return max(tcp_snd_cwnd(tp) - ((tcp_snd_cwnd(tp) * ca->dctcp_alpha) >> 11U), 2U); } __bpf_kfunc static void dctcp_update_alpha(struct sock *sk, u32 flags) { const struct tcp_sock *tp = tcp_sk(sk); struct dctcp *ca = inet_csk_ca(sk); /* Expired RTT */ if (!before(tp->snd_una, ca->next_seq)) { u32 delivered = tp->delivered - ca->old_delivered; u32 delivered_ce = tp->delivered_ce - ca->old_delivered_ce; u32 alpha = ca->dctcp_alpha; u32 ce_ratio = 0; if (delivered > 0) { /* dctcp_alpha keeps EWMA of fraction of ECN marked * packets. Because of EWMA smoothing, PLB reaction can * be slow so we use ce_ratio which is an instantaneous * measure of congestion. ce_ratio is the fraction of * ECN marked packets in the previous RTT. */ if (delivered_ce > 0) ce_ratio = (delivered_ce << TCP_PLB_SCALE) / delivered; tcp_plb_update_state(sk, &ca->plb, (int)ce_ratio); tcp_plb_check_rehash(sk, &ca->plb); } /* alpha = (1 - g) * alpha + g * F */ alpha -= min_not_zero(alpha, alpha >> dctcp_shift_g); if (delivered_ce) { /* If dctcp_shift_g == 1, a 32bit value would overflow * after 8 M packets. */ delivered_ce <<= (10 - dctcp_shift_g); delivered_ce /= max(1U, delivered); alpha = min(alpha + delivered_ce, DCTCP_MAX_ALPHA); } /* dctcp_alpha can be read from dctcp_get_info() without * synchro, so we ask compiler to not use dctcp_alpha * as a temporary variable in prior operations. */ WRITE_ONCE(ca->dctcp_alpha, alpha); dctcp_reset(tp, ca); } } static void dctcp_react_to_loss(struct sock *sk) { struct dctcp *ca = inet_csk_ca(sk); struct tcp_sock *tp = tcp_sk(sk); ca->loss_cwnd = tcp_snd_cwnd(tp); tp->snd_ssthresh = max(tcp_snd_cwnd(tp) >> 1U, 2U); } __bpf_kfunc static void dctcp_state(struct sock *sk, u8 new_state) { if (new_state == TCP_CA_Recovery && new_state != inet_csk(sk)->icsk_ca_state) dctcp_react_to_loss(sk); /* We handle RTO in dctcp_cwnd_event to ensure that we perform only * one loss-adjustment per RTT. */ } __bpf_kfunc static void dctcp_cwnd_event(struct sock *sk, enum tcp_ca_event ev) { struct dctcp *ca = inet_csk_ca(sk); switch (ev) { case CA_EVENT_ECN_IS_CE: case CA_EVENT_ECN_NO_CE: dctcp_ece_ack_update(sk, ev, &ca->prior_rcv_nxt, &ca->ce_state); break; case CA_EVENT_LOSS: tcp_plb_update_state_upon_rto(sk, &ca->plb); dctcp_react_to_loss(sk); break; case CA_EVENT_TX_START: tcp_plb_check_rehash(sk, &ca->plb); /* Maybe rehash when inflight is 0 */ break; default: /* Don't care for the rest. */ break; } } static size_t dctcp_get_info(struct sock *sk, u32 ext, int *attr, union tcp_cc_info *info) { const struct dctcp *ca = inet_csk_ca(sk); const struct tcp_sock *tp = tcp_sk(sk); /* Fill it also in case of VEGASINFO due to req struct limits. * We can still correctly retrieve it later. */ if (ext & (1 << (INET_DIAG_DCTCPINFO - 1)) || ext & (1 << (INET_DIAG_VEGASINFO - 1))) { memset(&info->dctcp, 0, sizeof(info->dctcp)); if (inet_csk(sk)->icsk_ca_ops != &dctcp_reno) { info->dctcp.dctcp_enabled = 1; info->dctcp.dctcp_ce_state = (u16) ca->ce_state; info->dctcp.dctcp_alpha = ca->dctcp_alpha; info->dctcp.dctcp_ab_ecn = tp->mss_cache * (tp->delivered_ce - ca->old_delivered_ce); info->dctcp.dctcp_ab_tot = tp->mss_cache * (tp->delivered - ca->old_delivered); } *attr = INET_DIAG_DCTCPINFO; return sizeof(info->dctcp); } return 0; } __bpf_kfunc static u32 dctcp_cwnd_undo(struct sock *sk) { const struct dctcp *ca = inet_csk_ca(sk); struct tcp_sock *tp = tcp_sk(sk); return max(tcp_snd_cwnd(tp), ca->loss_cwnd); } static struct tcp_congestion_ops dctcp __read_mostly = { .init = dctcp_init, .in_ack_event = dctcp_update_alpha, .cwnd_event = dctcp_cwnd_event, .ssthresh = dctcp_ssthresh, .cong_avoid = tcp_reno_cong_avoid, .undo_cwnd = dctcp_cwnd_undo, .set_state = dctcp_state, .get_info = dctcp_get_info, .flags = TCP_CONG_NEEDS_ECN, .owner = THIS_MODULE, .name = "dctcp", }; static struct tcp_congestion_ops dctcp_reno __read_mostly = { .ssthresh = tcp_reno_ssthresh, .cong_avoid = tcp_reno_cong_avoid, .undo_cwnd = tcp_reno_undo_cwnd, .get_info = dctcp_get_info, .owner = THIS_MODULE, .name = "dctcp-reno", }; BTF_KFUNCS_START(tcp_dctcp_check_kfunc_ids) #ifdef CONFIG_X86 #ifdef CONFIG_DYNAMIC_FTRACE BTF_ID_FLAGS(func, dctcp_init) BTF_ID_FLAGS(func, dctcp_update_alpha) BTF_ID_FLAGS(func, dctcp_cwnd_event) BTF_ID_FLAGS(func, dctcp_ssthresh) BTF_ID_FLAGS(func, dctcp_cwnd_undo) BTF_ID_FLAGS(func, dctcp_state) #endif #endif BTF_KFUNCS_END(tcp_dctcp_check_kfunc_ids) static const struct btf_kfunc_id_set tcp_dctcp_kfunc_set = { .owner = THIS_MODULE, .set = &tcp_dctcp_check_kfunc_ids, }; static int __init dctcp_register(void) { int ret; BUILD_BUG_ON(sizeof(struct dctcp) > ICSK_CA_PRIV_SIZE); ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &tcp_dctcp_kfunc_set); if (ret < 0) return ret; return tcp_register_congestion_control(&dctcp); } static void __exit dctcp_unregister(void) { tcp_unregister_congestion_control(&dctcp); } module_init(dctcp_register); module_exit(dctcp_unregister); MODULE_AUTHOR("Daniel Borkmann <dborkman@redhat.com>"); MODULE_AUTHOR("Florian Westphal <fw@strlen.de>"); MODULE_AUTHOR("Glenn Judd <glenn.judd@morganstanley.com>"); MODULE_LICENSE("GPL v2"); MODULE_DESCRIPTION("DataCenter TCP (DCTCP)"); |
751 751 710 172 161 736 138 769 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _BLK_CGROUP_PRIVATE_H #define _BLK_CGROUP_PRIVATE_H /* * block cgroup private header * * Based on ideas and code from CFQ, CFS and BFQ: * Copyright (C) 2003 Jens Axboe <axboe@kernel.dk> * * Copyright (C) 2008 Fabio Checconi <fabio@gandalf.sssup.it> * Paolo Valente <paolo.valente@unimore.it> * * Copyright (C) 2009 Vivek Goyal <vgoyal@redhat.com> * Nauman Rafique <nauman@google.com> */ #include <linux/blk-cgroup.h> #include <linux/cgroup.h> #include <linux/kthread.h> #include <linux/blk-mq.h> #include <linux/llist.h> #include "blk.h" struct blkcg_gq; struct blkg_policy_data; /* percpu_counter batch for blkg_[rw]stats, per-cpu drift doesn't matter */ #define BLKG_STAT_CPU_BATCH (INT_MAX / 2) #ifdef CONFIG_BLK_CGROUP enum blkg_iostat_type { BLKG_IOSTAT_READ, BLKG_IOSTAT_WRITE, BLKG_IOSTAT_DISCARD, BLKG_IOSTAT_NR, }; struct blkg_iostat { u64 bytes[BLKG_IOSTAT_NR]; u64 ios[BLKG_IOSTAT_NR]; }; struct blkg_iostat_set { struct u64_stats_sync sync; struct blkcg_gq *blkg; struct llist_node lnode; int lqueued; /* queued in llist */ struct blkg_iostat cur; struct blkg_iostat last; }; /* association between a blk cgroup and a request queue */ struct blkcg_gq { /* Pointer to the associated request_queue */ struct request_queue *q; struct list_head q_node; struct hlist_node blkcg_node; struct blkcg *blkcg; /* all non-root blkcg_gq's are guaranteed to have access to parent */ struct blkcg_gq *parent; /* reference count */ struct percpu_ref refcnt; /* is this blkg online? protected by both blkcg and q locks */ bool online; struct blkg_iostat_set __percpu *iostat_cpu; struct blkg_iostat_set iostat; struct blkg_policy_data *pd[BLKCG_MAX_POLS]; #ifdef CONFIG_BLK_CGROUP_PUNT_BIO spinlock_t async_bio_lock; struct bio_list async_bios; #endif union { struct work_struct async_bio_work; struct work_struct free_work; }; atomic_t use_delay; atomic64_t delay_nsec; atomic64_t delay_start; u64 last_delay; int last_use; struct rcu_head rcu_head; }; struct blkcg { struct cgroup_subsys_state css; spinlock_t lock; refcount_t online_pin; struct radix_tree_root blkg_tree; struct blkcg_gq __rcu *blkg_hint; struct hlist_head blkg_list; struct blkcg_policy_data *cpd[BLKCG_MAX_POLS]; struct list_head all_blkcgs_node; /* * List of updated percpu blkg_iostat_set's since the last flush. */ struct llist_head __percpu *lhead; #ifdef CONFIG_BLK_CGROUP_FC_APPID char fc_app_id[FC_APPID_LEN]; #endif #ifdef CONFIG_CGROUP_WRITEBACK struct list_head cgwb_list; #endif }; static inline struct blkcg *css_to_blkcg(struct cgroup_subsys_state *css) { return css ? container_of(css, struct blkcg, css) : NULL; } /* * A blkcg_gq (blkg) is association between a block cgroup (blkcg) and a * request_queue (q). This is used by blkcg policies which need to track * information per blkcg - q pair. * * There can be multiple active blkcg policies and each blkg:policy pair is * represented by a blkg_policy_data which is allocated and freed by each * policy's pd_alloc/free_fn() methods. A policy can allocate private data * area by allocating larger data structure which embeds blkg_policy_data * at the beginning. */ struct blkg_policy_data { /* the blkg and policy id this per-policy data belongs to */ struct blkcg_gq *blkg; int plid; bool online; }; /* * Policies that need to keep per-blkcg data which is independent from any * request_queue associated to it should implement cpd_alloc/free_fn() * methods. A policy can allocate private data area by allocating larger * data structure which embeds blkcg_policy_data at the beginning. * cpd_init() is invoked to let each policy handle per-blkcg data. */ struct blkcg_policy_data { /* the blkcg and policy id this per-policy data belongs to */ struct blkcg *blkcg; int plid; }; typedef struct blkcg_policy_data *(blkcg_pol_alloc_cpd_fn)(gfp_t gfp); typedef void (blkcg_pol_init_cpd_fn)(struct blkcg_policy_data *cpd); typedef void (blkcg_pol_free_cpd_fn)(struct blkcg_policy_data *cpd); typedef void (blkcg_pol_bind_cpd_fn)(struct blkcg_policy_data *cpd); typedef struct blkg_policy_data *(blkcg_pol_alloc_pd_fn)(struct gendisk *disk, struct blkcg *blkcg, gfp_t gfp); typedef void (blkcg_pol_init_pd_fn)(struct blkg_policy_data *pd); typedef void (blkcg_pol_online_pd_fn)(struct blkg_policy_data *pd); typedef void (blkcg_pol_offline_pd_fn)(struct blkg_policy_data *pd); typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd); typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkg_policy_data *pd); typedef void (blkcg_pol_stat_pd_fn)(struct blkg_policy_data *pd, struct seq_file *s); struct blkcg_policy { int plid; /* cgroup files for the policy */ struct cftype *dfl_cftypes; struct cftype *legacy_cftypes; /* operations */ blkcg_pol_alloc_cpd_fn *cpd_alloc_fn; blkcg_pol_free_cpd_fn *cpd_free_fn; blkcg_pol_alloc_pd_fn *pd_alloc_fn; blkcg_pol_init_pd_fn *pd_init_fn; blkcg_pol_online_pd_fn *pd_online_fn; blkcg_pol_offline_pd_fn *pd_offline_fn; blkcg_pol_free_pd_fn *pd_free_fn; blkcg_pol_reset_pd_stats_fn *pd_reset_stats_fn; blkcg_pol_stat_pd_fn *pd_stat_fn; }; extern struct blkcg blkcg_root; extern bool blkcg_debug_stats; int blkcg_init_disk(struct gendisk *disk); void blkcg_exit_disk(struct gendisk *disk); /* Blkio controller policy registration */ int blkcg_policy_register(struct blkcg_policy *pol); void blkcg_policy_unregister(struct blkcg_policy *pol); int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol); void blkcg_deactivate_policy(struct gendisk *disk, const struct blkcg_policy *pol); const char *blkg_dev_name(struct blkcg_gq *blkg); void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg, u64 (*prfill)(struct seq_file *, struct blkg_policy_data *, int), const struct blkcg_policy *pol, int data, bool show_total); u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v); struct blkg_conf_ctx { char *input; char *body; struct block_device *bdev; struct blkcg_gq *blkg; }; void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input); int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx); int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol, struct blkg_conf_ctx *ctx); void blkg_conf_exit(struct blkg_conf_ctx *ctx); /** * bio_issue_as_root_blkg - see if this bio needs to be issued as root blkg * @return: true if this bio needs to be submitted with the root blkg context. * * In order to avoid priority inversions we sometimes need to issue a bio as if * it were attached to the root blkg, and then backcharge to the actual owning * blkg. The idea is we do bio_blkcg_css() to look up the actual context for * the bio and attach the appropriate blkg to the bio. Then we call this helper * and if it is true run with the root blkg for that queue and then do any * backcharging to the originating cgroup once the io is complete. */ static inline bool bio_issue_as_root_blkg(struct bio *bio) { return (bio->bi_opf & (REQ_META | REQ_SWAP)) != 0; } /** * blkg_lookup - lookup blkg for the specified blkcg - q pair * @blkcg: blkcg of interest * @q: request_queue of interest * * Lookup blkg for the @blkcg - @q pair. * Must be called in a RCU critical section. */ static inline struct blkcg_gq *blkg_lookup(struct blkcg *blkcg, struct request_queue *q) { struct blkcg_gq *blkg; if (blkcg == &blkcg_root) return q->root_blkg; blkg = rcu_dereference_check(blkcg->blkg_hint, lockdep_is_held(&q->queue_lock)); if (blkg && blkg->q == q) return blkg; blkg = radix_tree_lookup(&blkcg->blkg_tree, q->id); if (blkg && blkg->q != q) blkg = NULL; return blkg; } /** * blkg_to_pdata - get policy private data * @blkg: blkg of interest * @pol: policy of interest * * Return pointer to private data associated with the @blkg-@pol pair. */ static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg, struct blkcg_policy *pol) { return blkg ? blkg->pd[pol->plid] : NULL; } static inline struct blkcg_policy_data *blkcg_to_cpd(struct blkcg *blkcg, struct blkcg_policy *pol) { return blkcg ? blkcg->cpd[pol->plid] : NULL; } /** * pdata_to_blkg - get blkg associated with policy private data * @pd: policy private data of interest * * @pd is policy private data. Determine the blkg it's associated with. */ static inline struct blkcg_gq *pd_to_blkg(struct blkg_policy_data *pd) { return pd ? pd->blkg : NULL; } static inline struct blkcg *cpd_to_blkcg(struct blkcg_policy_data *cpd) { return cpd ? cpd->blkcg : NULL; } /** * blkg_path - format cgroup path of blkg * @blkg: blkg of interest * @buf: target buffer * @buflen: target buffer length * * Format the path of the cgroup of @blkg into @buf. */ static inline int blkg_path(struct blkcg_gq *blkg, char *buf, int buflen) { return cgroup_path(blkg->blkcg->css.cgroup, buf, buflen); } /** * blkg_get - get a blkg reference * @blkg: blkg to get * * The caller should be holding an existing reference. */ static inline void blkg_get(struct blkcg_gq *blkg) { percpu_ref_get(&blkg->refcnt); } /** * blkg_tryget - try and get a blkg reference * @blkg: blkg to get * * This is for use when doing an RCU lookup of the blkg. We may be in the midst * of freeing this blkg, so we can only use it if the refcnt is not zero. */ static inline bool blkg_tryget(struct blkcg_gq *blkg) { return blkg && percpu_ref_tryget(&blkg->refcnt); } /** * blkg_put - put a blkg reference * @blkg: blkg to put */ static inline void blkg_put(struct blkcg_gq *blkg) { percpu_ref_put(&blkg->refcnt); } /** * blkg_for_each_descendant_pre - pre-order walk of a blkg's descendants * @d_blkg: loop cursor pointing to the current descendant * @pos_css: used for iteration * @p_blkg: target blkg to walk descendants of * * Walk @c_blkg through the descendants of @p_blkg. Must be used with RCU * read locked. If called under either blkcg or queue lock, the iteration * is guaranteed to include all and only online blkgs. The caller may * update @pos_css by calling css_rightmost_descendant() to skip subtree. * @p_blkg is included in the iteration and the first node to be visited. */ #define blkg_for_each_descendant_pre(d_blkg, pos_css, p_blkg) \ css_for_each_descendant_pre((pos_css), &(p_blkg)->blkcg->css) \ if (((d_blkg) = blkg_lookup(css_to_blkcg(pos_css), \ (p_blkg)->q))) /** * blkg_for_each_descendant_post - post-order walk of a blkg's descendants * @d_blkg: loop cursor pointing to the current descendant * @pos_css: used for iteration * @p_blkg: target blkg to walk descendants of * * Similar to blkg_for_each_descendant_pre() but performs post-order * traversal instead. Synchronization rules are the same. @p_blkg is * included in the iteration and the last node to be visited. */ #define blkg_for_each_descendant_post(d_blkg, pos_css, p_blkg) \ css_for_each_descendant_post((pos_css), &(p_blkg)->blkcg->css) \ if (((d_blkg) = blkg_lookup(css_to_blkcg(pos_css), \ (p_blkg)->q))) static inline void blkcg_bio_issue_init(struct bio *bio) { bio_issue_init(&bio->bi_issue, bio_sectors(bio)); } static inline void blkcg_use_delay(struct blkcg_gq *blkg) { if (WARN_ON_ONCE(atomic_read(&blkg->use_delay) < 0)) return; if (atomic_add_return(1, &blkg->use_delay) == 1) atomic_inc(&blkg->blkcg->css.cgroup->congestion_count); } static inline int blkcg_unuse_delay(struct blkcg_gq *blkg) { int old = atomic_read(&blkg->use_delay); if (WARN_ON_ONCE(old < 0)) return 0; if (old == 0) return 0; /* * We do this song and dance because we can race with somebody else * adding or removing delay. If we just did an atomic_dec we'd end up * negative and we'd already be in trouble. We need to subtract 1 and * then check to see if we were the last delay so we can drop the * congestion count on the cgroup. */ while (old && !atomic_try_cmpxchg(&blkg->use_delay, &old, old - 1)) ; if (old == 0) return 0; if (old == 1) atomic_dec(&blkg->blkcg->css.cgroup->congestion_count); return 1; } /** * blkcg_set_delay - Enable allocator delay mechanism with the specified delay amount * @blkg: target blkg * @delay: delay duration in nsecs * * When enabled with this function, the delay is not decayed and must be * explicitly cleared with blkcg_clear_delay(). Must not be mixed with * blkcg_[un]use_delay() and blkcg_add_delay() usages. */ static inline void blkcg_set_delay(struct blkcg_gq *blkg, u64 delay) { int old = atomic_read(&blkg->use_delay); /* We only want 1 person setting the congestion count for this blkg. */ if (!old && atomic_try_cmpxchg(&blkg->use_delay, &old, -1)) atomic_inc(&blkg->blkcg->css.cgroup->congestion_count); atomic64_set(&blkg->delay_nsec, delay); } /** * blkcg_clear_delay - Disable allocator delay mechanism * @blkg: target blkg * * Disable use_delay mechanism. See blkcg_set_delay(). */ static inline void blkcg_clear_delay(struct blkcg_gq *blkg) { int old = atomic_read(&blkg->use_delay); /* We only want 1 person clearing the congestion count for this blkg. */ if (old && atomic_try_cmpxchg(&blkg->use_delay, &old, 0)) atomic_dec(&blkg->blkcg->css.cgroup->congestion_count); } /** * blk_cgroup_mergeable - Determine whether to allow or disallow merges * @rq: request to merge into * @bio: bio to merge * * @bio and @rq should belong to the same cgroup and their issue_as_root should * match. The latter is necessary as we don't want to throttle e.g. a metadata * update because it happens to be next to a regular IO. */ static inline bool blk_cgroup_mergeable(struct request *rq, struct bio *bio) { return rq->bio->bi_blkg == bio->bi_blkg && bio_issue_as_root_blkg(rq->bio) == bio_issue_as_root_blkg(bio); } void blk_cgroup_bio_start(struct bio *bio); void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta); #else /* CONFIG_BLK_CGROUP */ struct blkg_policy_data { }; struct blkcg_policy_data { }; struct blkcg_policy { }; struct blkcg { }; static inline struct blkcg_gq *blkg_lookup(struct blkcg *blkcg, void *key) { return NULL; } static inline int blkcg_init_disk(struct gendisk *disk) { return 0; } static inline void blkcg_exit_disk(struct gendisk *disk) { } static inline int blkcg_policy_register(struct blkcg_policy *pol) { return 0; } static inline void blkcg_policy_unregister(struct blkcg_policy *pol) { } static inline int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol) { return 0; } static inline void blkcg_deactivate_policy(struct gendisk *disk, const struct blkcg_policy *pol) { } static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg, struct blkcg_policy *pol) { return NULL; } static inline struct blkcg_gq *pd_to_blkg(struct blkg_policy_data *pd) { return NULL; } static inline char *blkg_path(struct blkcg_gq *blkg) { return NULL; } static inline void blkg_get(struct blkcg_gq *blkg) { } static inline void blkg_put(struct blkcg_gq *blkg) { } static inline void blkcg_bio_issue_init(struct bio *bio) { } static inline void blk_cgroup_bio_start(struct bio *bio) { } static inline bool blk_cgroup_mergeable(struct request *rq, struct bio *bio) { return true; } #define blk_queue_for_each_rl(rl, q) \ for ((rl) = &(q)->root_rl; (rl); (rl) = NULL) #endif /* CONFIG_BLK_CGROUP */ #endif /* _BLK_CGROUP_PRIVATE_H */ |
4 4 4 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 | /* * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include <linux/security.h> #include <linux/completion.h> #include <linux/list.h> #include <rdma/ib_verbs.h> #include <rdma/ib_cache.h> #include "core_priv.h" #include "mad_priv.h" static LIST_HEAD(mad_agent_list); /* Lock to protect mad_agent_list */ static DEFINE_SPINLOCK(mad_agent_list_lock); static struct pkey_index_qp_list *get_pkey_idx_qp_list(struct ib_port_pkey *pp) { struct pkey_index_qp_list *pkey = NULL; struct pkey_index_qp_list *tmp_pkey; struct ib_device *dev = pp->sec->dev; spin_lock(&dev->port_data[pp->port_num].pkey_list_lock); list_for_each_entry (tmp_pkey, &dev->port_data[pp->port_num].pkey_list, pkey_index_list) { if (tmp_pkey->pkey_index == pp->pkey_index) { pkey = tmp_pkey; break; } } spin_unlock(&dev->port_data[pp->port_num].pkey_list_lock); return pkey; } static int get_pkey_and_subnet_prefix(struct ib_port_pkey *pp, u16 *pkey, u64 *subnet_prefix) { struct ib_device *dev = pp->sec->dev; int ret; ret = ib_get_cached_pkey(dev, pp->port_num, pp->pkey_index, pkey); if (ret) return ret; ib_get_cached_subnet_prefix(dev, pp->port_num, subnet_prefix); return ret; } static int enforce_qp_pkey_security(u16 pkey, u64 subnet_prefix, struct ib_qp_security *qp_sec) { struct ib_qp_security *shared_qp_sec; int ret; ret = security_ib_pkey_access(qp_sec->security, subnet_prefix, pkey); if (ret) return ret; list_for_each_entry(shared_qp_sec, &qp_sec->shared_qp_list, shared_qp_list) { ret = security_ib_pkey_access(shared_qp_sec->security, subnet_prefix, pkey); if (ret) return ret; } return 0; } /* The caller of this function must hold the QP security * mutex of the QP of the security structure in *pps. * * It takes separate ports_pkeys and security structure * because in some cases the pps will be for a new settings * or the pps will be for the real QP and security structure * will be for a shared QP. */ static int check_qp_port_pkey_settings(struct ib_ports_pkeys *pps, struct ib_qp_security *sec) { u64 subnet_prefix; u16 pkey; int ret = 0; if (!pps) return 0; if (pps->main.state != IB_PORT_PKEY_NOT_VALID) { ret = get_pkey_and_subnet_prefix(&pps->main, &pkey, &subnet_prefix); if (ret) return ret; ret = enforce_qp_pkey_security(pkey, subnet_prefix, sec); if (ret) return ret; } if (pps->alt.state != IB_PORT_PKEY_NOT_VALID) { ret = get_pkey_and_subnet_prefix(&pps->alt, &pkey, &subnet_prefix); if (ret) return ret; ret = enforce_qp_pkey_security(pkey, subnet_prefix, sec); } return ret; } /* The caller of this function must hold the QP security * mutex. */ static void qp_to_error(struct ib_qp_security *sec) { struct ib_qp_security *shared_qp_sec; struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR }; struct ib_event event = { .event = IB_EVENT_QP_FATAL }; /* If the QP is in the process of being destroyed * the qp pointer in the security structure is * undefined. It cannot be modified now. */ if (sec->destroying) return; ib_modify_qp(sec->qp, &attr, IB_QP_STATE); if (sec->qp->event_handler && sec->qp->qp_context) { event.element.qp = sec->qp; sec->qp->event_handler(&event, sec->qp->qp_context); } list_for_each_entry(shared_qp_sec, &sec->shared_qp_list, shared_qp_list) { struct ib_qp *qp = shared_qp_sec->qp; if (qp->event_handler && qp->qp_context) { event.element.qp = qp; event.device = qp->device; qp->event_handler(&event, qp->qp_context); } } } static inline void check_pkey_qps(struct pkey_index_qp_list *pkey, struct ib_device *device, u32 port_num, u64 subnet_prefix) { struct ib_port_pkey *pp, *tmp_pp; bool comp; LIST_HEAD(to_error_list); u16 pkey_val; if (!ib_get_cached_pkey(device, port_num, pkey->pkey_index, &pkey_val)) { spin_lock(&pkey->qp_list_lock); list_for_each_entry(pp, &pkey->qp_list, qp_list) { if (atomic_read(&pp->sec->error_list_count)) continue; if (enforce_qp_pkey_security(pkey_val, subnet_prefix, pp->sec)) { atomic_inc(&pp->sec->error_list_count); list_add(&pp->to_error_list, &to_error_list); } } spin_unlock(&pkey->qp_list_lock); } list_for_each_entry_safe(pp, tmp_pp, &to_error_list, to_error_list) { mutex_lock(&pp->sec->mutex); qp_to_error(pp->sec); list_del(&pp->to_error_list); atomic_dec(&pp->sec->error_list_count); comp = pp->sec->destroying; mutex_unlock(&pp->sec->mutex); if (comp) complete(&pp->sec->error_complete); } } /* The caller of this function must hold the QP security * mutex. */ static int port_pkey_list_insert(struct ib_port_pkey *pp) { struct pkey_index_qp_list *tmp_pkey; struct pkey_index_qp_list *pkey; struct ib_device *dev; u32 port_num = pp->port_num; int ret = 0; if (pp->state != IB_PORT_PKEY_VALID) return 0; dev = pp->sec->dev; pkey = get_pkey_idx_qp_list(pp); if (!pkey) { bool found = false; pkey = kzalloc(sizeof(*pkey), GFP_KERNEL); if (!pkey) return -ENOMEM; spin_lock(&dev->port_data[port_num].pkey_list_lock); /* Check for the PKey again. A racing process may * have created it. */ list_for_each_entry(tmp_pkey, &dev->port_data[port_num].pkey_list, pkey_index_list) { if (tmp_pkey->pkey_index == pp->pkey_index) { kfree(pkey); pkey = tmp_pkey; found = true; break; } } if (!found) { pkey->pkey_index = pp->pkey_index; spin_lock_init(&pkey->qp_list_lock); INIT_LIST_HEAD(&pkey->qp_list); list_add(&pkey->pkey_index_list, &dev->port_data[port_num].pkey_list); } spin_unlock(&dev->port_data[port_num].pkey_list_lock); } spin_lock(&pkey->qp_list_lock); list_add(&pp->qp_list, &pkey->qp_list); spin_unlock(&pkey->qp_list_lock); pp->state = IB_PORT_PKEY_LISTED; return ret; } /* The caller of this function must hold the QP security * mutex. */ static void port_pkey_list_remove(struct ib_port_pkey *pp) { struct pkey_index_qp_list *pkey; if (pp->state != IB_PORT_PKEY_LISTED) return; pkey = get_pkey_idx_qp_list(pp); spin_lock(&pkey->qp_list_lock); list_del(&pp->qp_list); spin_unlock(&pkey->qp_list_lock); /* The setting may still be valid, i.e. after * a destroy has failed for example. */ pp->state = IB_PORT_PKEY_VALID; } static void destroy_qp_security(struct ib_qp_security *sec) { security_ib_free_security(sec->security); kfree(sec->ports_pkeys); kfree(sec); } /* The caller of this function must hold the QP security * mutex. */ static struct ib_ports_pkeys *get_new_pps(const struct ib_qp *qp, const struct ib_qp_attr *qp_attr, int qp_attr_mask) { struct ib_ports_pkeys *new_pps; struct ib_ports_pkeys *qp_pps = qp->qp_sec->ports_pkeys; new_pps = kzalloc(sizeof(*new_pps), GFP_KERNEL); if (!new_pps) return NULL; if (qp_attr_mask & IB_QP_PORT) new_pps->main.port_num = qp_attr->port_num; else if (qp_pps) new_pps->main.port_num = qp_pps->main.port_num; if (qp_attr_mask & IB_QP_PKEY_INDEX) new_pps->main.pkey_index = qp_attr->pkey_index; else if (qp_pps) new_pps->main.pkey_index = qp_pps->main.pkey_index; if (((qp_attr_mask & IB_QP_PKEY_INDEX) && (qp_attr_mask & IB_QP_PORT)) || (qp_pps && qp_pps->main.state != IB_PORT_PKEY_NOT_VALID)) new_pps->main.state = IB_PORT_PKEY_VALID; if (qp_attr_mask & IB_QP_ALT_PATH) { new_pps->alt.port_num = qp_attr->alt_port_num; new_pps->alt.pkey_index = qp_attr->alt_pkey_index; new_pps->alt.state = IB_PORT_PKEY_VALID; } else if (qp_pps) { new_pps->alt.port_num = qp_pps->alt.port_num; new_pps->alt.pkey_index = qp_pps->alt.pkey_index; if (qp_pps->alt.state != IB_PORT_PKEY_NOT_VALID) new_pps->alt.state = IB_PORT_PKEY_VALID; } new_pps->main.sec = qp->qp_sec; new_pps->alt.sec = qp->qp_sec; return new_pps; } int ib_open_shared_qp_security(struct ib_qp *qp, struct ib_device *dev) { struct ib_qp *real_qp = qp->real_qp; int ret; ret = ib_create_qp_security(qp, dev); if (ret) return ret; if (!qp->qp_sec) return 0; mutex_lock(&real_qp->qp_sec->mutex); ret = check_qp_port_pkey_settings(real_qp->qp_sec->ports_pkeys, qp->qp_sec); if (ret) goto ret; if (qp != real_qp) list_add(&qp->qp_sec->shared_qp_list, &real_qp->qp_sec->shared_qp_list); ret: mutex_unlock(&real_qp->qp_sec->mutex); if (ret) destroy_qp_security(qp->qp_sec); return ret; } void ib_close_shared_qp_security(struct ib_qp_security *sec) { struct ib_qp *real_qp = sec->qp->real_qp; mutex_lock(&real_qp->qp_sec->mutex); list_del(&sec->shared_qp_list); mutex_unlock(&real_qp->qp_sec->mutex); destroy_qp_security(sec); } int ib_create_qp_security(struct ib_qp *qp, struct ib_device *dev) { unsigned int i; bool is_ib = false; int ret; rdma_for_each_port (dev, i) { is_ib = rdma_protocol_ib(dev, i); if (is_ib) break; } /* If this isn't an IB device don't create the security context */ if (!is_ib) return 0; qp->qp_sec = kzalloc(sizeof(*qp->qp_sec), GFP_KERNEL); if (!qp->qp_sec) return -ENOMEM; qp->qp_sec->qp = qp; qp->qp_sec->dev = dev; mutex_init(&qp->qp_sec->mutex); INIT_LIST_HEAD(&qp->qp_sec->shared_qp_list); atomic_set(&qp->qp_sec->error_list_count, 0); init_completion(&qp->qp_sec->error_complete); ret = security_ib_alloc_security(&qp->qp_sec->security); if (ret) { kfree(qp->qp_sec); qp->qp_sec = NULL; } return ret; } EXPORT_SYMBOL(ib_create_qp_security); void ib_destroy_qp_security_begin(struct ib_qp_security *sec) { /* Return if not IB */ if (!sec) return; mutex_lock(&sec->mutex); /* Remove the QP from the lists so it won't get added to * a to_error_list during the destroy process. */ if (sec->ports_pkeys) { port_pkey_list_remove(&sec->ports_pkeys->main); port_pkey_list_remove(&sec->ports_pkeys->alt); } /* If the QP is already in one or more of those lists * the destroying flag will ensure the to error flow * doesn't operate on an undefined QP. */ sec->destroying = true; /* Record the error list count to know how many completions * to wait for. */ sec->error_comps_pending = atomic_read(&sec->error_list_count); mutex_unlock(&sec->mutex); } void ib_destroy_qp_security_abort(struct ib_qp_security *sec) { int ret; int i; /* Return if not IB */ if (!sec) return; /* If a concurrent cache update is in progress this * QP security could be marked for an error state * transition. Wait for this to complete. */ for (i = 0; i < sec->error_comps_pending; i++) wait_for_completion(&sec->error_complete); mutex_lock(&sec->mutex); sec->destroying = false; /* Restore the position in the lists and verify * access is still allowed in case a cache update * occurred while attempting to destroy. * * Because these setting were listed already * and removed during ib_destroy_qp_security_begin * we know the pkey_index_qp_list for the PKey * already exists so port_pkey_list_insert won't fail. */ if (sec->ports_pkeys) { port_pkey_list_insert(&sec->ports_pkeys->main); port_pkey_list_insert(&sec->ports_pkeys->alt); } ret = check_qp_port_pkey_settings(sec->ports_pkeys, sec); if (ret) qp_to_error(sec); mutex_unlock(&sec->mutex); } void ib_destroy_qp_security_end(struct ib_qp_security *sec) { int i; /* Return if not IB */ if (!sec) return; /* If a concurrent cache update is occurring we must * wait until this QP security structure is processed * in the QP to error flow before destroying it because * the to_error_list is in use. */ for (i = 0; i < sec->error_comps_pending; i++) wait_for_completion(&sec->error_complete); destroy_qp_security(sec); } void ib_security_cache_change(struct ib_device *device, u32 port_num, u64 subnet_prefix) { struct pkey_index_qp_list *pkey; list_for_each_entry (pkey, &device->port_data[port_num].pkey_list, pkey_index_list) { check_pkey_qps(pkey, device, port_num, subnet_prefix); } } void ib_security_release_port_pkey_list(struct ib_device *device) { struct pkey_index_qp_list *pkey, *tmp_pkey; unsigned int i; rdma_for_each_port (device, i) { list_for_each_entry_safe(pkey, tmp_pkey, &device->port_data[i].pkey_list, pkey_index_list) { list_del(&pkey->pkey_index_list); kfree(pkey); } } } int ib_security_modify_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr, int qp_attr_mask, struct ib_udata *udata) { int ret = 0; struct ib_ports_pkeys *tmp_pps; struct ib_ports_pkeys *new_pps = NULL; struct ib_qp *real_qp = qp->real_qp; bool special_qp = (real_qp->qp_type == IB_QPT_SMI || real_qp->qp_type == IB_QPT_GSI || real_qp->qp_type >= IB_QPT_RESERVED1); bool pps_change = ((qp_attr_mask & (IB_QP_PKEY_INDEX | IB_QP_PORT)) || (qp_attr_mask & IB_QP_ALT_PATH)); WARN_ONCE((qp_attr_mask & IB_QP_PORT && rdma_protocol_ib(real_qp->device, qp_attr->port_num) && !real_qp->qp_sec), "%s: QP security is not initialized for IB QP: %u\n", __func__, real_qp->qp_num); /* The port/pkey settings are maintained only for the real QP. Open * handles on the real QP will be in the shared_qp_list. When * enforcing security on the real QP all the shared QPs will be * checked as well. */ if (pps_change && !special_qp && real_qp->qp_sec) { mutex_lock(&real_qp->qp_sec->mutex); new_pps = get_new_pps(real_qp, qp_attr, qp_attr_mask); if (!new_pps) { mutex_unlock(&real_qp->qp_sec->mutex); return -ENOMEM; } /* Add this QP to the lists for the new port * and pkey settings before checking for permission * in case there is a concurrent cache update * occurring. Walking the list for a cache change * doesn't acquire the security mutex unless it's * sending the QP to error. */ ret = port_pkey_list_insert(&new_pps->main); if (!ret) ret = port_pkey_list_insert(&new_pps->alt); if (!ret) ret = check_qp_port_pkey_settings(new_pps, real_qp->qp_sec); } if (!ret) ret = real_qp->device->ops.modify_qp(real_qp, qp_attr, qp_attr_mask, udata); if (new_pps) { /* Clean up the lists and free the appropriate * ports_pkeys structure. */ if (ret) { tmp_pps = new_pps; } else { tmp_pps = real_qp->qp_sec->ports_pkeys; real_qp->qp_sec->ports_pkeys = new_pps; } if (tmp_pps) { port_pkey_list_remove(&tmp_pps->main); port_pkey_list_remove(&tmp_pps->alt); } kfree(tmp_pps); mutex_unlock(&real_qp->qp_sec->mutex); } return ret; } static int ib_security_pkey_access(struct ib_device *dev, u32 port_num, u16 pkey_index, void *sec) { u64 subnet_prefix; u16 pkey; int ret; if (!rdma_protocol_ib(dev, port_num)) return 0; ret = ib_get_cached_pkey(dev, port_num, pkey_index, &pkey); if (ret) return ret; ib_get_cached_subnet_prefix(dev, port_num, &subnet_prefix); return security_ib_pkey_access(sec, subnet_prefix, pkey); } void ib_mad_agent_security_change(void) { struct ib_mad_agent *ag; spin_lock(&mad_agent_list_lock); list_for_each_entry(ag, &mad_agent_list, mad_agent_sec_list) WRITE_ONCE(ag->smp_allowed, !security_ib_endport_manage_subnet(ag->security, dev_name(&ag->device->dev), ag->port_num)); spin_unlock(&mad_agent_list_lock); } int ib_mad_agent_security_setup(struct ib_mad_agent *agent, enum ib_qp_type qp_type) { int ret; if (!rdma_protocol_ib(agent->device, agent->port_num)) return 0; INIT_LIST_HEAD(&agent->mad_agent_sec_list); ret = security_ib_alloc_security(&agent->security); if (ret) return ret; if (qp_type != IB_QPT_SMI) return 0; spin_lock(&mad_agent_list_lock); ret = security_ib_endport_manage_subnet(agent->security, dev_name(&agent->device->dev), agent->port_num); if (ret) goto free_security; WRITE_ONCE(agent->smp_allowed, true); list_add(&agent->mad_agent_sec_list, &mad_agent_list); spin_unlock(&mad_agent_list_lock); return 0; free_security: spin_unlock(&mad_agent_list_lock); security_ib_free_security(agent->security); return ret; } void ib_mad_agent_security_cleanup(struct ib_mad_agent *agent) { if (!rdma_protocol_ib(agent->device, agent->port_num)) return; if (agent->qp->qp_type == IB_QPT_SMI) { spin_lock(&mad_agent_list_lock); list_del(&agent->mad_agent_sec_list); spin_unlock(&mad_agent_list_lock); } security_ib_free_security(agent->security); } int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index) { if (!rdma_protocol_ib(map->agent.device, map->agent.port_num)) return 0; if (map->agent.qp->qp_type == IB_QPT_SMI) { if (!READ_ONCE(map->agent.smp_allowed)) return -EACCES; return 0; } return ib_security_pkey_access(map->agent.device, map->agent.port_num, pkey_index, map->agent.security); } |
4 4 4 4 4 41 18 4 14 1 7 8 8 2 6 4 15 8 9 2 7 73 9 10 1 2 2 1 2 1 3 1 1 2 1 1 1 1 2 2 2 5 5 16 16 2 2 6 6 9 8 9 10 2 2 1 1 5 5 5 5 1 1 1 1 2 1 3 87 90 93 15 44 94 45 44 90 93 3 1 2 4 1 3 1 3 22 22 8 8 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 | // SPDX-License-Identifier: GPL-2.0-only /* * V4L2 sub-device * * Copyright (C) 2010 Nokia Corporation * * Contact: Laurent Pinchart <laurent.pinchart@ideasonboard.com> * Sakari Ailus <sakari.ailus@iki.fi> */ #include <linux/export.h> #include <linux/ioctl.h> #include <linux/leds.h> #include <linux/mm.h> #include <linux/module.h> #include <linux/overflow.h> #include <linux/slab.h> #include <linux/string.h> #include <linux/types.h> #include <linux/version.h> #include <linux/videodev2.h> #include <media/v4l2-ctrls.h> #include <media/v4l2-device.h> #include <media/v4l2-event.h> #include <media/v4l2-fh.h> #include <media/v4l2-ioctl.h> #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API) /* * The Streams API is an experimental feature. To use the Streams API, set * 'v4l2_subdev_enable_streams_api' to 1 below. */ static bool v4l2_subdev_enable_streams_api; #endif /* * Maximum stream ID is 63 for now, as we use u64 bitmask to represent a set * of streams. * * Note that V4L2_FRAME_DESC_ENTRY_MAX is related: V4L2_FRAME_DESC_ENTRY_MAX * restricts the total number of streams in a pad, although the stream ID is * not restricted. */ #define V4L2_SUBDEV_MAX_STREAM_ID 63 #include "v4l2-subdev-priv.h" #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API) static int subdev_fh_init(struct v4l2_subdev_fh *fh, struct v4l2_subdev *sd) { struct v4l2_subdev_state *state; static struct lock_class_key key; state = __v4l2_subdev_state_alloc(sd, "fh->state->lock", &key); if (IS_ERR(state)) return PTR_ERR(state); fh->state = state; return 0; } static void subdev_fh_free(struct v4l2_subdev_fh *fh) { __v4l2_subdev_state_free(fh->state); fh->state = NULL; } static int subdev_open(struct file *file) { struct video_device *vdev = video_devdata(file); struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev); struct v4l2_subdev_fh *subdev_fh; int ret; subdev_fh = kzalloc(sizeof(*subdev_fh), GFP_KERNEL); if (subdev_fh == NULL) return -ENOMEM; ret = subdev_fh_init(subdev_fh, sd); if (ret) { kfree(subdev_fh); return ret; } v4l2_fh_init(&subdev_fh->vfh, vdev); v4l2_fh_add(&subdev_fh->vfh); file->private_data = &subdev_fh->vfh; if (sd->v4l2_dev->mdev && sd->entity.graph_obj.mdev->dev) { struct module *owner; owner = sd->entity.graph_obj.mdev->dev->driver->owner; if (!try_module_get(owner)) { ret = -EBUSY; goto err; } subdev_fh->owner = owner; } if (sd->internal_ops && sd->internal_ops->open) { ret = sd->internal_ops->open(sd, subdev_fh); if (ret < 0) goto err; } return 0; err: module_put(subdev_fh->owner); v4l2_fh_del(&subdev_fh->vfh); v4l2_fh_exit(&subdev_fh->vfh); subdev_fh_free(subdev_fh); kfree(subdev_fh); return ret; } static int subdev_close(struct file *file) { struct video_device *vdev = video_devdata(file); struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev); struct v4l2_fh *vfh = file->private_data; struct v4l2_subdev_fh *subdev_fh = to_v4l2_subdev_fh(vfh); if (sd->internal_ops && sd->internal_ops->close) sd->internal_ops->close(sd, subdev_fh); module_put(subdev_fh->owner); v4l2_fh_del(vfh); v4l2_fh_exit(vfh); subdev_fh_free(subdev_fh); kfree(subdev_fh); file->private_data = NULL; return 0; } #else /* CONFIG_VIDEO_V4L2_SUBDEV_API */ static int subdev_open(struct file *file) { return -ENODEV; } static int subdev_close(struct file *file) { return -ENODEV; } #endif /* CONFIG_VIDEO_V4L2_SUBDEV_API */ static inline int check_which(u32 which) { if (which != V4L2_SUBDEV_FORMAT_TRY && which != V4L2_SUBDEV_FORMAT_ACTIVE) return -EINVAL; return 0; } static inline int check_pad(struct v4l2_subdev *sd, u32 pad) { #if defined(CONFIG_MEDIA_CONTROLLER) if (sd->entity.num_pads) { if (pad >= sd->entity.num_pads) return -EINVAL; return 0; } #endif /* allow pad 0 on subdevices not registered as media entities */ if (pad > 0) return -EINVAL; return 0; } static int check_state(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, u32 which, u32 pad, u32 stream) { if (sd->flags & V4L2_SUBDEV_FL_STREAMS) { #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API) if (!v4l2_subdev_state_get_format(state, pad, stream)) return -EINVAL; return 0; #else return -EINVAL; #endif } if (stream != 0) return -EINVAL; if (which == V4L2_SUBDEV_FORMAT_TRY && (!state || !state->pads)) return -EINVAL; return 0; } static inline int check_format(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_format *format) { if (!format) return -EINVAL; return check_which(format->which) ? : check_pad(sd, format->pad) ? : check_state(sd, state, format->which, format->pad, format->stream); } static int call_get_fmt(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_format *format) { return check_format(sd, state, format) ? : sd->ops->pad->get_fmt(sd, state, format); } static int call_set_fmt(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_format *format) { return check_format(sd, state, format) ? : sd->ops->pad->set_fmt(sd, state, format); } static int call_enum_mbus_code(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_mbus_code_enum *code) { if (!code) return -EINVAL; return check_which(code->which) ? : check_pad(sd, code->pad) ? : check_state(sd, state, code->which, code->pad, code->stream) ? : sd->ops->pad->enum_mbus_code(sd, state, code); } static int call_enum_frame_size(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_frame_size_enum *fse) { if (!fse) return -EINVAL; return check_which(fse->which) ? : check_pad(sd, fse->pad) ? : check_state(sd, state, fse->which, fse->pad, fse->stream) ? : sd->ops->pad->enum_frame_size(sd, state, fse); } static int call_enum_frame_interval(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_frame_interval_enum *fie) { if (!fie) return -EINVAL; return check_which(fie->which) ? : check_pad(sd, fie->pad) ? : check_state(sd, state, fie->which, fie->pad, fie->stream) ? : sd->ops->pad->enum_frame_interval(sd, state, fie); } static inline int check_selection(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_selection *sel) { if (!sel) return -EINVAL; return check_which(sel->which) ? : check_pad(sd, sel->pad) ? : check_state(sd, state, sel->which, sel->pad, sel->stream); } static int call_get_selection(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_selection *sel) { return check_selection(sd, state, sel) ? : sd->ops->pad->get_selection(sd, state, sel); } static int call_set_selection(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_selection *sel) { return check_selection(sd, state, sel) ? : sd->ops->pad->set_selection(sd, state, sel); } static inline int check_frame_interval(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_frame_interval *fi) { if (!fi) return -EINVAL; return check_which(fi->which) ? : check_pad(sd, fi->pad) ? : check_state(sd, state, fi->which, fi->pad, fi->stream); } static int call_get_frame_interval(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_frame_interval *fi) { return check_frame_interval(sd, state, fi) ? : sd->ops->pad->get_frame_interval(sd, state, fi); } static int call_set_frame_interval(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_frame_interval *fi) { return check_frame_interval(sd, state, fi) ? : sd->ops->pad->set_frame_interval(sd, state, fi); } static int call_get_frame_desc(struct v4l2_subdev *sd, unsigned int pad, struct v4l2_mbus_frame_desc *fd) { unsigned int i; int ret; memset(fd, 0, sizeof(*fd)); ret = sd->ops->pad->get_frame_desc(sd, pad, fd); if (ret) return ret; dev_dbg(sd->dev, "Frame descriptor on pad %u, type %s\n", pad, fd->type == V4L2_MBUS_FRAME_DESC_TYPE_PARALLEL ? "parallel" : fd->type == V4L2_MBUS_FRAME_DESC_TYPE_CSI2 ? "CSI-2" : "unknown"); for (i = 0; i < fd->num_entries; i++) { struct v4l2_mbus_frame_desc_entry *entry = &fd->entry[i]; char buf[20] = ""; if (fd->type == V4L2_MBUS_FRAME_DESC_TYPE_CSI2) WARN_ON(snprintf(buf, sizeof(buf), ", vc %u, dt 0x%02x", entry->bus.csi2.vc, entry->bus.csi2.dt) >= sizeof(buf)); dev_dbg(sd->dev, "\tstream %u, code 0x%04x, length %u, flags 0x%04x%s\n", entry->stream, entry->pixelcode, entry->length, entry->flags, buf); } return 0; } static inline int check_edid(struct v4l2_subdev *sd, struct v4l2_subdev_edid *edid) { if (!edid) return -EINVAL; if (edid->blocks && edid->edid == NULL) return -EINVAL; return check_pad(sd, edid->pad); } static int call_get_edid(struct v4l2_subdev *sd, struct v4l2_subdev_edid *edid) { return check_edid(sd, edid) ? : sd->ops->pad->get_edid(sd, edid); } static int call_set_edid(struct v4l2_subdev *sd, struct v4l2_subdev_edid *edid) { return check_edid(sd, edid) ? : sd->ops->pad->set_edid(sd, edid); } static int call_dv_timings_cap(struct v4l2_subdev *sd, struct v4l2_dv_timings_cap *cap) { if (!cap) return -EINVAL; return check_pad(sd, cap->pad) ? : sd->ops->pad->dv_timings_cap(sd, cap); } static int call_enum_dv_timings(struct v4l2_subdev *sd, struct v4l2_enum_dv_timings *dvt) { if (!dvt) return -EINVAL; return check_pad(sd, dvt->pad) ? : sd->ops->pad->enum_dv_timings(sd, dvt); } static int call_get_mbus_config(struct v4l2_subdev *sd, unsigned int pad, struct v4l2_mbus_config *config) { return check_pad(sd, pad) ? : sd->ops->pad->get_mbus_config(sd, pad, config); } static int call_s_stream(struct v4l2_subdev *sd, int enable) { int ret; /* * The .s_stream() operation must never be called to start or stop an * already started or stopped subdev. Catch offenders but don't return * an error yet to avoid regressions. * * As .s_stream() is mutually exclusive with the .enable_streams() and * .disable_streams() operation, we can use the enabled_streams field * to store the subdev streaming state. */ if (WARN_ON(!!sd->enabled_streams == !!enable)) return 0; #if IS_REACHABLE(CONFIG_LEDS_CLASS) if (!IS_ERR_OR_NULL(sd->privacy_led)) { if (enable) led_set_brightness(sd->privacy_led, sd->privacy_led->max_brightness); else led_set_brightness(sd->privacy_led, 0); } #endif ret = sd->ops->video->s_stream(sd, enable); if (!enable && ret < 0) { dev_warn(sd->dev, "disabling streaming failed (%d)\n", ret); ret = 0; } if (!ret) sd->enabled_streams = enable ? BIT(0) : 0; return ret; } #ifdef CONFIG_MEDIA_CONTROLLER /* * Create state-management wrapper for pad ops dealing with subdev state. The * wrapper handles the case where the caller does not provide the called * subdev's state. This should be removed when all the callers are fixed. */ #define DEFINE_STATE_WRAPPER(f, arg_type) \ static int call_##f##_state(struct v4l2_subdev *sd, \ struct v4l2_subdev_state *_state, \ arg_type *arg) \ { \ struct v4l2_subdev_state *state = _state; \ int ret; \ if (!_state) \ state = v4l2_subdev_lock_and_get_active_state(sd); \ ret = call_##f(sd, state, arg); \ if (!_state && state) \ v4l2_subdev_unlock_state(state); \ return ret; \ } #else /* CONFIG_MEDIA_CONTROLLER */ #define DEFINE_STATE_WRAPPER(f, arg_type) \ static int call_##f##_state(struct v4l2_subdev *sd, \ struct v4l2_subdev_state *state, \ arg_type *arg) \ { \ return call_##f(sd, state, arg); \ } #endif /* CONFIG_MEDIA_CONTROLLER */ DEFINE_STATE_WRAPPER(get_fmt, struct v4l2_subdev_format); DEFINE_STATE_WRAPPER(set_fmt, struct v4l2_subdev_format); DEFINE_STATE_WRAPPER(enum_mbus_code, struct v4l2_subdev_mbus_code_enum); DEFINE_STATE_WRAPPER(enum_frame_size, struct v4l2_subdev_frame_size_enum); DEFINE_STATE_WRAPPER(enum_frame_interval, struct v4l2_subdev_frame_interval_enum); DEFINE_STATE_WRAPPER(get_selection, struct v4l2_subdev_selection); DEFINE_STATE_WRAPPER(set_selection, struct v4l2_subdev_selection); static const struct v4l2_subdev_pad_ops v4l2_subdev_call_pad_wrappers = { .get_fmt = call_get_fmt_state, .set_fmt = call_set_fmt_state, .enum_mbus_code = call_enum_mbus_code_state, .enum_frame_size = call_enum_frame_size_state, .enum_frame_interval = call_enum_frame_interval_state, .get_selection = call_get_selection_state, .set_selection = call_set_selection_state, .get_frame_interval = call_get_frame_interval, .set_frame_interval = call_set_frame_interval, .get_edid = call_get_edid, .set_edid = call_set_edid, .dv_timings_cap = call_dv_timings_cap, .enum_dv_timings = call_enum_dv_timings, .get_frame_desc = call_get_frame_desc, .get_mbus_config = call_get_mbus_config, }; static const struct v4l2_subdev_video_ops v4l2_subdev_call_video_wrappers = { .s_stream = call_s_stream, }; const struct v4l2_subdev_ops v4l2_subdev_call_wrappers = { .pad = &v4l2_subdev_call_pad_wrappers, .video = &v4l2_subdev_call_video_wrappers, }; EXPORT_SYMBOL(v4l2_subdev_call_wrappers); #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API) static struct v4l2_subdev_state * subdev_ioctl_get_state(struct v4l2_subdev *sd, struct v4l2_subdev_fh *subdev_fh, unsigned int cmd, void *arg) { u32 which; switch (cmd) { default: return NULL; case VIDIOC_SUBDEV_G_FMT: case VIDIOC_SUBDEV_S_FMT: which = ((struct v4l2_subdev_format *)arg)->which; break; case VIDIOC_SUBDEV_G_CROP: case VIDIOC_SUBDEV_S_CROP: which = ((struct v4l2_subdev_crop *)arg)->which; break; case VIDIOC_SUBDEV_ENUM_MBUS_CODE: which = ((struct v4l2_subdev_mbus_code_enum *)arg)->which; break; case VIDIOC_SUBDEV_ENUM_FRAME_SIZE: which = ((struct v4l2_subdev_frame_size_enum *)arg)->which; break; case VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL: which = ((struct v4l2_subdev_frame_interval_enum *)arg)->which; break; case VIDIOC_SUBDEV_G_SELECTION: case VIDIOC_SUBDEV_S_SELECTION: which = ((struct v4l2_subdev_selection *)arg)->which; break; case VIDIOC_SUBDEV_G_FRAME_INTERVAL: case VIDIOC_SUBDEV_S_FRAME_INTERVAL: { struct v4l2_subdev_frame_interval *fi = arg; if (!(subdev_fh->client_caps & V4L2_SUBDEV_CLIENT_CAP_INTERVAL_USES_WHICH)) fi->which = V4L2_SUBDEV_FORMAT_ACTIVE; which = fi->which; break; } case VIDIOC_SUBDEV_G_ROUTING: case VIDIOC_SUBDEV_S_ROUTING: which = ((struct v4l2_subdev_routing *)arg)->which; break; } return which == V4L2_SUBDEV_FORMAT_TRY ? subdev_fh->state : v4l2_subdev_get_unlocked_active_state(sd); } static long subdev_do_ioctl(struct file *file, unsigned int cmd, void *arg, struct v4l2_subdev_state *state) { struct video_device *vdev = video_devdata(file); struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev); struct v4l2_fh *vfh = file->private_data; struct v4l2_subdev_fh *subdev_fh = to_v4l2_subdev_fh(vfh); bool ro_subdev = test_bit(V4L2_FL_SUBDEV_RO_DEVNODE, &vdev->flags); bool streams_subdev = sd->flags & V4L2_SUBDEV_FL_STREAMS; bool client_supports_streams = subdev_fh->client_caps & V4L2_SUBDEV_CLIENT_CAP_STREAMS; int rval; /* * If the streams API is not enabled, remove V4L2_SUBDEV_CAP_STREAMS. * Remove this when the API is no longer experimental. */ if (!v4l2_subdev_enable_streams_api) streams_subdev = false; switch (cmd) { case VIDIOC_SUBDEV_QUERYCAP: { struct v4l2_subdev_capability *cap = arg; memset(cap->reserved, 0, sizeof(cap->reserved)); cap->version = LINUX_VERSION_CODE; cap->capabilities = (ro_subdev ? V4L2_SUBDEV_CAP_RO_SUBDEV : 0) | (streams_subdev ? V4L2_SUBDEV_CAP_STREAMS : 0); return 0; } case VIDIOC_QUERYCTRL: /* * TODO: this really should be folded into v4l2_queryctrl (this * currently returns -EINVAL for NULL control handlers). * However, v4l2_queryctrl() is still called directly by * drivers as well and until that has been addressed I believe * it is safer to do the check here. The same is true for the * other control ioctls below. */ if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_queryctrl(vfh->ctrl_handler, arg); case VIDIOC_QUERY_EXT_CTRL: if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_query_ext_ctrl(vfh->ctrl_handler, arg); case VIDIOC_QUERYMENU: if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_querymenu(vfh->ctrl_handler, arg); case VIDIOC_G_CTRL: if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_g_ctrl(vfh->ctrl_handler, arg); case VIDIOC_S_CTRL: if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_s_ctrl(vfh, vfh->ctrl_handler, arg); case VIDIOC_G_EXT_CTRLS: if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_g_ext_ctrls(vfh->ctrl_handler, vdev, sd->v4l2_dev->mdev, arg); case VIDIOC_S_EXT_CTRLS: if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_s_ext_ctrls(vfh, vfh->ctrl_handler, vdev, sd->v4l2_dev->mdev, arg); case VIDIOC_TRY_EXT_CTRLS: if (!vfh->ctrl_handler) return -ENOTTY; return v4l2_try_ext_ctrls(vfh->ctrl_handler, vdev, sd->v4l2_dev->mdev, arg); case VIDIOC_DQEVENT: if (!(sd->flags & V4L2_SUBDEV_FL_HAS_EVENTS)) return -ENOIOCTLCMD; return v4l2_event_dequeue(vfh, arg, file->f_flags & O_NONBLOCK); case VIDIOC_SUBSCRIBE_EVENT: return v4l2_subdev_call(sd, core, subscribe_event, vfh, arg); case VIDIOC_UNSUBSCRIBE_EVENT: return v4l2_subdev_call(sd, core, unsubscribe_event, vfh, arg); #ifdef CONFIG_VIDEO_ADV_DEBUG case VIDIOC_DBG_G_REGISTER: { struct v4l2_dbg_register *p = arg; if (!capable(CAP_SYS_ADMIN)) return -EPERM; return v4l2_subdev_call(sd, core, g_register, p); } case VIDIOC_DBG_S_REGISTER: { struct v4l2_dbg_register *p = arg; if (!capable(CAP_SYS_ADMIN)) return -EPERM; return v4l2_subdev_call(sd, core, s_register, p); } case VIDIOC_DBG_G_CHIP_INFO: { struct v4l2_dbg_chip_info *p = arg; if (p->match.type != V4L2_CHIP_MATCH_SUBDEV || p->match.addr) return -EINVAL; if (sd->ops->core && sd->ops->core->s_register) p->flags |= V4L2_CHIP_FL_WRITABLE; if (sd->ops->core && sd->ops->core->g_register) p->flags |= V4L2_CHIP_FL_READABLE; strscpy(p->name, sd->name, sizeof(p->name)); return 0; } #endif case VIDIOC_LOG_STATUS: { int ret; pr_info("%s: ================= START STATUS =================\n", sd->name); ret = v4l2_subdev_call(sd, core, log_status); pr_info("%s: ================== END STATUS ==================\n", sd->name); return ret; } case VIDIOC_SUBDEV_G_FMT: { struct v4l2_subdev_format *format = arg; if (!client_supports_streams) format->stream = 0; memset(format->reserved, 0, sizeof(format->reserved)); memset(format->format.reserved, 0, sizeof(format->format.reserved)); return v4l2_subdev_call(sd, pad, get_fmt, state, format); } case VIDIOC_SUBDEV_S_FMT: { struct v4l2_subdev_format *format = arg; if (format->which != V4L2_SUBDEV_FORMAT_TRY && ro_subdev) return -EPERM; if (!client_supports_streams) format->stream = 0; memset(format->reserved, 0, sizeof(format->reserved)); memset(format->format.reserved, 0, sizeof(format->format.reserved)); return v4l2_subdev_call(sd, pad, set_fmt, state, format); } case VIDIOC_SUBDEV_G_CROP: { struct v4l2_subdev_crop *crop = arg; struct v4l2_subdev_selection sel; if (!client_supports_streams) crop->stream = 0; memset(crop->reserved, 0, sizeof(crop->reserved)); memset(&sel, 0, sizeof(sel)); sel.which = crop->which; sel.pad = crop->pad; sel.target = V4L2_SEL_TGT_CROP; rval = v4l2_subdev_call( sd, pad, get_selection, state, &sel); crop->rect = sel.r; return rval; } case VIDIOC_SUBDEV_S_CROP: { struct v4l2_subdev_crop *crop = arg; struct v4l2_subdev_selection sel; if (crop->which != V4L2_SUBDEV_FORMAT_TRY && ro_subdev) return -EPERM; if (!client_supports_streams) crop->stream = 0; memset(crop->reserved, 0, sizeof(crop->reserved)); memset(&sel, 0, sizeof(sel)); sel.which = crop->which; sel.pad = crop->pad; sel.target = V4L2_SEL_TGT_CROP; sel.r = crop->rect; rval = v4l2_subdev_call( sd, pad, set_selection, state, &sel); crop->rect = sel.r; return rval; } case VIDIOC_SUBDEV_ENUM_MBUS_CODE: { struct v4l2_subdev_mbus_code_enum *code = arg; if (!client_supports_streams) code->stream = 0; memset(code->reserved, 0, sizeof(code->reserved)); return v4l2_subdev_call(sd, pad, enum_mbus_code, state, code); } case VIDIOC_SUBDEV_ENUM_FRAME_SIZE: { struct v4l2_subdev_frame_size_enum *fse = arg; if (!client_supports_streams) fse->stream = 0; memset(fse->reserved, 0, sizeof(fse->reserved)); return v4l2_subdev_call(sd, pad, enum_frame_size, state, fse); } case VIDIOC_SUBDEV_G_FRAME_INTERVAL: { struct v4l2_subdev_frame_interval *fi = arg; if (!client_supports_streams) fi->stream = 0; memset(fi->reserved, 0, sizeof(fi->reserved)); return v4l2_subdev_call(sd, pad, get_frame_interval, state, fi); } case VIDIOC_SUBDEV_S_FRAME_INTERVAL: { struct v4l2_subdev_frame_interval *fi = arg; if (!client_supports_streams) fi->stream = 0; if (fi->which != V4L2_SUBDEV_FORMAT_TRY && ro_subdev) return -EPERM; memset(fi->reserved, 0, sizeof(fi->reserved)); return v4l2_subdev_call(sd, pad, set_frame_interval, state, fi); } case VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL: { struct v4l2_subdev_frame_interval_enum *fie = arg; if (!client_supports_streams) fie->stream = 0; memset(fie->reserved, 0, sizeof(fie->reserved)); return v4l2_subdev_call(sd, pad, enum_frame_interval, state, fie); } case VIDIOC_SUBDEV_G_SELECTION: { struct v4l2_subdev_selection *sel = arg; if (!client_supports_streams) sel->stream = 0; memset(sel->reserved, 0, sizeof(sel->reserved)); return v4l2_subdev_call( sd, pad, get_selection, state, sel); } case VIDIOC_SUBDEV_S_SELECTION: { struct v4l2_subdev_selection *sel = arg; if (sel->which != V4L2_SUBDEV_FORMAT_TRY && ro_subdev) return -EPERM; if (!client_supports_streams) sel->stream = 0; memset(sel->reserved, 0, sizeof(sel->reserved)); return v4l2_subdev_call( sd, pad, set_selection, state, sel); } case VIDIOC_G_EDID: { struct v4l2_subdev_edid *edid = arg; return v4l2_subdev_call(sd, pad, get_edid, edid); } case VIDIOC_S_EDID: { struct v4l2_subdev_edid *edid = arg; return v4l2_subdev_call(sd, pad, set_edid, edid); } case VIDIOC_SUBDEV_DV_TIMINGS_CAP: { struct v4l2_dv_timings_cap *cap = arg; return v4l2_subdev_call(sd, pad, dv_timings_cap, cap); } case VIDIOC_SUBDEV_ENUM_DV_TIMINGS: { struct v4l2_enum_dv_timings *dvt = arg; return v4l2_subdev_call(sd, pad, enum_dv_timings, dvt); } case VIDIOC_SUBDEV_QUERY_DV_TIMINGS: return v4l2_subdev_call(sd, video, query_dv_timings, arg); case VIDIOC_SUBDEV_G_DV_TIMINGS: return v4l2_subdev_call(sd, video, g_dv_timings, arg); case VIDIOC_SUBDEV_S_DV_TIMINGS: if (ro_subdev) return -EPERM; return v4l2_subdev_call(sd, video, s_dv_timings, arg); case VIDIOC_SUBDEV_G_STD: return v4l2_subdev_call(sd, video, g_std, arg); case VIDIOC_SUBDEV_S_STD: { v4l2_std_id *std = arg; if (ro_subdev) return -EPERM; return v4l2_subdev_call(sd, video, s_std, *std); } case VIDIOC_SUBDEV_ENUMSTD: { struct v4l2_standard *p = arg; v4l2_std_id id; if (v4l2_subdev_call(sd, video, g_tvnorms, &id)) return -EINVAL; return v4l_video_std_enumstd(p, id); } case VIDIOC_SUBDEV_QUERYSTD: return v4l2_subdev_call(sd, video, querystd, arg); case VIDIOC_SUBDEV_G_ROUTING: { struct v4l2_subdev_routing *routing = arg; struct v4l2_subdev_krouting *krouting; if (!v4l2_subdev_enable_streams_api) return -ENOIOCTLCMD; if (!(sd->flags & V4L2_SUBDEV_FL_STREAMS)) return -ENOIOCTLCMD; memset(routing->reserved, 0, sizeof(routing->reserved)); krouting = &state->routing; if (routing->num_routes < krouting->num_routes) { routing->num_routes = krouting->num_routes; return -ENOSPC; } memcpy((struct v4l2_subdev_route *)(uintptr_t)routing->routes, krouting->routes, krouting->num_routes * sizeof(*krouting->routes)); routing->num_routes = krouting->num_routes; return 0; } case VIDIOC_SUBDEV_S_ROUTING: { struct v4l2_subdev_routing *routing = arg; struct v4l2_subdev_route *routes = (struct v4l2_subdev_route *)(uintptr_t)routing->routes; struct v4l2_subdev_krouting krouting = {}; unsigned int i; if (!v4l2_subdev_enable_streams_api) return -ENOIOCTLCMD; if (!(sd->flags & V4L2_SUBDEV_FL_STREAMS)) return -ENOIOCTLCMD; if (routing->which != V4L2_SUBDEV_FORMAT_TRY && ro_subdev) return -EPERM; memset(routing->reserved, 0, sizeof(routing->reserved)); for (i = 0; i < routing->num_routes; ++i) { const struct v4l2_subdev_route *route = &routes[i]; const struct media_pad *pads = sd->entity.pads; if (route->sink_stream > V4L2_SUBDEV_MAX_STREAM_ID || route->source_stream > V4L2_SUBDEV_MAX_STREAM_ID) return -EINVAL; if (route->sink_pad >= sd->entity.num_pads) return -EINVAL; if (!(pads[route->sink_pad].flags & MEDIA_PAD_FL_SINK)) return -EINVAL; if (route->source_pad >= sd->entity.num_pads) return -EINVAL; if (!(pads[route->source_pad].flags & MEDIA_PAD_FL_SOURCE)) return -EINVAL; } krouting.num_routes = routing->num_routes; krouting.routes = routes; return v4l2_subdev_call(sd, pad, set_routing, state, routing->which, &krouting); } case VIDIOC_SUBDEV_G_CLIENT_CAP: { struct v4l2_subdev_client_capability *client_cap = arg; client_cap->capabilities = subdev_fh->client_caps; return 0; } case VIDIOC_SUBDEV_S_CLIENT_CAP: { struct v4l2_subdev_client_capability *client_cap = arg; /* * Clear V4L2_SUBDEV_CLIENT_CAP_STREAMS if streams API is not * enabled. Remove this when streams API is no longer * experimental. */ if (!v4l2_subdev_enable_streams_api) client_cap->capabilities &= ~V4L2_SUBDEV_CLIENT_CAP_STREAMS; /* Filter out unsupported capabilities */ client_cap->capabilities &= (V4L2_SUBDEV_CLIENT_CAP_STREAMS | V4L2_SUBDEV_CLIENT_CAP_INTERVAL_USES_WHICH); subdev_fh->client_caps = client_cap->capabilities; return 0; } default: return v4l2_subdev_call(sd, core, ioctl, cmd, arg); } return 0; } static long subdev_do_ioctl_lock(struct file *file, unsigned int cmd, void *arg) { struct video_device *vdev = video_devdata(file); struct mutex *lock = vdev->lock; long ret = -ENODEV; if (lock && mutex_lock_interruptible(lock)) return -ERESTARTSYS; if (video_is_registered(vdev)) { struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev); struct v4l2_fh *vfh = file->private_data; struct v4l2_subdev_fh *subdev_fh = to_v4l2_subdev_fh(vfh); struct v4l2_subdev_state *state; state = subdev_ioctl_get_state(sd, subdev_fh, cmd, arg); if (state) v4l2_subdev_lock_state(state); ret = subdev_do_ioctl(file, cmd, arg, state); if (state) v4l2_subdev_unlock_state(state); } if (lock) mutex_unlock(lock); return ret; } static long subdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { return video_usercopy(file, cmd, arg, subdev_do_ioctl_lock); } #ifdef CONFIG_COMPAT static long subdev_compat_ioctl32(struct file *file, unsigned int cmd, unsigned long arg) { struct video_device *vdev = video_devdata(file); struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev); return v4l2_subdev_call(sd, core, compat_ioctl32, cmd, arg); } #endif #else /* CONFIG_VIDEO_V4L2_SUBDEV_API */ static long subdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { return -ENODEV; } #ifdef CONFIG_COMPAT static long subdev_compat_ioctl32(struct file *file, unsigned int cmd, unsigned long arg) { return -ENODEV; } #endif #endif /* CONFIG_VIDEO_V4L2_SUBDEV_API */ static __poll_t subdev_poll(struct file *file, poll_table *wait) { struct video_device *vdev = video_devdata(file); struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev); struct v4l2_fh *fh = file->private_data; if (!(sd->flags & V4L2_SUBDEV_FL_HAS_EVENTS)) return EPOLLERR; poll_wait(file, &fh->wait, wait); if (v4l2_event_pending(fh)) return EPOLLPRI; return 0; } const struct v4l2_file_operations v4l2_subdev_fops = { .owner = THIS_MODULE, .open = subdev_open, .unlocked_ioctl = subdev_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl32 = subdev_compat_ioctl32, #endif .release = subdev_close, .poll = subdev_poll, }; #ifdef CONFIG_MEDIA_CONTROLLER int v4l2_subdev_get_fwnode_pad_1_to_1(struct media_entity *entity, struct fwnode_endpoint *endpoint) { struct fwnode_handle *fwnode; struct v4l2_subdev *sd; if (!is_media_entity_v4l2_subdev(entity)) return -EINVAL; sd = media_entity_to_v4l2_subdev(entity); fwnode = fwnode_graph_get_port_parent(endpoint->local_fwnode); fwnode_handle_put(fwnode); if (device_match_fwnode(sd->dev, fwnode)) return endpoint->port; return -ENXIO; } EXPORT_SYMBOL_GPL(v4l2_subdev_get_fwnode_pad_1_to_1); int v4l2_subdev_link_validate_default(struct v4l2_subdev *sd, struct media_link *link, struct v4l2_subdev_format *source_fmt, struct v4l2_subdev_format *sink_fmt) { bool pass = true; /* The width, height and code must match. */ if (source_fmt->format.width != sink_fmt->format.width) { dev_dbg(sd->entity.graph_obj.mdev->dev, "%s: width does not match (source %u, sink %u)\n", __func__, source_fmt->format.width, sink_fmt->format.width); pass = false; } if (source_fmt->format.height != sink_fmt->format.height) { dev_dbg(sd->entity.graph_obj.mdev->dev, "%s: height does not match (source %u, sink %u)\n", __func__, source_fmt->format.height, sink_fmt->format.height); pass = false; } if (source_fmt->format.code != sink_fmt->format.code) { dev_dbg(sd->entity.graph_obj.mdev->dev, "%s: media bus code does not match (source 0x%8.8x, sink 0x%8.8x)\n", __func__, source_fmt->format.code, sink_fmt->format.code); pass = false; } /* The field order must match, or the sink field order must be NONE * to support interlaced hardware connected to bridges that support * progressive formats only. */ if (source_fmt->format.field != sink_fmt->format.field && sink_fmt->format.field != V4L2_FIELD_NONE) { dev_dbg(sd->entity.graph_obj.mdev->dev, "%s: field does not match (source %u, sink %u)\n", __func__, source_fmt->format.field, sink_fmt->format.field); pass = false; } if (pass) return 0; dev_dbg(sd->entity.graph_obj.mdev->dev, "%s: link was \"%s\":%u -> \"%s\":%u\n", __func__, link->source->entity->name, link->source->index, link->sink->entity->name, link->sink->index); return -EPIPE; } EXPORT_SYMBOL_GPL(v4l2_subdev_link_validate_default); static int v4l2_subdev_link_validate_get_format(struct media_pad *pad, u32 stream, struct v4l2_subdev_format *fmt, bool states_locked) { struct v4l2_subdev_state *state; struct v4l2_subdev *sd; int ret; if (!is_media_entity_v4l2_subdev(pad->entity)) { WARN(pad->entity->function != MEDIA_ENT_F_IO_V4L, "Driver bug! Wrong media entity type 0x%08x, entity %s\n", pad->entity->function, pad->entity->name); return -EINVAL; } sd = media_entity_to_v4l2_subdev(pad->entity); fmt->which = V4L2_SUBDEV_FORMAT_ACTIVE; fmt->pad = pad->index; fmt->stream = stream; if (states_locked) state = v4l2_subdev_get_locked_active_state(sd); else state = v4l2_subdev_lock_and_get_active_state(sd); ret = v4l2_subdev_call(sd, pad, get_fmt, state, fmt); if (!states_locked && state) v4l2_subdev_unlock_state(state); return ret; } #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API) static void __v4l2_link_validate_get_streams(struct media_pad *pad, u64 *streams_mask, bool states_locked) { struct v4l2_subdev_route *route; struct v4l2_subdev_state *state; struct v4l2_subdev *subdev; subdev = media_entity_to_v4l2_subdev(pad->entity); *streams_mask = 0; if (states_locked) state = v4l2_subdev_get_locked_active_state(subdev); else state = v4l2_subdev_lock_and_get_active_state(subdev); if (WARN_ON(!state)) return; for_each_active_route(&state->routing, route) { u32 route_pad; u32 route_stream; if (pad->flags & MEDIA_PAD_FL_SOURCE) { route_pad = route->source_pad; route_stream = route->source_stream; } else { route_pad = route->sink_pad; route_stream = route->sink_stream; } if (route_pad != pad->index) continue; *streams_mask |= BIT_ULL(route_stream); } if (!states_locked) v4l2_subdev_unlock_state(state); } #endif /* CONFIG_VIDEO_V4L2_SUBDEV_API */ static void v4l2_link_validate_get_streams(struct media_pad *pad, u64 *streams_mask, bool states_locked) { struct v4l2_subdev *subdev = media_entity_to_v4l2_subdev(pad->entity); if (!(subdev->flags & V4L2_SUBDEV_FL_STREAMS)) { /* Non-streams subdevs have an implicit stream 0 */ *streams_mask = BIT_ULL(0); return; } #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API) __v4l2_link_validate_get_streams(pad, streams_mask, states_locked); #else /* This shouldn't happen */ *streams_mask = 0; #endif } static int v4l2_subdev_link_validate_locked(struct media_link *link, bool states_locked) { struct v4l2_subdev *sink_subdev = media_entity_to_v4l2_subdev(link->sink->entity); struct device *dev = sink_subdev->entity.graph_obj.mdev->dev; u64 source_streams_mask; u64 sink_streams_mask; u64 dangling_sink_streams; u32 stream; int ret; dev_dbg(dev, "validating link \"%s\":%u -> \"%s\":%u\n", link->source->entity->name, link->source->index, link->sink->entity->name, link->sink->index); v4l2_link_validate_get_streams(link->source, &source_streams_mask, states_locked); v4l2_link_validate_get_streams(link->sink, &sink_streams_mask, states_locked); /* * It is ok to have more source streams than sink streams as extra * source streams can just be ignored by the receiver, but having extra * sink streams is an error as streams must have a source. */ dangling_sink_streams = (source_streams_mask ^ sink_streams_mask) & sink_streams_mask; if (dangling_sink_streams) { dev_err(dev, "Dangling sink streams: mask %#llx\n", dangling_sink_streams); return -EINVAL; } /* Validate source and sink stream formats */ for (stream = 0; stream < sizeof(sink_streams_mask) * 8; ++stream) { struct v4l2_subdev_format sink_fmt, source_fmt; if (!(sink_streams_mask & BIT_ULL(stream))) continue; dev_dbg(dev, "validating stream \"%s\":%u:%u -> \"%s\":%u:%u\n", link->source->entity->name, link->source->index, stream, link->sink->entity->name, link->sink->index, stream); ret = v4l2_subdev_link_validate_get_format(link->source, stream, &source_fmt, states_locked); if (ret < 0) { dev_dbg(dev, "Failed to get format for \"%s\":%u:%u (but that's ok)\n", link->source->entity->name, link->source->index, stream); continue; } ret = v4l2_subdev_link_validate_get_format(link->sink, stream, &sink_fmt, states_locked); if (ret < 0) { dev_dbg(dev, "Failed to get format for \"%s\":%u:%u (but that's ok)\n", link->sink->entity->name, link->sink->index, stream); continue; } /* TODO: add stream number to link_validate() */ ret = v4l2_subdev_call(sink_subdev, pad, link_validate, link, &source_fmt, &sink_fmt); if (!ret) continue; if (ret != -ENOIOCTLCMD) return ret; ret = v4l2_subdev_link_validate_default(sink_subdev, link, &source_fmt, &sink_fmt); if (ret) return ret; } return 0; } int v4l2_subdev_link_validate(struct media_link *link) { struct v4l2_subdev *source_sd, *sink_sd; struct v4l2_subdev_state *source_state, *sink_state; bool states_locked; int ret; if (!is_media_entity_v4l2_subdev(link->sink->entity) || !is_media_entity_v4l2_subdev(link->source->entity)) { pr_warn_once("%s of link '%s':%u->'%s':%u is not a V4L2 sub-device, driver bug!\n", !is_media_entity_v4l2_subdev(link->sink->entity) ? "sink" : "source", link->source->entity->name, link->source->index, link->sink->entity->name, link->sink->index); return 0; } sink_sd = media_entity_to_v4l2_subdev(link->sink->entity); source_sd = media_entity_to_v4l2_subdev(link->source->entity); sink_state = v4l2_subdev_get_unlocked_active_state(sink_sd); source_state = v4l2_subdev_get_unlocked_active_state(source_sd); states_locked = sink_state && source_state; if (states_locked) { v4l2_subdev_lock_state(sink_state); v4l2_subdev_lock_state(source_state); } ret = v4l2_subdev_link_validate_locked(link, states_locked); if (states_locked) { v4l2_subdev_unlock_state(sink_state); v4l2_subdev_unlock_state(source_state); } return ret; } EXPORT_SYMBOL_GPL(v4l2_subdev_link_validate); bool v4l2_subdev_has_pad_interdep(struct media_entity *entity, unsigned int pad0, unsigned int pad1) { struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity); struct v4l2_subdev_krouting *routing; struct v4l2_subdev_state *state; unsigned int i; state = v4l2_subdev_lock_and_get_active_state(sd); routing = &state->routing; for (i = 0; i < routing->num_routes; ++i) { struct v4l2_subdev_route *route = &routing->routes[i]; if (!(route->flags & V4L2_SUBDEV_ROUTE_FL_ACTIVE)) continue; if ((route->sink_pad == pad0 && route->source_pad == pad1) || (route->source_pad == pad0 && route->sink_pad == pad1)) { v4l2_subdev_unlock_state(state); return true; } } v4l2_subdev_unlock_state(state); return false; } EXPORT_SYMBOL_GPL(v4l2_subdev_has_pad_interdep); struct v4l2_subdev_state * __v4l2_subdev_state_alloc(struct v4l2_subdev *sd, const char *lock_name, struct lock_class_key *lock_key) { struct v4l2_subdev_state *state; int ret; state = kzalloc(sizeof(*state), GFP_KERNEL); if (!state) return ERR_PTR(-ENOMEM); __mutex_init(&state->_lock, lock_name, lock_key); if (sd->state_lock) state->lock = sd->state_lock; else state->lock = &state->_lock; state->sd = sd; /* Drivers that support streams do not need the legacy pad config */ if (!(sd->flags & V4L2_SUBDEV_FL_STREAMS) && sd->entity.num_pads) { state->pads = kvcalloc(sd->entity.num_pads, sizeof(*state->pads), GFP_KERNEL); if (!state->pads) { ret = -ENOMEM; goto err; } } if (sd->internal_ops && sd->internal_ops->init_state) { /* * There can be no race at this point, but we lock the state * anyway to satisfy lockdep checks. */ v4l2_subdev_lock_state(state); ret = sd->internal_ops->init_state(sd, state); v4l2_subdev_unlock_state(state); if (ret) goto err; } return state; err: if (state && state->pads) kvfree(state->pads); kfree(state); return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(__v4l2_subdev_state_alloc); void __v4l2_subdev_state_free(struct v4l2_subdev_state *state) { if (!state) return; mutex_destroy(&state->_lock); kfree(state->routing.routes); kvfree(state->stream_configs.configs); kvfree(state->pads); kfree(state); } EXPORT_SYMBOL_GPL(__v4l2_subdev_state_free); int __v4l2_subdev_init_finalize(struct v4l2_subdev *sd, const char *name, struct lock_class_key *key) { struct v4l2_subdev_state *state; state = __v4l2_subdev_state_alloc(sd, name, key); if (IS_ERR(state)) return PTR_ERR(state); sd->active_state = state; return 0; } EXPORT_SYMBOL_GPL(__v4l2_subdev_init_finalize); void v4l2_subdev_cleanup(struct v4l2_subdev *sd) { struct v4l2_async_subdev_endpoint *ase, *ase_tmp; __v4l2_subdev_state_free(sd->active_state); sd->active_state = NULL; /* Uninitialised sub-device, bail out here. */ if (!sd->async_subdev_endpoint_list.next) return; list_for_each_entry_safe(ase, ase_tmp, &sd->async_subdev_endpoint_list, async_subdev_endpoint_entry) { list_del(&ase->async_subdev_endpoint_entry); kfree(ase); } } EXPORT_SYMBOL_GPL(v4l2_subdev_cleanup); struct v4l2_mbus_framefmt * __v4l2_subdev_state_get_format(struct v4l2_subdev_state *state, unsigned int pad, u32 stream) { struct v4l2_subdev_stream_configs *stream_configs; unsigned int i; if (WARN_ON_ONCE(!state)) return NULL; if (state->pads) { if (stream) return NULL; if (pad >= state->sd->entity.num_pads) return NULL; return &state->pads[pad].format; } lockdep_assert_held(state->lock); stream_configs = &state->stream_configs; for (i = 0; i < stream_configs->num_configs; ++i) { if (stream_configs->configs[i].pad == pad && stream_configs->configs[i].stream == stream) return &stream_configs->configs[i].fmt; } return NULL; } EXPORT_SYMBOL_GPL(__v4l2_subdev_state_get_format); struct v4l2_rect * __v4l2_subdev_state_get_crop(struct v4l2_subdev_state *state, unsigned int pad, u32 stream) { struct v4l2_subdev_stream_configs *stream_configs; unsigned int i; if (WARN_ON_ONCE(!state)) return NULL; if (state->pads) { if (stream) return NULL; if (pad >= state->sd->entity.num_pads) return NULL; return &state->pads[pad].crop; } lockdep_assert_held(state->lock); stream_configs = &state->stream_configs; for (i = 0; i < stream_configs->num_configs; ++i) { if (stream_configs->configs[i].pad == pad && stream_configs->configs[i].stream == stream) return &stream_configs->configs[i].crop; } return NULL; } EXPORT_SYMBOL_GPL(__v4l2_subdev_state_get_crop); struct v4l2_rect * __v4l2_subdev_state_get_compose(struct v4l2_subdev_state *state, unsigned int pad, u32 stream) { struct v4l2_subdev_stream_configs *stream_configs; unsigned int i; if (WARN_ON_ONCE(!state)) return NULL; if (state->pads) { if (stream) return NULL; if (pad >= state->sd->entity.num_pads) return NULL; return &state->pads[pad].compose; } lockdep_assert_held(state->lock); stream_configs = &state->stream_configs; for (i = 0; i < stream_configs->num_configs; ++i) { if (stream_configs->configs[i].pad == pad && stream_configs->configs[i].stream == stream) return &stream_configs->configs[i].compose; } return NULL; } EXPORT_SYMBOL_GPL(__v4l2_subdev_state_get_compose); struct v4l2_fract * __v4l2_subdev_state_get_interval(struct v4l2_subdev_state *state, unsigned int pad, u32 stream) { struct v4l2_subdev_stream_configs *stream_configs; unsigned int i; if (WARN_ON(!state)) return NULL; lockdep_assert_held(state->lock); if (state->pads) { if (stream) return NULL; if (pad >= state->sd->entity.num_pads) return NULL; return &state->pads[pad].interval; } lockdep_assert_held(state->lock); stream_configs = &state->stream_configs; for (i = 0; i < stream_configs->num_configs; ++i) { if (stream_configs->configs[i].pad == pad && stream_configs->configs[i].stream == stream) return &stream_configs->configs[i].interval; } return NULL; } EXPORT_SYMBOL_GPL(__v4l2_subdev_state_get_interval); #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API) static int v4l2_subdev_init_stream_configs(struct v4l2_subdev_stream_configs *stream_configs, const struct v4l2_subdev_krouting *routing) { struct v4l2_subdev_stream_configs new_configs = { 0 }; struct v4l2_subdev_route *route; u32 idx; /* Count number of formats needed */ for_each_active_route(routing, route) { /* * Each route needs a format on both ends of the route. */ new_configs.num_configs += 2; } if (new_configs.num_configs) { new_configs.configs = kvcalloc(new_configs.num_configs, sizeof(*new_configs.configs), GFP_KERNEL); if (!new_configs.configs) return -ENOMEM; } /* * Fill in the 'pad' and stream' value for each item in the array from * the routing table */ idx = 0; for_each_active_route(routing, route) { new_configs.configs[idx].pad = route->sink_pad; new_configs.configs[idx].stream = route->sink_stream; idx++; new_configs.configs[idx].pad = route->source_pad; new_configs.configs[idx].stream = route->source_stream; idx++; } kvfree(stream_configs->configs); *stream_configs = new_configs; return 0; } int v4l2_subdev_get_fmt(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_format *format) { struct v4l2_mbus_framefmt *fmt; fmt = v4l2_subdev_state_get_format(state, format->pad, format->stream); if (!fmt) return -EINVAL; format->format = *fmt; return 0; } EXPORT_SYMBOL_GPL(v4l2_subdev_get_fmt); int v4l2_subdev_get_frame_interval(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, struct v4l2_subdev_frame_interval *fi) { struct v4l2_fract *interval; interval = v4l2_subdev_state_get_interval(state, fi->pad, fi->stream); if (!interval) return -EINVAL; fi->interval = *interval; return 0; } EXPORT_SYMBOL_GPL(v4l2_subdev_get_frame_interval); int v4l2_subdev_set_routing(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, const struct v4l2_subdev_krouting *routing) { struct v4l2_subdev_krouting *dst = &state->routing; const struct v4l2_subdev_krouting *src = routing; struct v4l2_subdev_krouting new_routing = { 0 }; size_t bytes; int r; if (unlikely(check_mul_overflow((size_t)src->num_routes, sizeof(*src->routes), &bytes))) return -EOVERFLOW; lockdep_assert_held(state->lock); if (src->num_routes > 0) { new_routing.routes = kmemdup(src->routes, bytes, GFP_KERNEL); if (!new_routing.routes) return -ENOMEM; } new_routing.num_routes = src->num_routes; r = v4l2_subdev_init_stream_configs(&state->stream_configs, &new_routing); if (r) { kfree(new_routing.routes); return r; } kfree(dst->routes); *dst = new_routing; return 0; } EXPORT_SYMBOL_GPL(v4l2_subdev_set_routing); struct v4l2_subdev_route * __v4l2_subdev_next_active_route(const struct v4l2_subdev_krouting *routing, struct v4l2_subdev_route *route) { if (route) ++route; else route = &routing->routes[0]; for (; route < routing->routes + routing->num_routes; ++route) { if (!(route->flags & V4L2_SUBDEV_ROUTE_FL_ACTIVE)) continue; return route; } return NULL; } EXPORT_SYMBOL_GPL(__v4l2_subdev_next_active_route); int v4l2_subdev_set_routing_with_fmt(struct v4l2_subdev *sd, struct v4l2_subdev_state *state, const struct v4l2_subdev_krouting *routing, const struct v4l2_mbus_framefmt *fmt) { struct v4l2_subdev_stream_configs *stream_configs; unsigned int i; int ret; ret = v4l2_subdev_set_routing(sd, state, routing); if (ret) return ret; stream_configs = &state->stream_configs; for (i = 0; i < stream_configs->num_configs; ++i) stream_configs->configs[i].fmt = *fmt; return 0; } EXPORT_SYMBOL_GPL(v4l2_subdev_set_routing_with_fmt); int v4l2_subdev_routing_find_opposite_end(const struct v4l2_subdev_krouting *routing, u32 pad, u32 stream, u32 *other_pad, u32 *other_stream) { unsigned int i; for (i = 0; i < routing->num_routes; ++i) { struct v4l2_subdev_route *route = &routing->routes[i]; if (route->source_pad == pad && route->source_stream == stream) { if (other_pad) *other_pad = route->sink_pad; if (other_stream) *other_stream = route->sink_stream; return 0; } if (route->sink_pad == pad && route->sink_stream == stream) { if (other_pad) *other_pad = route->source_pad; if (other_stream) *other_stream = route->source_stream; return 0; } } return -EINVAL; } EXPORT_SYMBOL_GPL(v4l2_subdev_routing_find_opposite_end); struct v4l2_mbus_framefmt * v4l2_subdev_state_get_opposite_stream_format(struct v4l2_subdev_state *state, u32 pad, u32 stream) { u32 other_pad, other_stream; int ret; ret = v4l2_subdev_routing_find_opposite_end(&state->routing, pad, stream, &other_pad, &other_stream); if (ret) return NULL; return v4l2_subdev_state_get_format(state, other_pad, other_stream); } EXPORT_SYMBOL_GPL(v4l2_subdev_state_get_opposite_stream_format); u64 v4l2_subdev_state_xlate_streams(const struct v4l2_subdev_state *state, u32 pad0, u32 pad1, u64 *streams) { const struct v4l2_subdev_krouting *routing = &state->routing; struct v4l2_subdev_route *route; u64 streams0 = 0; u64 streams1 = 0; for_each_active_route(routing, route) { if (route->sink_pad == pad0 && route->source_pad == pad1 && (*streams & BIT_ULL(route->sink_stream))) { streams0 |= BIT_ULL(route->sink_stream); streams1 |= BIT_ULL(route->source_stream); } if (route->source_pad == pad0 && route->sink_pad == pad1 && (*streams & BIT_ULL(route->source_stream))) { streams0 |= BIT_ULL(route->source_stream); streams1 |= BIT_ULL(route->sink_stream); } } *streams = streams0; return streams1; } EXPORT_SYMBOL_GPL(v4l2_subdev_state_xlate_streams); int v4l2_subdev_routing_validate(struct v4l2_subdev *sd, const struct v4l2_subdev_krouting *routing, enum v4l2_subdev_routing_restriction disallow) { u32 *remote_pads = NULL; unsigned int i, j; int ret = -EINVAL; if (disallow & (V4L2_SUBDEV_ROUTING_NO_STREAM_MIX | V4L2_SUBDEV_ROUTING_NO_MULTIPLEXING)) { remote_pads = kcalloc(sd->entity.num_pads, sizeof(*remote_pads), GFP_KERNEL); if (!remote_pads) return -ENOMEM; for (i = 0; i < sd->entity.num_pads; ++i) remote_pads[i] = U32_MAX; } for (i = 0; i < routing->num_routes; ++i) { const struct v4l2_subdev_route *route = &routing->routes[i]; /* Validate the sink and source pad numbers. */ if (route->sink_pad >= sd->entity.num_pads || !(sd->entity.pads[route->sink_pad].flags & MEDIA_PAD_FL_SINK)) { dev_dbg(sd->dev, "route %u sink (%u) is not a sink pad\n", i, route->sink_pad); goto out; } if (route->source_pad >= sd->entity.num_pads || !(sd->entity.pads[route->source_pad].flags & MEDIA_PAD_FL_SOURCE)) { dev_dbg(sd->dev, "route %u source (%u) is not a source pad\n", i, route->source_pad); goto out; } /* * V4L2_SUBDEV_ROUTING_NO_SINK_STREAM_MIX: all streams from a * sink pad must be routed to a single source pad. */ if (disallow & V4L2_SUBDEV_ROUTING_NO_SINK_STREAM_MIX) { if (remote_pads[route->sink_pad] != U32_MAX && remote_pads[route->sink_pad] != route->source_pad) { dev_dbg(sd->dev, "route %u attempts to mix %s streams\n", i, "sink"); goto out; } } /* * V4L2_SUBDEV_ROUTING_NO_SOURCE_STREAM_MIX: all streams on a * source pad must originate from a single sink pad. */ if (disallow & V4L2_SUBDEV_ROUTING_NO_SOURCE_STREAM_MIX) { if (remote_pads[route->source_pad] != U32_MAX && remote_pads[route->source_pad] != route->sink_pad) { dev_dbg(sd->dev, "route %u attempts to mix %s streams\n", i, "source"); goto out; } } /* * V4L2_SUBDEV_ROUTING_NO_SINK_MULTIPLEXING: Pads on the sink * side can not do stream multiplexing, i.e. there can be only * a single stream in a sink pad. */ if (disallow & V4L2_SUBDEV_ROUTING_NO_SINK_MULTIPLEXING) { if (remote_pads[route->sink_pad] != U32_MAX) { dev_dbg(sd->dev, "route %u attempts to multiplex on %s pad %u\n", i, "sink", route->sink_pad); goto out; } } /* * V4L2_SUBDEV_ROUTING_NO_SOURCE_MULTIPLEXING: Pads on the * source side can not do stream multiplexing, i.e. there can * be only a single stream in a source pad. */ if (disallow & V4L2_SUBDEV_ROUTING_NO_SOURCE_MULTIPLEXING) { if (remote_pads[route->source_pad] != U32_MAX) { dev_dbg(sd->dev, "route %u attempts to multiplex on %s pad %u\n", i, "source", route->source_pad); goto out; } } if (remote_pads) { remote_pads[route->sink_pad] = route->source_pad; remote_pads[route->source_pad] = route->sink_pad; } for (j = i + 1; j < routing->num_routes; ++j) { const struct v4l2_subdev_route *r = &routing->routes[j]; /* * V4L2_SUBDEV_ROUTING_NO_1_TO_N: No two routes can * originate from the same (sink) stream. */ if ((disallow & V4L2_SUBDEV_ROUTING_NO_1_TO_N) && route->sink_pad == r->sink_pad && route->sink_stream == r->sink_stream) { dev_dbg(sd->dev, "routes %u and %u originate from same sink (%u/%u)\n", i, j, route->sink_pad, route->sink_stream); goto out; } /* * V4L2_SUBDEV_ROUTING_NO_N_TO_1: No two routes can end * at the same (source) stream. */ if ((disallow & V4L2_SUBDEV_ROUTING_NO_N_TO_1) && route->source_pad == r->source_pad && route->source_stream == r->source_stream) { dev_dbg(sd->dev, "routes %u and %u end at same source (%u/%u)\n", i, j, route->source_pad, route->source_stream); goto out; } } } ret = 0; out: kfree(remote_pads); return ret; } EXPORT_SYMBOL_GPL(v4l2_subdev_routing_validate); static int v4l2_subdev_enable_streams_fallback(struct v4l2_subdev *sd, u32 pad, u64 streams_mask) { struct device *dev = sd->entity.graph_obj.mdev->dev; unsigned int i; int ret; /* * The subdev doesn't implement pad-based stream enable, fall back * on the .s_stream() operation. This can only be done for subdevs that * have a single source pad, as sd->enabled_streams is global to the * subdev. */ if (!(sd->entity.pads[pad].flags & MEDIA_PAD_FL_SOURCE)) return -EOPNOTSUPP; for (i = 0; i < sd->entity.num_pads; ++i) { if (i != pad && sd->entity.pads[i].flags & MEDIA_PAD_FL_SOURCE) return -EOPNOTSUPP; } if (sd->enabled_streams & streams_mask) { dev_dbg(dev, "set of streams %#llx already enabled on %s:%u\n", streams_mask, sd->entity.name, pad); return -EALREADY; } /* Start streaming when the first streams are enabled. */ if (!sd->enabled_streams) { ret = v4l2_subdev_call(sd, video, s_stream, 1); if (ret) return ret; } sd->enabled_streams |= streams_mask; return 0; } int v4l2_subdev_enable_streams(struct v4l2_subdev *sd, u32 pad, u64 streams_mask) { struct device *dev = sd->entity.graph_obj.mdev->dev; struct v4l2_subdev_state *state; u64 found_streams = 0; unsigned int i; int ret; /* A few basic sanity checks first. */ if (pad >= sd->entity.num_pads) return -EINVAL; if (!streams_mask) return 0; /* Fallback on .s_stream() if .enable_streams() isn't available. */ if (!sd->ops->pad || !sd->ops->pad->enable_streams) return v4l2_subdev_enable_streams_fallback(sd, pad, streams_mask); state = v4l2_subdev_lock_and_get_active_state(sd); /* * Verify that the requested streams exist and that they are not * already enabled. */ for (i = 0; i < state->stream_configs.num_configs; ++i) { struct v4l2_subdev_stream_config *cfg = &state->stream_configs.configs[i]; if (cfg->pad != pad || !(streams_mask & BIT_ULL(cfg->stream))) continue; found_streams |= BIT_ULL(cfg->stream); if (cfg->enabled) { dev_dbg(dev, "stream %u already enabled on %s:%u\n", cfg->stream, sd->entity.name, pad); ret = -EALREADY; goto done; } } if (found_streams != streams_mask) { dev_dbg(dev, "streams 0x%llx not found on %s:%u\n", streams_mask & ~found_streams, sd->entity.name, pad); ret = -EINVAL; goto done; } dev_dbg(dev, "enable streams %u:%#llx\n", pad, streams_mask); /* Call the .enable_streams() operation. */ ret = v4l2_subdev_call(sd, pad, enable_streams, state, pad, streams_mask); if (ret) { dev_dbg(dev, "enable streams %u:%#llx failed: %d\n", pad, streams_mask, ret); goto done; } /* Mark the streams as enabled. */ for (i = 0; i < state->stream_configs.num_configs; ++i) { struct v4l2_subdev_stream_config *cfg = &state->stream_configs.configs[i]; if (cfg->pad == pad && (streams_mask & BIT_ULL(cfg->stream))) cfg->enabled = true; } done: v4l2_subdev_unlock_state(state); return ret; } EXPORT_SYMBOL_GPL(v4l2_subdev_enable_streams); static int v4l2_subdev_disable_streams_fallback(struct v4l2_subdev *sd, u32 pad, u64 streams_mask) { struct device *dev = sd->entity.graph_obj.mdev->dev; unsigned int i; int ret; /* * If the subdev doesn't implement pad-based stream enable, fall back * on the .s_stream() operation. This can only be done for subdevs that * have a single source pad, as sd->enabled_streams is global to the * subdev. */ if (!(sd->entity.pads[pad].flags & MEDIA_PAD_FL_SOURCE)) return -EOPNOTSUPP; for (i = 0; i < sd->entity.num_pads; ++i) { if (i != pad && sd->entity.pads[i].flags & MEDIA_PAD_FL_SOURCE) return -EOPNOTSUPP; } if ((sd->enabled_streams & streams_mask) != streams_mask) { dev_dbg(dev, "set of streams %#llx already disabled on %s:%u\n", streams_mask, sd->entity.name, pad); return -EALREADY; } /* Stop streaming when the last streams are disabled. */ if (!(sd->enabled_streams & ~streams_mask)) { ret = v4l2_subdev_call(sd, video, s_stream, 0); if (ret) return ret; } sd->enabled_streams &= ~streams_mask; return 0; } int v4l2_subdev_disable_streams(struct v4l2_subdev *sd, u32 pad, u64 streams_mask) { struct device *dev = sd->entity.graph_obj.mdev->dev; struct v4l2_subdev_state *state; u64 found_streams = 0; unsigned int i; int ret; /* A few basic sanity checks first. */ if (pad >= sd->entity.num_pads) return -EINVAL; if (!streams_mask) return 0; /* Fallback on .s_stream() if .disable_streams() isn't available. */ if (!sd->ops->pad || !sd->ops->pad->disable_streams) return v4l2_subdev_disable_streams_fallback(sd, pad, streams_mask); state = v4l2_subdev_lock_and_get_active_state(sd); /* * Verify that the requested streams exist and that they are not * already disabled. */ for (i = 0; i < state->stream_configs.num_configs; ++i) { struct v4l2_subdev_stream_config *cfg = &state->stream_configs.configs[i]; if (cfg->pad != pad || !(streams_mask & BIT_ULL(cfg->stream))) continue; found_streams |= BIT_ULL(cfg->stream); if (!cfg->enabled) { dev_dbg(dev, "stream %u already disabled on %s:%u\n", cfg->stream, sd->entity.name, pad); ret = -EALREADY; goto done; } } if (found_streams != streams_mask) { dev_dbg(dev, "streams 0x%llx not found on %s:%u\n", streams_mask & ~found_streams, sd->entity.name, pad); ret = -EINVAL; goto done; } dev_dbg(dev, "disable streams %u:%#llx\n", pad, streams_mask); /* Call the .disable_streams() operation. */ ret = v4l2_subdev_call(sd, pad, disable_streams, state, pad, streams_mask); if (ret) { dev_dbg(dev, "disable streams %u:%#llx failed: %d\n", pad, streams_mask, ret); goto done; } /* Mark the streams as disabled. */ for (i = 0; i < state->stream_configs.num_configs; ++i) { struct v4l2_subdev_stream_config *cfg = &state->stream_configs.configs[i]; if (cfg->pad == pad && (streams_mask & BIT_ULL(cfg->stream))) cfg->enabled = false; } done: v4l2_subdev_unlock_state(state); return ret; } EXPORT_SYMBOL_GPL(v4l2_subdev_disable_streams); int v4l2_subdev_s_stream_helper(struct v4l2_subdev *sd, int enable) { struct v4l2_subdev_state *state; struct v4l2_subdev_route *route; struct media_pad *pad; u64 source_mask = 0; int pad_index = -1; /* * Find the source pad. This helper is meant for subdevs that have a * single source pad, so failures shouldn't happen, but catch them * loudly nonetheless as they indicate a driver bug. */ media_entity_for_each_pad(&sd->entity, pad) { if (pad->flags & MEDIA_PAD_FL_SOURCE) { pad_index = pad->index; break; } } if (WARN_ON(pad_index == -1)) return -EINVAL; /* * As there's a single source pad, just collect all the source streams. */ state = v4l2_subdev_lock_and_get_active_state(sd); for_each_active_route(&state->routing, route) source_mask |= BIT_ULL(route->source_stream); v4l2_subdev_unlock_state(state); if (enable) return v4l2_subdev_enable_streams(sd, pad_index, source_mask); else return v4l2_subdev_disable_streams(sd, pad_index, source_mask); } EXPORT_SYMBOL_GPL(v4l2_subdev_s_stream_helper); #endif /* CONFIG_VIDEO_V4L2_SUBDEV_API */ #endif /* CONFIG_MEDIA_CONTROLLER */ void v4l2_subdev_init(struct v4l2_subdev *sd, const struct v4l2_subdev_ops *ops) { INIT_LIST_HEAD(&sd->list); BUG_ON(!ops); sd->ops = ops; sd->v4l2_dev = NULL; sd->flags = 0; sd->name[0] = '\0'; sd->grp_id = 0; sd->dev_priv = NULL; sd->host_priv = NULL; sd->privacy_led = NULL; INIT_LIST_HEAD(&sd->async_subdev_endpoint_list); #if defined(CONFIG_MEDIA_CONTROLLER) sd->entity.name = sd->name; sd->entity.obj_type = MEDIA_ENTITY_TYPE_V4L2_SUBDEV; sd->entity.function = MEDIA_ENT_F_V4L2_SUBDEV_UNKNOWN; #endif } EXPORT_SYMBOL(v4l2_subdev_init); void v4l2_subdev_notify_event(struct v4l2_subdev *sd, const struct v4l2_event *ev) { v4l2_event_queue(sd->devnode, ev); v4l2_subdev_notify(sd, V4L2_DEVICE_NOTIFY_EVENT, (void *)ev); } EXPORT_SYMBOL_GPL(v4l2_subdev_notify_event); int v4l2_subdev_get_privacy_led(struct v4l2_subdev *sd) { #if IS_REACHABLE(CONFIG_LEDS_CLASS) sd->privacy_led = led_get(sd->dev, "privacy-led"); if (IS_ERR(sd->privacy_led) && PTR_ERR(sd->privacy_led) != -ENOENT) return dev_err_probe(sd->dev, PTR_ERR(sd->privacy_led), "getting privacy LED\n"); if (!IS_ERR_OR_NULL(sd->privacy_led)) { mutex_lock(&sd->privacy_led->led_access); led_sysfs_disable(sd->privacy_led); led_trigger_remove(sd->privacy_led); led_set_brightness(sd->privacy_led, 0); mutex_unlock(&sd->privacy_led->led_access); } #endif return 0; } EXPORT_SYMBOL_GPL(v4l2_subdev_get_privacy_led); void v4l2_subdev_put_privacy_led(struct v4l2_subdev *sd) { #if IS_REACHABLE(CONFIG_LEDS_CLASS) if (!IS_ERR_OR_NULL(sd->privacy_led)) { mutex_lock(&sd->privacy_led->led_access); led_sysfs_enable(sd->privacy_led); mutex_unlock(&sd->privacy_led->led_access); led_put(sd->privacy_led); } #endif } EXPORT_SYMBOL_GPL(v4l2_subdev_put_privacy_led); |
473 178 1325 1214 1213 1 41 42 16 2 57 65 112 2 2 59 58 1 693 209 29 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 | /* SPDX-License-Identifier: GPL-2.0 */ #undef TRACE_SYSTEM #define TRACE_SYSTEM writeback #if !defined(_TRACE_WRITEBACK_H) || defined(TRACE_HEADER_MULTI_READ) #define _TRACE_WRITEBACK_H #include <linux/tracepoint.h> #include <linux/backing-dev.h> #include <linux/writeback.h> #define show_inode_state(state) \ __print_flags(state, "|", \ {I_DIRTY_SYNC, "I_DIRTY_SYNC"}, \ {I_DIRTY_DATASYNC, "I_DIRTY_DATASYNC"}, \ {I_DIRTY_PAGES, "I_DIRTY_PAGES"}, \ {I_NEW, "I_NEW"}, \ {I_WILL_FREE, "I_WILL_FREE"}, \ {I_FREEING, "I_FREEING"}, \ {I_CLEAR, "I_CLEAR"}, \ {I_SYNC, "I_SYNC"}, \ {I_DIRTY_TIME, "I_DIRTY_TIME"}, \ {I_REFERENCED, "I_REFERENCED"} \ ) /* enums need to be exported to user space */ #undef EM #undef EMe #define EM(a,b) TRACE_DEFINE_ENUM(a); #define EMe(a,b) TRACE_DEFINE_ENUM(a); #define WB_WORK_REASON \ EM( WB_REASON_BACKGROUND, "background") \ EM( WB_REASON_VMSCAN, "vmscan") \ EM( WB_REASON_SYNC, "sync") \ EM( WB_REASON_PERIODIC, "periodic") \ EM( WB_REASON_LAPTOP_TIMER, "laptop_timer") \ EM( WB_REASON_FS_FREE_SPACE, "fs_free_space") \ EM( WB_REASON_FORKER_THREAD, "forker_thread") \ EMe(WB_REASON_FOREIGN_FLUSH, "foreign_flush") WB_WORK_REASON /* * Now redefine the EM() and EMe() macros to map the enums to the strings * that will be printed in the output. */ #undef EM #undef EMe #define EM(a,b) { a, b }, #define EMe(a,b) { a, b } struct wb_writeback_work; DECLARE_EVENT_CLASS(writeback_folio_template, TP_PROTO(struct folio *folio, struct address_space *mapping), TP_ARGS(folio, mapping), TP_STRUCT__entry ( __array(char, name, 32) __field(ino_t, ino) __field(pgoff_t, index) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(mapping ? inode_to_bdi(mapping->host) : NULL), 32); __entry->ino = (mapping && mapping->host) ? mapping->host->i_ino : 0; __entry->index = folio->index; ), TP_printk("bdi %s: ino=%lu index=%lu", __entry->name, (unsigned long)__entry->ino, __entry->index ) ); DEFINE_EVENT(writeback_folio_template, writeback_dirty_folio, TP_PROTO(struct folio *folio, struct address_space *mapping), TP_ARGS(folio, mapping) ); DEFINE_EVENT(writeback_folio_template, folio_wait_writeback, TP_PROTO(struct folio *folio, struct address_space *mapping), TP_ARGS(folio, mapping) ); DECLARE_EVENT_CLASS(writeback_dirty_inode_template, TP_PROTO(struct inode *inode, int flags), TP_ARGS(inode, flags), TP_STRUCT__entry ( __array(char, name, 32) __field(ino_t, ino) __field(unsigned long, state) __field(unsigned long, flags) ), TP_fast_assign( struct backing_dev_info *bdi = inode_to_bdi(inode); /* may be called for files on pseudo FSes w/ unregistered bdi */ strscpy_pad(__entry->name, bdi_dev_name(bdi), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->flags = flags; ), TP_printk("bdi %s: ino=%lu state=%s flags=%s", __entry->name, (unsigned long)__entry->ino, show_inode_state(__entry->state), show_inode_state(__entry->flags) ) ); DEFINE_EVENT(writeback_dirty_inode_template, writeback_mark_inode_dirty, TP_PROTO(struct inode *inode, int flags), TP_ARGS(inode, flags) ); DEFINE_EVENT(writeback_dirty_inode_template, writeback_dirty_inode_start, TP_PROTO(struct inode *inode, int flags), TP_ARGS(inode, flags) ); DEFINE_EVENT(writeback_dirty_inode_template, writeback_dirty_inode, TP_PROTO(struct inode *inode, int flags), TP_ARGS(inode, flags) ); #ifdef CREATE_TRACE_POINTS #ifdef CONFIG_CGROUP_WRITEBACK static inline ino_t __trace_wb_assign_cgroup(struct bdi_writeback *wb) { return cgroup_ino(wb->memcg_css->cgroup); } static inline ino_t __trace_wbc_assign_cgroup(struct writeback_control *wbc) { if (wbc->wb) return __trace_wb_assign_cgroup(wbc->wb); else return 1; } #else /* CONFIG_CGROUP_WRITEBACK */ static inline ino_t __trace_wb_assign_cgroup(struct bdi_writeback *wb) { return 1; } static inline ino_t __trace_wbc_assign_cgroup(struct writeback_control *wbc) { return 1; } #endif /* CONFIG_CGROUP_WRITEBACK */ #endif /* CREATE_TRACE_POINTS */ #ifdef CONFIG_CGROUP_WRITEBACK TRACE_EVENT(inode_foreign_history, TP_PROTO(struct inode *inode, struct writeback_control *wbc, unsigned int history), TP_ARGS(inode, wbc, history), TP_STRUCT__entry( __array(char, name, 32) __field(ino_t, ino) __field(ino_t, cgroup_ino) __field(unsigned int, history) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(inode_to_bdi(inode)), 32); __entry->ino = inode->i_ino; __entry->cgroup_ino = __trace_wbc_assign_cgroup(wbc); __entry->history = history; ), TP_printk("bdi %s: ino=%lu cgroup_ino=%lu history=0x%x", __entry->name, (unsigned long)__entry->ino, (unsigned long)__entry->cgroup_ino, __entry->history ) ); TRACE_EVENT(inode_switch_wbs, TP_PROTO(struct inode *inode, struct bdi_writeback *old_wb, struct bdi_writeback *new_wb), TP_ARGS(inode, old_wb, new_wb), TP_STRUCT__entry( __array(char, name, 32) __field(ino_t, ino) __field(ino_t, old_cgroup_ino) __field(ino_t, new_cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(old_wb->bdi), 32); __entry->ino = inode->i_ino; __entry->old_cgroup_ino = __trace_wb_assign_cgroup(old_wb); __entry->new_cgroup_ino = __trace_wb_assign_cgroup(new_wb); ), TP_printk("bdi %s: ino=%lu old_cgroup_ino=%lu new_cgroup_ino=%lu", __entry->name, (unsigned long)__entry->ino, (unsigned long)__entry->old_cgroup_ino, (unsigned long)__entry->new_cgroup_ino ) ); TRACE_EVENT(track_foreign_dirty, TP_PROTO(struct folio *folio, struct bdi_writeback *wb), TP_ARGS(folio, wb), TP_STRUCT__entry( __array(char, name, 32) __field(u64, bdi_id) __field(ino_t, ino) __field(unsigned int, memcg_id) __field(ino_t, cgroup_ino) __field(ino_t, page_cgroup_ino) ), TP_fast_assign( struct address_space *mapping = folio_mapping(folio); struct inode *inode = mapping ? mapping->host : NULL; strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __entry->bdi_id = wb->bdi->id; __entry->ino = inode ? inode->i_ino : 0; __entry->memcg_id = wb->memcg_css->id; __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); __entry->page_cgroup_ino = cgroup_ino(folio_memcg(folio)->css.cgroup); ), TP_printk("bdi %s[%llu]: ino=%lu memcg_id=%u cgroup_ino=%lu page_cgroup_ino=%lu", __entry->name, __entry->bdi_id, (unsigned long)__entry->ino, __entry->memcg_id, (unsigned long)__entry->cgroup_ino, (unsigned long)__entry->page_cgroup_ino ) ); TRACE_EVENT(flush_foreign, TP_PROTO(struct bdi_writeback *wb, unsigned int frn_bdi_id, unsigned int frn_memcg_id), TP_ARGS(wb, frn_bdi_id, frn_memcg_id), TP_STRUCT__entry( __array(char, name, 32) __field(ino_t, cgroup_ino) __field(unsigned int, frn_bdi_id) __field(unsigned int, frn_memcg_id) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); __entry->frn_bdi_id = frn_bdi_id; __entry->frn_memcg_id = frn_memcg_id; ), TP_printk("bdi %s: cgroup_ino=%lu frn_bdi_id=%u frn_memcg_id=%u", __entry->name, (unsigned long)__entry->cgroup_ino, __entry->frn_bdi_id, __entry->frn_memcg_id ) ); #endif DECLARE_EVENT_CLASS(writeback_write_inode_template, TP_PROTO(struct inode *inode, struct writeback_control *wbc), TP_ARGS(inode, wbc), TP_STRUCT__entry ( __array(char, name, 32) __field(ino_t, ino) __field(int, sync_mode) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(inode_to_bdi(inode)), 32); __entry->ino = inode->i_ino; __entry->sync_mode = wbc->sync_mode; __entry->cgroup_ino = __trace_wbc_assign_cgroup(wbc); ), TP_printk("bdi %s: ino=%lu sync_mode=%d cgroup_ino=%lu", __entry->name, (unsigned long)__entry->ino, __entry->sync_mode, (unsigned long)__entry->cgroup_ino ) ); DEFINE_EVENT(writeback_write_inode_template, writeback_write_inode_start, TP_PROTO(struct inode *inode, struct writeback_control *wbc), TP_ARGS(inode, wbc) ); DEFINE_EVENT(writeback_write_inode_template, writeback_write_inode, TP_PROTO(struct inode *inode, struct writeback_control *wbc), TP_ARGS(inode, wbc) ); DECLARE_EVENT_CLASS(writeback_work_class, TP_PROTO(struct bdi_writeback *wb, struct wb_writeback_work *work), TP_ARGS(wb, work), TP_STRUCT__entry( __array(char, name, 32) __field(long, nr_pages) __field(dev_t, sb_dev) __field(int, sync_mode) __field(int, for_kupdate) __field(int, range_cyclic) __field(int, for_background) __field(int, reason) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __entry->nr_pages = work->nr_pages; __entry->sb_dev = work->sb ? work->sb->s_dev : 0; __entry->sync_mode = work->sync_mode; __entry->for_kupdate = work->for_kupdate; __entry->range_cyclic = work->range_cyclic; __entry->for_background = work->for_background; __entry->reason = work->reason; __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); ), TP_printk("bdi %s: sb_dev %d:%d nr_pages=%ld sync_mode=%d " "kupdate=%d range_cyclic=%d background=%d reason=%s cgroup_ino=%lu", __entry->name, MAJOR(__entry->sb_dev), MINOR(__entry->sb_dev), __entry->nr_pages, __entry->sync_mode, __entry->for_kupdate, __entry->range_cyclic, __entry->for_background, __print_symbolic(__entry->reason, WB_WORK_REASON), (unsigned long)__entry->cgroup_ino ) ); #define DEFINE_WRITEBACK_WORK_EVENT(name) \ DEFINE_EVENT(writeback_work_class, name, \ TP_PROTO(struct bdi_writeback *wb, struct wb_writeback_work *work), \ TP_ARGS(wb, work)) DEFINE_WRITEBACK_WORK_EVENT(writeback_queue); DEFINE_WRITEBACK_WORK_EVENT(writeback_exec); DEFINE_WRITEBACK_WORK_EVENT(writeback_start); DEFINE_WRITEBACK_WORK_EVENT(writeback_written); DEFINE_WRITEBACK_WORK_EVENT(writeback_wait); TRACE_EVENT(writeback_pages_written, TP_PROTO(long pages_written), TP_ARGS(pages_written), TP_STRUCT__entry( __field(long, pages) ), TP_fast_assign( __entry->pages = pages_written; ), TP_printk("%ld", __entry->pages) ); DECLARE_EVENT_CLASS(writeback_class, TP_PROTO(struct bdi_writeback *wb), TP_ARGS(wb), TP_STRUCT__entry( __array(char, name, 32) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); ), TP_printk("bdi %s: cgroup_ino=%lu", __entry->name, (unsigned long)__entry->cgroup_ino ) ); #define DEFINE_WRITEBACK_EVENT(name) \ DEFINE_EVENT(writeback_class, name, \ TP_PROTO(struct bdi_writeback *wb), \ TP_ARGS(wb)) DEFINE_WRITEBACK_EVENT(writeback_wake_background); TRACE_EVENT(writeback_bdi_register, TP_PROTO(struct backing_dev_info *bdi), TP_ARGS(bdi), TP_STRUCT__entry( __array(char, name, 32) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(bdi), 32); ), TP_printk("bdi %s", __entry->name ) ); DECLARE_EVENT_CLASS(wbc_class, TP_PROTO(struct writeback_control *wbc, struct backing_dev_info *bdi), TP_ARGS(wbc, bdi), TP_STRUCT__entry( __array(char, name, 32) __field(long, nr_to_write) __field(long, pages_skipped) __field(int, sync_mode) __field(int, for_kupdate) __field(int, for_background) __field(int, for_reclaim) __field(int, range_cyclic) __field(long, range_start) __field(long, range_end) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(bdi), 32); __entry->nr_to_write = wbc->nr_to_write; __entry->pages_skipped = wbc->pages_skipped; __entry->sync_mode = wbc->sync_mode; __entry->for_kupdate = wbc->for_kupdate; __entry->for_background = wbc->for_background; __entry->for_reclaim = wbc->for_reclaim; __entry->range_cyclic = wbc->range_cyclic; __entry->range_start = (long)wbc->range_start; __entry->range_end = (long)wbc->range_end; __entry->cgroup_ino = __trace_wbc_assign_cgroup(wbc); ), TP_printk("bdi %s: towrt=%ld skip=%ld mode=%d kupd=%d " "bgrd=%d reclm=%d cyclic=%d " "start=0x%lx end=0x%lx cgroup_ino=%lu", __entry->name, __entry->nr_to_write, __entry->pages_skipped, __entry->sync_mode, __entry->for_kupdate, __entry->for_background, __entry->for_reclaim, __entry->range_cyclic, __entry->range_start, __entry->range_end, (unsigned long)__entry->cgroup_ino ) ) #define DEFINE_WBC_EVENT(name) \ DEFINE_EVENT(wbc_class, name, \ TP_PROTO(struct writeback_control *wbc, struct backing_dev_info *bdi), \ TP_ARGS(wbc, bdi)) DEFINE_WBC_EVENT(wbc_writepage); TRACE_EVENT(writeback_queue_io, TP_PROTO(struct bdi_writeback *wb, struct wb_writeback_work *work, unsigned long dirtied_before, int moved), TP_ARGS(wb, work, dirtied_before, moved), TP_STRUCT__entry( __array(char, name, 32) __field(unsigned long, older) __field(long, age) __field(int, moved) __field(int, reason) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __entry->older = dirtied_before; __entry->age = (jiffies - dirtied_before) * 1000 / HZ; __entry->moved = moved; __entry->reason = work->reason; __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); ), TP_printk("bdi %s: older=%lu age=%ld enqueue=%d reason=%s cgroup_ino=%lu", __entry->name, __entry->older, /* dirtied_before in jiffies */ __entry->age, /* dirtied_before in relative milliseconds */ __entry->moved, __print_symbolic(__entry->reason, WB_WORK_REASON), (unsigned long)__entry->cgroup_ino ) ); TRACE_EVENT(global_dirty_state, TP_PROTO(unsigned long background_thresh, unsigned long dirty_thresh ), TP_ARGS(background_thresh, dirty_thresh ), TP_STRUCT__entry( __field(unsigned long, nr_dirty) __field(unsigned long, nr_writeback) __field(unsigned long, background_thresh) __field(unsigned long, dirty_thresh) __field(unsigned long, dirty_limit) __field(unsigned long, nr_dirtied) __field(unsigned long, nr_written) ), TP_fast_assign( __entry->nr_dirty = global_node_page_state(NR_FILE_DIRTY); __entry->nr_writeback = global_node_page_state(NR_WRITEBACK); __entry->nr_dirtied = global_node_page_state(NR_DIRTIED); __entry->nr_written = global_node_page_state(NR_WRITTEN); __entry->background_thresh = background_thresh; __entry->dirty_thresh = dirty_thresh; __entry->dirty_limit = global_wb_domain.dirty_limit; ), TP_printk("dirty=%lu writeback=%lu " "bg_thresh=%lu thresh=%lu limit=%lu " "dirtied=%lu written=%lu", __entry->nr_dirty, __entry->nr_writeback, __entry->background_thresh, __entry->dirty_thresh, __entry->dirty_limit, __entry->nr_dirtied, __entry->nr_written ) ); #define KBps(x) ((x) << (PAGE_SHIFT - 10)) TRACE_EVENT(bdi_dirty_ratelimit, TP_PROTO(struct bdi_writeback *wb, unsigned long dirty_rate, unsigned long task_ratelimit), TP_ARGS(wb, dirty_rate, task_ratelimit), TP_STRUCT__entry( __array(char, bdi, 32) __field(unsigned long, write_bw) __field(unsigned long, avg_write_bw) __field(unsigned long, dirty_rate) __field(unsigned long, dirty_ratelimit) __field(unsigned long, task_ratelimit) __field(unsigned long, balanced_dirty_ratelimit) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->bdi, bdi_dev_name(wb->bdi), 32); __entry->write_bw = KBps(wb->write_bandwidth); __entry->avg_write_bw = KBps(wb->avg_write_bandwidth); __entry->dirty_rate = KBps(dirty_rate); __entry->dirty_ratelimit = KBps(wb->dirty_ratelimit); __entry->task_ratelimit = KBps(task_ratelimit); __entry->balanced_dirty_ratelimit = KBps(wb->balanced_dirty_ratelimit); __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); ), TP_printk("bdi %s: " "write_bw=%lu awrite_bw=%lu dirty_rate=%lu " "dirty_ratelimit=%lu task_ratelimit=%lu " "balanced_dirty_ratelimit=%lu cgroup_ino=%lu", __entry->bdi, __entry->write_bw, /* write bandwidth */ __entry->avg_write_bw, /* avg write bandwidth */ __entry->dirty_rate, /* bdi dirty rate */ __entry->dirty_ratelimit, /* base ratelimit */ __entry->task_ratelimit, /* ratelimit with position control */ __entry->balanced_dirty_ratelimit, /* the balanced ratelimit */ (unsigned long)__entry->cgroup_ino ) ); TRACE_EVENT(balance_dirty_pages, TP_PROTO(struct bdi_writeback *wb, unsigned long thresh, unsigned long bg_thresh, unsigned long dirty, unsigned long bdi_thresh, unsigned long bdi_dirty, unsigned long dirty_ratelimit, unsigned long task_ratelimit, unsigned long dirtied, unsigned long period, long pause, unsigned long start_time), TP_ARGS(wb, thresh, bg_thresh, dirty, bdi_thresh, bdi_dirty, dirty_ratelimit, task_ratelimit, dirtied, period, pause, start_time), TP_STRUCT__entry( __array( char, bdi, 32) __field(unsigned long, limit) __field(unsigned long, setpoint) __field(unsigned long, dirty) __field(unsigned long, bdi_setpoint) __field(unsigned long, bdi_dirty) __field(unsigned long, dirty_ratelimit) __field(unsigned long, task_ratelimit) __field(unsigned int, dirtied) __field(unsigned int, dirtied_pause) __field(unsigned long, paused) __field( long, pause) __field(unsigned long, period) __field( long, think) __field(ino_t, cgroup_ino) ), TP_fast_assign( unsigned long freerun = (thresh + bg_thresh) / 2; strscpy_pad(__entry->bdi, bdi_dev_name(wb->bdi), 32); __entry->limit = global_wb_domain.dirty_limit; __entry->setpoint = (global_wb_domain.dirty_limit + freerun) / 2; __entry->dirty = dirty; __entry->bdi_setpoint = __entry->setpoint * bdi_thresh / (thresh + 1); __entry->bdi_dirty = bdi_dirty; __entry->dirty_ratelimit = KBps(dirty_ratelimit); __entry->task_ratelimit = KBps(task_ratelimit); __entry->dirtied = dirtied; __entry->dirtied_pause = current->nr_dirtied_pause; __entry->think = current->dirty_paused_when == 0 ? 0 : (long)(jiffies - current->dirty_paused_when) * 1000/HZ; __entry->period = period * 1000 / HZ; __entry->pause = pause * 1000 / HZ; __entry->paused = (jiffies - start_time) * 1000 / HZ; __entry->cgroup_ino = __trace_wb_assign_cgroup(wb); ), TP_printk("bdi %s: " "limit=%lu setpoint=%lu dirty=%lu " "bdi_setpoint=%lu bdi_dirty=%lu " "dirty_ratelimit=%lu task_ratelimit=%lu " "dirtied=%u dirtied_pause=%u " "paused=%lu pause=%ld period=%lu think=%ld cgroup_ino=%lu", __entry->bdi, __entry->limit, __entry->setpoint, __entry->dirty, __entry->bdi_setpoint, __entry->bdi_dirty, __entry->dirty_ratelimit, __entry->task_ratelimit, __entry->dirtied, __entry->dirtied_pause, __entry->paused, /* ms */ __entry->pause, /* ms */ __entry->period, /* ms */ __entry->think, /* ms */ (unsigned long)__entry->cgroup_ino ) ); TRACE_EVENT(writeback_sb_inodes_requeue, TP_PROTO(struct inode *inode), TP_ARGS(inode), TP_STRUCT__entry( __array(char, name, 32) __field(ino_t, ino) __field(unsigned long, state) __field(unsigned long, dirtied_when) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(inode_to_bdi(inode)), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->dirtied_when = inode->dirtied_when; __entry->cgroup_ino = __trace_wb_assign_cgroup(inode_to_wb(inode)); ), TP_printk("bdi %s: ino=%lu state=%s dirtied_when=%lu age=%lu cgroup_ino=%lu", __entry->name, (unsigned long)__entry->ino, show_inode_state(__entry->state), __entry->dirtied_when, (jiffies - __entry->dirtied_when) / HZ, (unsigned long)__entry->cgroup_ino ) ); DECLARE_EVENT_CLASS(writeback_single_inode_template, TP_PROTO(struct inode *inode, struct writeback_control *wbc, unsigned long nr_to_write ), TP_ARGS(inode, wbc, nr_to_write), TP_STRUCT__entry( __array(char, name, 32) __field(ino_t, ino) __field(unsigned long, state) __field(unsigned long, dirtied_when) __field(unsigned long, writeback_index) __field(long, nr_to_write) __field(unsigned long, wrote) __field(ino_t, cgroup_ino) ), TP_fast_assign( strscpy_pad(__entry->name, bdi_dev_name(inode_to_bdi(inode)), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->dirtied_when = inode->dirtied_when; __entry->writeback_index = inode->i_mapping->writeback_index; __entry->nr_to_write = nr_to_write; __entry->wrote = nr_to_write - wbc->nr_to_write; __entry->cgroup_ino = __trace_wbc_assign_cgroup(wbc); ), TP_printk("bdi %s: ino=%lu state=%s dirtied_when=%lu age=%lu " "index=%lu to_write=%ld wrote=%lu cgroup_ino=%lu", __entry->name, (unsigned long)__entry->ino, show_inode_state(__entry->state), __entry->dirtied_when, (jiffies - __entry->dirtied_when) / HZ, __entry->writeback_index, __entry->nr_to_write, __entry->wrote, (unsigned long)__entry->cgroup_ino ) ); DEFINE_EVENT(writeback_single_inode_template, writeback_single_inode_start, TP_PROTO(struct inode *inode, struct writeback_control *wbc, unsigned long nr_to_write), TP_ARGS(inode, wbc, nr_to_write) ); DEFINE_EVENT(writeback_single_inode_template, writeback_single_inode, TP_PROTO(struct inode *inode, struct writeback_control *wbc, unsigned long nr_to_write), TP_ARGS(inode, wbc, nr_to_write) ); DECLARE_EVENT_CLASS(writeback_inode_template, TP_PROTO(struct inode *inode), TP_ARGS(inode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field(unsigned long, state ) __field( __u16, mode ) __field(unsigned long, dirtied_when ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->mode = inode->i_mode; __entry->dirtied_when = inode->dirtied_when; ), TP_printk("dev %d,%d ino %lu dirtied %lu state %s mode 0%o", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long)__entry->ino, __entry->dirtied_when, show_inode_state(__entry->state), __entry->mode) ); DEFINE_EVENT(writeback_inode_template, writeback_lazytime, TP_PROTO(struct inode *inode), TP_ARGS(inode) ); DEFINE_EVENT(writeback_inode_template, writeback_lazytime_iput, TP_PROTO(struct inode *inode), TP_ARGS(inode) ); DEFINE_EVENT(writeback_inode_template, writeback_dirty_inode_enqueue, TP_PROTO(struct inode *inode), TP_ARGS(inode) ); /* * Inode writeback list tracking. */ DEFINE_EVENT(writeback_inode_template, sb_mark_inode_writeback, TP_PROTO(struct inode *inode), TP_ARGS(inode) ); DEFINE_EVENT(writeback_inode_template, sb_clear_inode_writeback, TP_PROTO(struct inode *inode), TP_ARGS(inode) ); #endif /* _TRACE_WRITEBACK_H */ /* This part must be outside protection */ #include <trace/define_trace.h> |
12 1 15 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 | /* SPDX-License-Identifier: GPL-2.0 */ /* Copyright (C) B.A.T.M.A.N. contributors: * * Simon Wunderlich, Marek Lindner */ #ifndef _NET_BATMAN_ADV_HASH_H_ #define _NET_BATMAN_ADV_HASH_H_ #include "main.h" #include <linux/atomic.h> #include <linux/compiler.h> #include <linux/list.h> #include <linux/lockdep.h> #include <linux/rculist.h> #include <linux/spinlock.h> #include <linux/stddef.h> #include <linux/types.h> /* callback to a compare function. should compare 2 element data for their * keys * * Return: true if same and false if not same */ typedef bool (*batadv_hashdata_compare_cb)(const struct hlist_node *, const void *); /* the hashfunction * * Return: an index based on the key in the data of the first argument and the * size the second */ typedef u32 (*batadv_hashdata_choose_cb)(const void *, u32); typedef void (*batadv_hashdata_free_cb)(struct hlist_node *, void *); /** * struct batadv_hashtable - Wrapper of simple hlist based hashtable */ struct batadv_hashtable { /** @table: the hashtable itself with the buckets */ struct hlist_head *table; /** @list_locks: spinlock for each hash list entry */ spinlock_t *list_locks; /** @size: size of hashtable */ u32 size; /** @generation: current (generation) sequence number */ atomic_t generation; }; /* allocates and clears the hash */ struct batadv_hashtable *batadv_hash_new(u32 size); /* set class key for all locks */ void batadv_hash_set_lock_class(struct batadv_hashtable *hash, struct lock_class_key *key); /* free only the hashtable and the hash itself. */ void batadv_hash_destroy(struct batadv_hashtable *hash); /** * batadv_hash_add() - adds data to the hashtable * @hash: storage hash table * @compare: callback to determine if 2 hash elements are identical * @choose: callback calculating the hash index * @data: data passed to the aforementioned callbacks as argument * @data_node: to be added element * * Return: 0 on success, 1 if the element already is in the hash * and -1 on error. */ static inline int batadv_hash_add(struct batadv_hashtable *hash, batadv_hashdata_compare_cb compare, batadv_hashdata_choose_cb choose, const void *data, struct hlist_node *data_node) { u32 index; int ret = -1; struct hlist_head *head; struct hlist_node *node; spinlock_t *list_lock; /* spinlock to protect write access */ if (!hash) goto out; index = choose(data, hash->size); head = &hash->table[index]; list_lock = &hash->list_locks[index]; spin_lock_bh(list_lock); hlist_for_each(node, head) { if (!compare(node, data)) continue; ret = 1; goto unlock; } /* no duplicate found in list, add new element */ hlist_add_head_rcu(data_node, head); atomic_inc(&hash->generation); ret = 0; unlock: spin_unlock_bh(list_lock); out: return ret; } /** * batadv_hash_remove() - Removes data from hash, if found * @hash: hash table * @compare: callback to determine if 2 hash elements are identical * @choose: callback calculating the hash index * @data: data passed to the aforementioned callbacks as argument * * ata could be the structure you use with just the key filled, we just need * the key for comparing. * * Return: returns pointer do data on success, so you can remove the used * structure yourself, or NULL on error */ static inline void *batadv_hash_remove(struct batadv_hashtable *hash, batadv_hashdata_compare_cb compare, batadv_hashdata_choose_cb choose, void *data) { u32 index; struct hlist_node *node; struct hlist_head *head; void *data_save = NULL; index = choose(data, hash->size); head = &hash->table[index]; spin_lock_bh(&hash->list_locks[index]); hlist_for_each(node, head) { if (!compare(node, data)) continue; data_save = node; hlist_del_rcu(node); atomic_inc(&hash->generation); break; } spin_unlock_bh(&hash->list_locks[index]); return data_save; } #endif /* _NET_BATMAN_ADV_HASH_H_ */ |
1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 | // SPDX-License-Identifier: GPL-2.0-or-later /* FS-Cache cache handling * * Copyright (C) 2021 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #define FSCACHE_DEBUG_LEVEL CACHE #include <linux/export.h> #include <linux/slab.h> #include "internal.h" static LIST_HEAD(fscache_caches); DECLARE_RWSEM(fscache_addremove_sem); EXPORT_SYMBOL(fscache_addremove_sem); DECLARE_WAIT_QUEUE_HEAD(fscache_clearance_waiters); EXPORT_SYMBOL(fscache_clearance_waiters); static atomic_t fscache_cache_debug_id; /* * Allocate a cache cookie. */ static struct fscache_cache *fscache_alloc_cache(const char *name) { struct fscache_cache *cache; cache = kzalloc(sizeof(*cache), GFP_KERNEL); if (cache) { if (name) { cache->name = kstrdup(name, GFP_KERNEL); if (!cache->name) { kfree(cache); return NULL; } } refcount_set(&cache->ref, 1); INIT_LIST_HEAD(&cache->cache_link); cache->debug_id = atomic_inc_return(&fscache_cache_debug_id); } return cache; } static bool fscache_get_cache_maybe(struct fscache_cache *cache, enum fscache_cache_trace where) { bool success; int ref; success = __refcount_inc_not_zero(&cache->ref, &ref); if (success) trace_fscache_cache(cache->debug_id, ref + 1, where); return success; } /* * Look up a cache cookie. */ struct fscache_cache *fscache_lookup_cache(const char *name, bool is_cache) { struct fscache_cache *candidate, *cache, *unnamed = NULL; /* firstly check for the existence of the cache under read lock */ down_read(&fscache_addremove_sem); list_for_each_entry(cache, &fscache_caches, cache_link) { if (cache->name && name && strcmp(cache->name, name) == 0 && fscache_get_cache_maybe(cache, fscache_cache_get_acquire)) goto got_cache_r; if (!cache->name && !name && fscache_get_cache_maybe(cache, fscache_cache_get_acquire)) goto got_cache_r; } if (!name) { list_for_each_entry(cache, &fscache_caches, cache_link) { if (cache->name && fscache_get_cache_maybe(cache, fscache_cache_get_acquire)) goto got_cache_r; } } up_read(&fscache_addremove_sem); /* the cache does not exist - create a candidate */ candidate = fscache_alloc_cache(name); if (!candidate) return ERR_PTR(-ENOMEM); /* write lock, search again and add if still not present */ down_write(&fscache_addremove_sem); list_for_each_entry(cache, &fscache_caches, cache_link) { if (cache->name && name && strcmp(cache->name, name) == 0 && fscache_get_cache_maybe(cache, fscache_cache_get_acquire)) goto got_cache_w; if (!cache->name) { unnamed = cache; if (!name && fscache_get_cache_maybe(cache, fscache_cache_get_acquire)) goto got_cache_w; } } if (unnamed && is_cache && fscache_get_cache_maybe(unnamed, fscache_cache_get_acquire)) goto use_unnamed_cache; if (!name) { list_for_each_entry(cache, &fscache_caches, cache_link) { if (cache->name && fscache_get_cache_maybe(cache, fscache_cache_get_acquire)) goto got_cache_w; } } list_add_tail(&candidate->cache_link, &fscache_caches); trace_fscache_cache(candidate->debug_id, refcount_read(&candidate->ref), fscache_cache_new_acquire); up_write(&fscache_addremove_sem); return candidate; got_cache_r: up_read(&fscache_addremove_sem); return cache; use_unnamed_cache: cache = unnamed; cache->name = candidate->name; candidate->name = NULL; got_cache_w: up_write(&fscache_addremove_sem); kfree(candidate->name); kfree(candidate); return cache; } /** * fscache_acquire_cache - Acquire a cache-level cookie. * @name: The name of the cache. * * Get a cookie to represent an actual cache. If a name is given and there is * a nameless cache record available, this will acquire that and set its name, * directing all the volumes using it to this cache. * * The cache will be switched over to the preparing state if not currently in * use, otherwise -EBUSY will be returned. */ struct fscache_cache *fscache_acquire_cache(const char *name) { struct fscache_cache *cache; ASSERT(name); cache = fscache_lookup_cache(name, true); if (IS_ERR(cache)) return cache; if (!fscache_set_cache_state_maybe(cache, FSCACHE_CACHE_IS_NOT_PRESENT, FSCACHE_CACHE_IS_PREPARING)) { pr_warn("Cache tag %s in use\n", name); fscache_put_cache(cache, fscache_cache_put_cache); return ERR_PTR(-EBUSY); } return cache; } EXPORT_SYMBOL(fscache_acquire_cache); /** * fscache_put_cache - Release a cache-level cookie. * @cache: The cache cookie to be released * @where: An indication of where the release happened * * Release the caller's reference on a cache-level cookie. The @where * indication should give information about the circumstances in which the call * occurs and will be logged through a tracepoint. */ void fscache_put_cache(struct fscache_cache *cache, enum fscache_cache_trace where) { unsigned int debug_id; bool zero; int ref; if (IS_ERR_OR_NULL(cache)) return; debug_id = cache->debug_id; zero = __refcount_dec_and_test(&cache->ref, &ref); trace_fscache_cache(debug_id, ref - 1, where); if (zero) { down_write(&fscache_addremove_sem); list_del_init(&cache->cache_link); up_write(&fscache_addremove_sem); kfree(cache->name); kfree(cache); } } /** * fscache_relinquish_cache - Reset cache state and release cookie * @cache: The cache cookie to be released * * Reset the state of a cache and release the caller's reference on a cache * cookie. */ void fscache_relinquish_cache(struct fscache_cache *cache) { enum fscache_cache_trace where = (cache->state == FSCACHE_CACHE_IS_PREPARING) ? fscache_cache_put_prep_failed : fscache_cache_put_relinquish; cache->ops = NULL; cache->cache_priv = NULL; fscache_set_cache_state(cache, FSCACHE_CACHE_IS_NOT_PRESENT); fscache_put_cache(cache, where); } EXPORT_SYMBOL(fscache_relinquish_cache); /** * fscache_add_cache - Declare a cache as being open for business * @cache: The cache-level cookie representing the cache * @ops: Table of cache operations to use * @cache_priv: Private data for the cache record * * Add a cache to the system, making it available for netfs's to use. * * See Documentation/filesystems/caching/backend-api.rst for a complete * description. */ int fscache_add_cache(struct fscache_cache *cache, const struct fscache_cache_ops *ops, void *cache_priv) { int n_accesses; _enter("{%s,%s}", ops->name, cache->name); BUG_ON(fscache_cache_state(cache) != FSCACHE_CACHE_IS_PREPARING); /* Get a ref on the cache cookie and keep its n_accesses counter raised * by 1 to prevent wakeups from transitioning it to 0 until we're * withdrawing caching services from it. */ n_accesses = atomic_inc_return(&cache->n_accesses); trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref), n_accesses, fscache_access_cache_pin); down_write(&fscache_addremove_sem); cache->ops = ops; cache->cache_priv = cache_priv; fscache_set_cache_state(cache, FSCACHE_CACHE_IS_ACTIVE); up_write(&fscache_addremove_sem); pr_notice("Cache \"%s\" added (type %s)\n", cache->name, ops->name); _leave(" = 0 [%s]", cache->name); return 0; } EXPORT_SYMBOL(fscache_add_cache); /** * fscache_begin_cache_access - Pin a cache so it can be accessed * @cache: The cache-level cookie * @why: An indication of the circumstances of the access for tracing * * Attempt to pin the cache to prevent it from going away whilst we're * accessing it and returns true if successful. This works as follows: * * (1) If the cache tests as not live (state is not FSCACHE_CACHE_IS_ACTIVE), * then we return false to indicate access was not permitted. * * (2) If the cache tests as live, then we increment the n_accesses count and * then recheck the liveness, ending the access if it ceased to be live. * * (3) When we end the access, we decrement n_accesses and wake up the any * waiters if it reaches 0. * * (4) Whilst the cache is caching, n_accesses is kept artificially * incremented to prevent wakeups from happening. * * (5) When the cache is taken offline, the state is changed to prevent new * accesses, n_accesses is decremented and we wait for n_accesses to * become 0. */ bool fscache_begin_cache_access(struct fscache_cache *cache, enum fscache_access_trace why) { int n_accesses; if (!fscache_cache_is_live(cache)) return false; n_accesses = atomic_inc_return(&cache->n_accesses); smp_mb__after_atomic(); /* Reread live flag after n_accesses */ trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref), n_accesses, why); if (!fscache_cache_is_live(cache)) { fscache_end_cache_access(cache, fscache_access_unlive); return false; } return true; } /** * fscache_end_cache_access - Unpin a cache at the end of an access. * @cache: The cache-level cookie * @why: An indication of the circumstances of the access for tracing * * Unpin a cache after we've accessed it. The @why indicator is merely * provided for tracing purposes. */ void fscache_end_cache_access(struct fscache_cache *cache, enum fscache_access_trace why) { int n_accesses; smp_mb__before_atomic(); n_accesses = atomic_dec_return(&cache->n_accesses); trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref), n_accesses, why); if (n_accesses == 0) wake_up_var(&cache->n_accesses); } /** * fscache_io_error - Note a cache I/O error * @cache: The record describing the cache * * Note that an I/O error occurred in a cache and that it should no longer be * used for anything. This also reports the error into the kernel log. * * See Documentation/filesystems/caching/backend-api.rst for a complete * description. */ void fscache_io_error(struct fscache_cache *cache) { if (fscache_set_cache_state_maybe(cache, FSCACHE_CACHE_IS_ACTIVE, FSCACHE_CACHE_GOT_IOERROR)) pr_err("Cache '%s' stopped due to I/O error\n", cache->name); } EXPORT_SYMBOL(fscache_io_error); /** * fscache_withdraw_cache - Withdraw a cache from the active service * @cache: The cache cookie * * Begin the process of withdrawing a cache from service. This stops new * cache-level and volume-level accesses from taking place and waits for * currently ongoing cache-level accesses to end. */ void fscache_withdraw_cache(struct fscache_cache *cache) { int n_accesses; pr_notice("Withdrawing cache \"%s\" (%u objs)\n", cache->name, atomic_read(&cache->object_count)); fscache_set_cache_state(cache, FSCACHE_CACHE_IS_WITHDRAWN); /* Allow wakeups on dec-to-0 */ n_accesses = atomic_dec_return(&cache->n_accesses); trace_fscache_access_cache(cache->debug_id, refcount_read(&cache->ref), n_accesses, fscache_access_cache_unpin); wait_var_event(&cache->n_accesses, atomic_read(&cache->n_accesses) == 0); } EXPORT_SYMBOL(fscache_withdraw_cache); #ifdef CONFIG_PROC_FS static const char fscache_cache_states[NR__FSCACHE_CACHE_STATE] = "-PAEW"; /* * Generate a list of caches in /proc/fs/fscache/caches */ static int fscache_caches_seq_show(struct seq_file *m, void *v) { struct fscache_cache *cache; if (v == &fscache_caches) { seq_puts(m, "CACHE REF VOLS OBJS ACCES S NAME\n" "======== ===== ===== ===== ===== = ===============\n" ); return 0; } cache = list_entry(v, struct fscache_cache, cache_link); seq_printf(m, "%08x %5d %5d %5d %5d %c %s\n", cache->debug_id, refcount_read(&cache->ref), atomic_read(&cache->n_volumes), atomic_read(&cache->object_count), atomic_read(&cache->n_accesses), fscache_cache_states[cache->state], cache->name ?: "-"); return 0; } static void *fscache_caches_seq_start(struct seq_file *m, loff_t *_pos) __acquires(fscache_addremove_sem) { down_read(&fscache_addremove_sem); return seq_list_start_head(&fscache_caches, *_pos); } static void *fscache_caches_seq_next(struct seq_file *m, void *v, loff_t *_pos) { return seq_list_next(v, &fscache_caches, _pos); } static void fscache_caches_seq_stop(struct seq_file *m, void *v) __releases(fscache_addremove_sem) { up_read(&fscache_addremove_sem); } const struct seq_operations fscache_caches_seq_ops = { .start = fscache_caches_seq_start, .next = fscache_caches_seq_next, .stop = fscache_caches_seq_stop, .show = fscache_caches_seq_show, }; #endif /* CONFIG_PROC_FS */ |
403 306 406 373 356 2 313 8 291 46 2 2 407 421 4 1 15 297 289 293 1 1 42 44 296 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 | // SPDX-License-Identifier: GPL-2.0-only /* * lib/parser.c - simple parser for mount, etc. options. */ #include <linux/ctype.h> #include <linux/types.h> #include <linux/export.h> #include <linux/kstrtox.h> #include <linux/parser.h> #include <linux/slab.h> #include <linux/string.h> /* * max size needed by different bases to express U64 * HEX: "0xFFFFFFFFFFFFFFFF" --> 18 * DEC: "18446744073709551615" --> 20 * OCT: "01777777777777777777777" --> 23 * pick the max one to define NUMBER_BUF_LEN */ #define NUMBER_BUF_LEN 24 /** * match_one - Determines if a string matches a simple pattern * @s: the string to examine for presence of the pattern * @p: the string containing the pattern * @args: array of %MAX_OPT_ARGS &substring_t elements. Used to return match * locations. * * Description: Determines if the pattern @p is present in string @s. Can only * match extremely simple token=arg style patterns. If the pattern is found, * the location(s) of the arguments will be returned in the @args array. */ static int match_one(char *s, const char *p, substring_t args[]) { char *meta; int argc = 0; if (!p) return 1; while(1) { int len = -1; meta = strchr(p, '%'); if (!meta) return strcmp(p, s) == 0; if (strncmp(p, s, meta-p)) return 0; s += meta - p; p = meta + 1; if (isdigit(*p)) len = simple_strtoul(p, (char **) &p, 10); else if (*p == '%') { if (*s++ != '%') return 0; p++; continue; } if (argc >= MAX_OPT_ARGS) return 0; args[argc].from = s; switch (*p++) { case 's': { size_t str_len = strlen(s); if (str_len == 0) return 0; if (len == -1 || len > str_len) len = str_len; args[argc].to = s + len; break; } case 'd': simple_strtol(s, &args[argc].to, 0); goto num; case 'u': simple_strtoul(s, &args[argc].to, 0); goto num; case 'o': simple_strtoul(s, &args[argc].to, 8); goto num; case 'x': simple_strtoul(s, &args[argc].to, 16); num: if (args[argc].to == args[argc].from) return 0; break; default: return 0; } s = args[argc].to; argc++; } } /** * match_token - Find a token (and optional args) in a string * @s: the string to examine for token/argument pairs * @table: match_table_t describing the set of allowed option tokens and the * arguments that may be associated with them. Must be terminated with a * &struct match_token whose pattern is set to the NULL pointer. * @args: array of %MAX_OPT_ARGS &substring_t elements. Used to return match * locations. * * Description: Detects which if any of a set of token strings has been passed * to it. Tokens can include up to %MAX_OPT_ARGS instances of basic c-style * format identifiers which will be taken into account when matching the * tokens, and whose locations will be returned in the @args array. */ int match_token(char *s, const match_table_t table, substring_t args[]) { const struct match_token *p; for (p = table; !match_one(s, p->pattern, args) ; p++) ; return p->token; } EXPORT_SYMBOL(match_token); /** * match_number - scan a number in the given base from a substring_t * @s: substring to be scanned * @result: resulting integer on success * @base: base to use when converting string * * Description: Given a &substring_t and a base, attempts to parse the substring * as a number in that base. * * Return: On success, sets @result to the integer represented by the * string and returns 0. Returns -EINVAL or -ERANGE on failure. */ static int match_number(substring_t *s, int *result, int base) { char *endp; char buf[NUMBER_BUF_LEN]; int ret; long val; if (match_strlcpy(buf, s, NUMBER_BUF_LEN) >= NUMBER_BUF_LEN) return -ERANGE; ret = 0; val = simple_strtol(buf, &endp, base); if (endp == buf) ret = -EINVAL; else if (val < (long)INT_MIN || val > (long)INT_MAX) ret = -ERANGE; else *result = (int) val; return ret; } /** * match_u64int - scan a number in the given base from a substring_t * @s: substring to be scanned * @result: resulting u64 on success * @base: base to use when converting string * * Description: Given a &substring_t and a base, attempts to parse the substring * as a number in that base. * * Return: On success, sets @result to the integer represented by the * string and returns 0. Returns -EINVAL or -ERANGE on failure. */ static int match_u64int(substring_t *s, u64 *result, int base) { char buf[NUMBER_BUF_LEN]; int ret; u64 val; if (match_strlcpy(buf, s, NUMBER_BUF_LEN) >= NUMBER_BUF_LEN) return -ERANGE; ret = kstrtoull(buf, base, &val); if (!ret) *result = val; return ret; } /** * match_int - scan a decimal representation of an integer from a substring_t * @s: substring_t to be scanned * @result: resulting integer on success * * Description: Attempts to parse the &substring_t @s as a decimal integer. * * Return: On success, sets @result to the integer represented by the string * and returns 0. Returns -EINVAL or -ERANGE on failure. */ int match_int(substring_t *s, int *result) { return match_number(s, result, 0); } EXPORT_SYMBOL(match_int); /** * match_uint - scan a decimal representation of an integer from a substring_t * @s: substring_t to be scanned * @result: resulting integer on success * * Description: Attempts to parse the &substring_t @s as a decimal integer. * * Return: On success, sets @result to the integer represented by the string * and returns 0. Returns -EINVAL or -ERANGE on failure. */ int match_uint(substring_t *s, unsigned int *result) { char buf[NUMBER_BUF_LEN]; if (match_strlcpy(buf, s, NUMBER_BUF_LEN) >= NUMBER_BUF_LEN) return -ERANGE; return kstrtouint(buf, 10, result); } EXPORT_SYMBOL(match_uint); /** * match_u64 - scan a decimal representation of a u64 from * a substring_t * @s: substring_t to be scanned * @result: resulting unsigned long long on success * * Description: Attempts to parse the &substring_t @s as a long decimal * integer. * * Return: On success, sets @result to the integer represented by the string * and returns 0. Returns -EINVAL or -ERANGE on failure. */ int match_u64(substring_t *s, u64 *result) { return match_u64int(s, result, 0); } EXPORT_SYMBOL(match_u64); /** * match_octal - scan an octal representation of an integer from a substring_t * @s: substring_t to be scanned * @result: resulting integer on success * * Description: Attempts to parse the &substring_t @s as an octal integer. * * Return: On success, sets @result to the integer represented by the string * and returns 0. Returns -EINVAL or -ERANGE on failure. */ int match_octal(substring_t *s, int *result) { return match_number(s, result, 8); } EXPORT_SYMBOL(match_octal); /** * match_hex - scan a hex representation of an integer from a substring_t * @s: substring_t to be scanned * @result: resulting integer on success * * Description: Attempts to parse the &substring_t @s as a hexadecimal integer. * * Return: On success, sets @result to the integer represented by the string * and returns 0. Returns -EINVAL or -ERANGE on failure. */ int match_hex(substring_t *s, int *result) { return match_number(s, result, 16); } EXPORT_SYMBOL(match_hex); /** * match_wildcard - parse if a string matches given wildcard pattern * @pattern: wildcard pattern * @str: the string to be parsed * * Description: Parse the string @str to check if matches wildcard * pattern @pattern. The pattern may contain two types of wildcards: * '*' - matches zero or more characters * '?' - matches one character * * Return: If the @str matches the @pattern, return true, else return false. */ bool match_wildcard(const char *pattern, const char *str) { const char *s = str; const char *p = pattern; bool star = false; while (*s) { switch (*p) { case '?': s++; p++; break; case '*': star = true; str = s; if (!*++p) return true; pattern = p; break; default: if (*s == *p) { s++; p++; } else { if (!star) return false; str++; s = str; p = pattern; } break; } } if (*p == '*') ++p; return !*p; } EXPORT_SYMBOL(match_wildcard); /** * match_strlcpy - Copy the characters from a substring_t to a sized buffer * @dest: where to copy to * @src: &substring_t to copy * @size: size of destination buffer * * Description: Copy the characters in &substring_t @src to the * c-style string @dest. Copy no more than @size - 1 characters, plus * the terminating NUL. * * Return: length of @src. */ size_t match_strlcpy(char *dest, const substring_t *src, size_t size) { size_t ret = src->to - src->from; if (size) { size_t len = ret >= size ? size - 1 : ret; memcpy(dest, src->from, len); dest[len] = '\0'; } return ret; } EXPORT_SYMBOL(match_strlcpy); /** * match_strdup - allocate a new string with the contents of a substring_t * @s: &substring_t to copy * * Description: Allocates and returns a string filled with the contents of * the &substring_t @s. The caller is responsible for freeing the returned * string with kfree(). * * Return: the address of the newly allocated NUL-terminated string or * %NULL on error. */ char *match_strdup(const substring_t *s) { return kmemdup_nul(s->from, s->to - s->from, GFP_KERNEL); } EXPORT_SYMBOL(match_strdup); |
68 2 65 45 43 57 67 67 59 13 5 67 28 5 5 15 1 5 25 3 26 1 1 24 25 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 | // SPDX-License-Identifier: GPL-2.0-or-later #include <linux/skbuff.h> #include <linux/sctp.h> #include <net/gso.h> #include <net/gro.h> /** * skb_eth_gso_segment - segmentation handler for ethernet protocols. * @skb: buffer to segment * @features: features for the output path (see dev->features) * @type: Ethernet Protocol ID */ struct sk_buff *skb_eth_gso_segment(struct sk_buff *skb, netdev_features_t features, __be16 type) { struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT); struct packet_offload *ptype; rcu_read_lock(); list_for_each_entry_rcu(ptype, &net_hotdata.offload_base, list) { if (ptype->type == type && ptype->callbacks.gso_segment) { segs = ptype->callbacks.gso_segment(skb, features); break; } } rcu_read_unlock(); return segs; } EXPORT_SYMBOL(skb_eth_gso_segment); /** * skb_mac_gso_segment - mac layer segmentation handler. * @skb: buffer to segment * @features: features for the output path (see dev->features) */ struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb, netdev_features_t features) { struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT); struct packet_offload *ptype; int vlan_depth = skb->mac_len; __be16 type = skb_network_protocol(skb, &vlan_depth); if (unlikely(!type)) return ERR_PTR(-EINVAL); __skb_pull(skb, vlan_depth); rcu_read_lock(); list_for_each_entry_rcu(ptype, &net_hotdata.offload_base, list) { if (ptype->type == type && ptype->callbacks.gso_segment) { segs = ptype->callbacks.gso_segment(skb, features); break; } } rcu_read_unlock(); __skb_push(skb, skb->data - skb_mac_header(skb)); return segs; } EXPORT_SYMBOL(skb_mac_gso_segment); /* openvswitch calls this on rx path, so we need a different check. */ static bool skb_needs_check(const struct sk_buff *skb, bool tx_path) { if (tx_path) return skb->ip_summed != CHECKSUM_PARTIAL && skb->ip_summed != CHECKSUM_UNNECESSARY; return skb->ip_summed == CHECKSUM_NONE; } /** * __skb_gso_segment - Perform segmentation on skb. * @skb: buffer to segment * @features: features for the output path (see dev->features) * @tx_path: whether it is called in TX path * * This function segments the given skb and returns a list of segments. * * It may return NULL if the skb requires no segmentation. This is * only possible when GSO is used for verifying header integrity. * * Segmentation preserves SKB_GSO_CB_OFFSET bytes of previous skb cb. */ struct sk_buff *__skb_gso_segment(struct sk_buff *skb, netdev_features_t features, bool tx_path) { struct sk_buff *segs; if (unlikely(skb_needs_check(skb, tx_path))) { int err; /* We're going to init ->check field in TCP or UDP header */ err = skb_cow_head(skb, 0); if (err < 0) return ERR_PTR(err); } /* Only report GSO partial support if it will enable us to * support segmentation on this frame without needing additional * work. */ if (features & NETIF_F_GSO_PARTIAL) { netdev_features_t partial_features = NETIF_F_GSO_ROBUST; struct net_device *dev = skb->dev; partial_features |= dev->features & dev->gso_partial_features; if (!skb_gso_ok(skb, features | partial_features)) features &= ~NETIF_F_GSO_PARTIAL; } BUILD_BUG_ON(SKB_GSO_CB_OFFSET + sizeof(*SKB_GSO_CB(skb)) > sizeof(skb->cb)); SKB_GSO_CB(skb)->mac_offset = skb_headroom(skb); SKB_GSO_CB(skb)->encap_level = 0; skb_reset_mac_header(skb); skb_reset_mac_len(skb); segs = skb_mac_gso_segment(skb, features); if (segs != skb && unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs))) skb_warn_bad_offload(skb); return segs; } EXPORT_SYMBOL(__skb_gso_segment); /** * skb_gso_transport_seglen - Return length of individual segments of a gso packet * * @skb: GSO skb * * skb_gso_transport_seglen is used to determine the real size of the * individual segments, including Layer4 headers (TCP/UDP). * * The MAC/L2 or network (IP, IPv6) headers are not accounted for. */ static unsigned int skb_gso_transport_seglen(const struct sk_buff *skb) { const struct skb_shared_info *shinfo = skb_shinfo(skb); unsigned int thlen = 0; if (skb->encapsulation) { thlen = skb_inner_transport_header(skb) - skb_transport_header(skb); if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) thlen += inner_tcp_hdrlen(skb); } else if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) { thlen = tcp_hdrlen(skb); } else if (unlikely(skb_is_gso_sctp(skb))) { thlen = sizeof(struct sctphdr); } else if (shinfo->gso_type & SKB_GSO_UDP_L4) { thlen = sizeof(struct udphdr); } /* UFO sets gso_size to the size of the fragmentation * payload, i.e. the size of the L4 (UDP) header is already * accounted for. */ return thlen + shinfo->gso_size; } /** * skb_gso_network_seglen - Return length of individual segments of a gso packet * * @skb: GSO skb * * skb_gso_network_seglen is used to determine the real size of the * individual segments, including Layer3 (IP, IPv6) and L4 headers (TCP/UDP). * * The MAC/L2 header is not accounted for. */ static unsigned int skb_gso_network_seglen(const struct sk_buff *skb) { unsigned int hdr_len = skb_transport_header(skb) - skb_network_header(skb); return hdr_len + skb_gso_transport_seglen(skb); } /** * skb_gso_mac_seglen - Return length of individual segments of a gso packet * * @skb: GSO skb * * skb_gso_mac_seglen is used to determine the real size of the * individual segments, including MAC/L2, Layer3 (IP, IPv6) and L4 * headers (TCP/UDP). */ static unsigned int skb_gso_mac_seglen(const struct sk_buff *skb) { unsigned int hdr_len = skb_transport_header(skb) - skb_mac_header(skb); return hdr_len + skb_gso_transport_seglen(skb); } /** * skb_gso_size_check - check the skb size, considering GSO_BY_FRAGS * * There are a couple of instances where we have a GSO skb, and we * want to determine what size it would be after it is segmented. * * We might want to check: * - L3+L4+payload size (e.g. IP forwarding) * - L2+L3+L4+payload size (e.g. sanity check before passing to driver) * * This is a helper to do that correctly considering GSO_BY_FRAGS. * * @skb: GSO skb * * @seg_len: The segmented length (from skb_gso_*_seglen). In the * GSO_BY_FRAGS case this will be [header sizes + GSO_BY_FRAGS]. * * @max_len: The maximum permissible length. * * Returns true if the segmented length <= max length. */ static inline bool skb_gso_size_check(const struct sk_buff *skb, unsigned int seg_len, unsigned int max_len) { const struct skb_shared_info *shinfo = skb_shinfo(skb); const struct sk_buff *iter; if (shinfo->gso_size != GSO_BY_FRAGS) return seg_len <= max_len; /* Undo this so we can re-use header sizes */ seg_len -= GSO_BY_FRAGS; skb_walk_frags(skb, iter) { if (seg_len + skb_headlen(iter) > max_len) return false; } return true; } /** * skb_gso_validate_network_len - Will a split GSO skb fit into a given MTU? * * @skb: GSO skb * @mtu: MTU to validate against * * skb_gso_validate_network_len validates if a given skb will fit a * wanted MTU once split. It considers L3 headers, L4 headers, and the * payload. */ bool skb_gso_validate_network_len(const struct sk_buff *skb, unsigned int mtu) { return skb_gso_size_check(skb, skb_gso_network_seglen(skb), mtu); } EXPORT_SYMBOL_GPL(skb_gso_validate_network_len); /** * skb_gso_validate_mac_len - Will a split GSO skb fit in a given length? * * @skb: GSO skb * @len: length to validate against * * skb_gso_validate_mac_len validates if a given skb will fit a wanted * length once split, including L2, L3 and L4 headers and the payload. */ bool skb_gso_validate_mac_len(const struct sk_buff *skb, unsigned int len) { return skb_gso_size_check(skb, skb_gso_mac_seglen(skb), len); } EXPORT_SYMBOL_GPL(skb_gso_validate_mac_len); |
36 39 26 27 4 27 5 23 27 27 25 1 1 25 24 22 4 23 22 23 22 23 24 24 24 19 6 19 4 2 23 26 23 23 24 25 3 1 1 2 1 1 4 4 1 3 4 3 1 1 3 2 2 2 3 4 2 2 2 2 1 1 1 1 1 1 1 2 1 1 1 4 1 45 4 41 34 6 3 34 1 2 3 34 32 4 36 2 38 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 | // SPDX-License-Identifier: GPL-2.0 /* Copyright (C) B.A.T.M.A.N. contributors: * * Matthias Schiffer */ #include "netlink.h" #include "main.h" #include <linux/array_size.h> #include <linux/atomic.h> #include <linux/bitops.h> #include <linux/bug.h> #include <linux/byteorder/generic.h> #include <linux/cache.h> #include <linux/err.h> #include <linux/errno.h> #include <linux/genetlink.h> #include <linux/gfp.h> #include <linux/if_ether.h> #include <linux/if_vlan.h> #include <linux/init.h> #include <linux/limits.h> #include <linux/list.h> #include <linux/minmax.h> #include <linux/netdevice.h> #include <linux/netlink.h> #include <linux/printk.h> #include <linux/rtnetlink.h> #include <linux/skbuff.h> #include <linux/stddef.h> #include <linux/types.h> #include <net/genetlink.h> #include <net/net_namespace.h> #include <net/netlink.h> #include <net/sock.h> #include <uapi/linux/batadv_packet.h> #include <uapi/linux/batman_adv.h> #include "bat_algo.h" #include "bridge_loop_avoidance.h" #include "distributed-arp-table.h" #include "gateway_client.h" #include "gateway_common.h" #include "hard-interface.h" #include "log.h" #include "multicast.h" #include "network-coding.h" #include "originator.h" #include "soft-interface.h" #include "tp_meter.h" #include "translation-table.h" struct genl_family batadv_netlink_family; /* multicast groups */ enum batadv_netlink_multicast_groups { BATADV_NL_MCGRP_CONFIG, BATADV_NL_MCGRP_TPMETER, }; /** * enum batadv_genl_ops_flags - flags for genl_ops's internal_flags */ enum batadv_genl_ops_flags { /** * @BATADV_FLAG_NEED_MESH: request requires valid soft interface in * attribute BATADV_ATTR_MESH_IFINDEX and expects a pointer to it to be * saved in info->user_ptr[0] */ BATADV_FLAG_NEED_MESH = BIT(0), /** * @BATADV_FLAG_NEED_HARDIF: request requires valid hard interface in * attribute BATADV_ATTR_HARD_IFINDEX and expects a pointer to it to be * saved in info->user_ptr[1] */ BATADV_FLAG_NEED_HARDIF = BIT(1), /** * @BATADV_FLAG_NEED_VLAN: request requires valid vlan in * attribute BATADV_ATTR_VLANID and expects a pointer to it to be * saved in info->user_ptr[1] */ BATADV_FLAG_NEED_VLAN = BIT(2), }; static const struct genl_multicast_group batadv_netlink_mcgrps[] = { [BATADV_NL_MCGRP_CONFIG] = { .name = BATADV_NL_MCAST_GROUP_CONFIG }, [BATADV_NL_MCGRP_TPMETER] = { .name = BATADV_NL_MCAST_GROUP_TPMETER }, }; static const struct nla_policy batadv_netlink_policy[NUM_BATADV_ATTR] = { [BATADV_ATTR_VERSION] = { .type = NLA_STRING }, [BATADV_ATTR_ALGO_NAME] = { .type = NLA_STRING }, [BATADV_ATTR_MESH_IFINDEX] = { .type = NLA_U32 }, [BATADV_ATTR_MESH_IFNAME] = { .type = NLA_STRING }, [BATADV_ATTR_MESH_ADDRESS] = { .len = ETH_ALEN }, [BATADV_ATTR_HARD_IFINDEX] = { .type = NLA_U32 }, [BATADV_ATTR_HARD_IFNAME] = { .type = NLA_STRING }, [BATADV_ATTR_HARD_ADDRESS] = { .len = ETH_ALEN }, [BATADV_ATTR_ORIG_ADDRESS] = { .len = ETH_ALEN }, [BATADV_ATTR_TPMETER_RESULT] = { .type = NLA_U8 }, [BATADV_ATTR_TPMETER_TEST_TIME] = { .type = NLA_U32 }, [BATADV_ATTR_TPMETER_BYTES] = { .type = NLA_U64 }, [BATADV_ATTR_TPMETER_COOKIE] = { .type = NLA_U32 }, [BATADV_ATTR_ACTIVE] = { .type = NLA_FLAG }, [BATADV_ATTR_TT_ADDRESS] = { .len = ETH_ALEN }, [BATADV_ATTR_TT_TTVN] = { .type = NLA_U8 }, [BATADV_ATTR_TT_LAST_TTVN] = { .type = NLA_U8 }, [BATADV_ATTR_TT_CRC32] = { .type = NLA_U32 }, [BATADV_ATTR_TT_VID] = { .type = NLA_U16 }, [BATADV_ATTR_TT_FLAGS] = { .type = NLA_U32 }, [BATADV_ATTR_FLAG_BEST] = { .type = NLA_FLAG }, [BATADV_ATTR_LAST_SEEN_MSECS] = { .type = NLA_U32 }, [BATADV_ATTR_NEIGH_ADDRESS] = { .len = ETH_ALEN }, [BATADV_ATTR_TQ] = { .type = NLA_U8 }, [BATADV_ATTR_THROUGHPUT] = { .type = NLA_U32 }, [BATADV_ATTR_BANDWIDTH_UP] = { .type = NLA_U32 }, [BATADV_ATTR_BANDWIDTH_DOWN] = { .type = NLA_U32 }, [BATADV_ATTR_ROUTER] = { .len = ETH_ALEN }, [BATADV_ATTR_BLA_OWN] = { .type = NLA_FLAG }, [BATADV_ATTR_BLA_ADDRESS] = { .len = ETH_ALEN }, [BATADV_ATTR_BLA_VID] = { .type = NLA_U16 }, [BATADV_ATTR_BLA_BACKBONE] = { .len = ETH_ALEN }, [BATADV_ATTR_BLA_CRC] = { .type = NLA_U16 }, [BATADV_ATTR_DAT_CACHE_IP4ADDRESS] = { .type = NLA_U32 }, [BATADV_ATTR_DAT_CACHE_HWADDRESS] = { .len = ETH_ALEN }, [BATADV_ATTR_DAT_CACHE_VID] = { .type = NLA_U16 }, [BATADV_ATTR_MCAST_FLAGS] = { .type = NLA_U32 }, [BATADV_ATTR_MCAST_FLAGS_PRIV] = { .type = NLA_U32 }, [BATADV_ATTR_VLANID] = { .type = NLA_U16 }, [BATADV_ATTR_AGGREGATED_OGMS_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_AP_ISOLATION_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_ISOLATION_MARK] = { .type = NLA_U32 }, [BATADV_ATTR_ISOLATION_MASK] = { .type = NLA_U32 }, [BATADV_ATTR_BONDING_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_BRIDGE_LOOP_AVOIDANCE_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_DISTRIBUTED_ARP_TABLE_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_FRAGMENTATION_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_GW_BANDWIDTH_DOWN] = { .type = NLA_U32 }, [BATADV_ATTR_GW_BANDWIDTH_UP] = { .type = NLA_U32 }, [BATADV_ATTR_GW_MODE] = { .type = NLA_U8 }, [BATADV_ATTR_GW_SEL_CLASS] = { .type = NLA_U32 }, [BATADV_ATTR_HOP_PENALTY] = { .type = NLA_U8 }, [BATADV_ATTR_LOG_LEVEL] = { .type = NLA_U32 }, [BATADV_ATTR_MULTICAST_FORCEFLOOD_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_MULTICAST_FANOUT] = { .type = NLA_U32 }, [BATADV_ATTR_NETWORK_CODING_ENABLED] = { .type = NLA_U8 }, [BATADV_ATTR_ORIG_INTERVAL] = { .type = NLA_U32 }, [BATADV_ATTR_ELP_INTERVAL] = { .type = NLA_U32 }, [BATADV_ATTR_THROUGHPUT_OVERRIDE] = { .type = NLA_U32 }, }; /** * batadv_netlink_get_ifindex() - Extract an interface index from a message * @nlh: Message header * @attrtype: Attribute which holds an interface index * * Return: interface index, or 0. */ int batadv_netlink_get_ifindex(const struct nlmsghdr *nlh, int attrtype) { struct nlattr *attr = nlmsg_find_attr(nlh, GENL_HDRLEN, attrtype); return (attr && nla_len(attr) == sizeof(u32)) ? nla_get_u32(attr) : 0; } /** * batadv_netlink_mesh_fill_ap_isolation() - Add ap_isolation softif attribute * @msg: Netlink message to dump into * @bat_priv: the bat priv with all the soft interface information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_mesh_fill_ap_isolation(struct sk_buff *msg, struct batadv_priv *bat_priv) { struct batadv_softif_vlan *vlan; u8 ap_isolation; vlan = batadv_softif_vlan_get(bat_priv, BATADV_NO_FLAGS); if (!vlan) return 0; ap_isolation = atomic_read(&vlan->ap_isolation); batadv_softif_vlan_put(vlan); return nla_put_u8(msg, BATADV_ATTR_AP_ISOLATION_ENABLED, !!ap_isolation); } /** * batadv_netlink_set_mesh_ap_isolation() - Set ap_isolation from genl msg * @attr: parsed BATADV_ATTR_AP_ISOLATION_ENABLED attribute * @bat_priv: the bat priv with all the soft interface information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_set_mesh_ap_isolation(struct nlattr *attr, struct batadv_priv *bat_priv) { struct batadv_softif_vlan *vlan; vlan = batadv_softif_vlan_get(bat_priv, BATADV_NO_FLAGS); if (!vlan) return -ENOENT; atomic_set(&vlan->ap_isolation, !!nla_get_u8(attr)); batadv_softif_vlan_put(vlan); return 0; } /** * batadv_netlink_mesh_fill() - Fill message with mesh attributes * @msg: Netlink message to dump into * @bat_priv: the bat priv with all the soft interface information * @cmd: type of message to generate * @portid: Port making netlink request * @seq: sequence number for message * @flags: Additional flags for message * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_mesh_fill(struct sk_buff *msg, struct batadv_priv *bat_priv, enum batadv_nl_commands cmd, u32 portid, u32 seq, int flags) { struct net_device *soft_iface = bat_priv->soft_iface; struct batadv_hard_iface *primary_if = NULL; struct net_device *hard_iface; void *hdr; hdr = genlmsg_put(msg, portid, seq, &batadv_netlink_family, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_string(msg, BATADV_ATTR_VERSION, BATADV_SOURCE_VERSION) || nla_put_string(msg, BATADV_ATTR_ALGO_NAME, bat_priv->algo_ops->name) || nla_put_u32(msg, BATADV_ATTR_MESH_IFINDEX, soft_iface->ifindex) || nla_put_string(msg, BATADV_ATTR_MESH_IFNAME, soft_iface->name) || nla_put(msg, BATADV_ATTR_MESH_ADDRESS, ETH_ALEN, soft_iface->dev_addr) || nla_put_u8(msg, BATADV_ATTR_TT_TTVN, (u8)atomic_read(&bat_priv->tt.vn))) goto nla_put_failure; #ifdef CONFIG_BATMAN_ADV_BLA if (nla_put_u16(msg, BATADV_ATTR_BLA_CRC, ntohs(bat_priv->bla.claim_dest.group))) goto nla_put_failure; #endif if (batadv_mcast_mesh_info_put(msg, bat_priv)) goto nla_put_failure; primary_if = batadv_primary_if_get_selected(bat_priv); if (primary_if && primary_if->if_status == BATADV_IF_ACTIVE) { hard_iface = primary_if->net_dev; if (nla_put_u32(msg, BATADV_ATTR_HARD_IFINDEX, hard_iface->ifindex) || nla_put_string(msg, BATADV_ATTR_HARD_IFNAME, hard_iface->name) || nla_put(msg, BATADV_ATTR_HARD_ADDRESS, ETH_ALEN, hard_iface->dev_addr)) goto nla_put_failure; } if (nla_put_u8(msg, BATADV_ATTR_AGGREGATED_OGMS_ENABLED, !!atomic_read(&bat_priv->aggregated_ogms))) goto nla_put_failure; if (batadv_netlink_mesh_fill_ap_isolation(msg, bat_priv)) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_ISOLATION_MARK, bat_priv->isolation_mark)) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_ISOLATION_MASK, bat_priv->isolation_mark_mask)) goto nla_put_failure; if (nla_put_u8(msg, BATADV_ATTR_BONDING_ENABLED, !!atomic_read(&bat_priv->bonding))) goto nla_put_failure; #ifdef CONFIG_BATMAN_ADV_BLA if (nla_put_u8(msg, BATADV_ATTR_BRIDGE_LOOP_AVOIDANCE_ENABLED, !!atomic_read(&bat_priv->bridge_loop_avoidance))) goto nla_put_failure; #endif /* CONFIG_BATMAN_ADV_BLA */ #ifdef CONFIG_BATMAN_ADV_DAT if (nla_put_u8(msg, BATADV_ATTR_DISTRIBUTED_ARP_TABLE_ENABLED, !!atomic_read(&bat_priv->distributed_arp_table))) goto nla_put_failure; #endif /* CONFIG_BATMAN_ADV_DAT */ if (nla_put_u8(msg, BATADV_ATTR_FRAGMENTATION_ENABLED, !!atomic_read(&bat_priv->fragmentation))) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_GW_BANDWIDTH_DOWN, atomic_read(&bat_priv->gw.bandwidth_down))) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_GW_BANDWIDTH_UP, atomic_read(&bat_priv->gw.bandwidth_up))) goto nla_put_failure; if (nla_put_u8(msg, BATADV_ATTR_GW_MODE, atomic_read(&bat_priv->gw.mode))) goto nla_put_failure; if (bat_priv->algo_ops->gw.get_best_gw_node && bat_priv->algo_ops->gw.is_eligible) { /* GW selection class is not available if the routing algorithm * in use does not implement the GW API */ if (nla_put_u32(msg, BATADV_ATTR_GW_SEL_CLASS, atomic_read(&bat_priv->gw.sel_class))) goto nla_put_failure; } if (nla_put_u8(msg, BATADV_ATTR_HOP_PENALTY, atomic_read(&bat_priv->hop_penalty))) goto nla_put_failure; #ifdef CONFIG_BATMAN_ADV_DEBUG if (nla_put_u32(msg, BATADV_ATTR_LOG_LEVEL, atomic_read(&bat_priv->log_level))) goto nla_put_failure; #endif /* CONFIG_BATMAN_ADV_DEBUG */ #ifdef CONFIG_BATMAN_ADV_MCAST if (nla_put_u8(msg, BATADV_ATTR_MULTICAST_FORCEFLOOD_ENABLED, !atomic_read(&bat_priv->multicast_mode))) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_MULTICAST_FANOUT, atomic_read(&bat_priv->multicast_fanout))) goto nla_put_failure; #endif /* CONFIG_BATMAN_ADV_MCAST */ #ifdef CONFIG_BATMAN_ADV_NC if (nla_put_u8(msg, BATADV_ATTR_NETWORK_CODING_ENABLED, !!atomic_read(&bat_priv->network_coding))) goto nla_put_failure; #endif /* CONFIG_BATMAN_ADV_NC */ if (nla_put_u32(msg, BATADV_ATTR_ORIG_INTERVAL, atomic_read(&bat_priv->orig_interval))) goto nla_put_failure; batadv_hardif_put(primary_if); genlmsg_end(msg, hdr); return 0; nla_put_failure: batadv_hardif_put(primary_if); genlmsg_cancel(msg, hdr); return -EMSGSIZE; } /** * batadv_netlink_notify_mesh() - send softif attributes to listener * @bat_priv: the bat priv with all the soft interface information * * Return: 0 on success, < 0 on error */ static int batadv_netlink_notify_mesh(struct batadv_priv *bat_priv) { struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; ret = batadv_netlink_mesh_fill(msg, bat_priv, BATADV_CMD_SET_MESH, 0, 0, 0); if (ret < 0) { nlmsg_free(msg); return ret; } genlmsg_multicast_netns(&batadv_netlink_family, dev_net(bat_priv->soft_iface), msg, 0, BATADV_NL_MCGRP_CONFIG, GFP_KERNEL); return 0; } /** * batadv_netlink_get_mesh() - Get softif attributes * @skb: Netlink message with request data * @info: receiver information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_get_mesh(struct sk_buff *skb, struct genl_info *info) { struct batadv_priv *bat_priv = info->user_ptr[0]; struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; ret = batadv_netlink_mesh_fill(msg, bat_priv, BATADV_CMD_GET_MESH, info->snd_portid, info->snd_seq, 0); if (ret < 0) { nlmsg_free(msg); return ret; } ret = genlmsg_reply(msg, info); return ret; } /** * batadv_netlink_set_mesh() - Set softif attributes * @skb: Netlink message with request data * @info: receiver information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_set_mesh(struct sk_buff *skb, struct genl_info *info) { struct batadv_priv *bat_priv = info->user_ptr[0]; struct nlattr *attr; if (info->attrs[BATADV_ATTR_AGGREGATED_OGMS_ENABLED]) { attr = info->attrs[BATADV_ATTR_AGGREGATED_OGMS_ENABLED]; atomic_set(&bat_priv->aggregated_ogms, !!nla_get_u8(attr)); } if (info->attrs[BATADV_ATTR_AP_ISOLATION_ENABLED]) { attr = info->attrs[BATADV_ATTR_AP_ISOLATION_ENABLED]; batadv_netlink_set_mesh_ap_isolation(attr, bat_priv); } if (info->attrs[BATADV_ATTR_ISOLATION_MARK]) { attr = info->attrs[BATADV_ATTR_ISOLATION_MARK]; bat_priv->isolation_mark = nla_get_u32(attr); } if (info->attrs[BATADV_ATTR_ISOLATION_MASK]) { attr = info->attrs[BATADV_ATTR_ISOLATION_MASK]; bat_priv->isolation_mark_mask = nla_get_u32(attr); } if (info->attrs[BATADV_ATTR_BONDING_ENABLED]) { attr = info->attrs[BATADV_ATTR_BONDING_ENABLED]; atomic_set(&bat_priv->bonding, !!nla_get_u8(attr)); } #ifdef CONFIG_BATMAN_ADV_BLA if (info->attrs[BATADV_ATTR_BRIDGE_LOOP_AVOIDANCE_ENABLED]) { attr = info->attrs[BATADV_ATTR_BRIDGE_LOOP_AVOIDANCE_ENABLED]; atomic_set(&bat_priv->bridge_loop_avoidance, !!nla_get_u8(attr)); batadv_bla_status_update(bat_priv->soft_iface); } #endif /* CONFIG_BATMAN_ADV_BLA */ #ifdef CONFIG_BATMAN_ADV_DAT if (info->attrs[BATADV_ATTR_DISTRIBUTED_ARP_TABLE_ENABLED]) { attr = info->attrs[BATADV_ATTR_DISTRIBUTED_ARP_TABLE_ENABLED]; atomic_set(&bat_priv->distributed_arp_table, !!nla_get_u8(attr)); batadv_dat_status_update(bat_priv->soft_iface); } #endif /* CONFIG_BATMAN_ADV_DAT */ if (info->attrs[BATADV_ATTR_FRAGMENTATION_ENABLED]) { attr = info->attrs[BATADV_ATTR_FRAGMENTATION_ENABLED]; atomic_set(&bat_priv->fragmentation, !!nla_get_u8(attr)); rtnl_lock(); batadv_update_min_mtu(bat_priv->soft_iface); rtnl_unlock(); } if (info->attrs[BATADV_ATTR_GW_BANDWIDTH_DOWN]) { attr = info->attrs[BATADV_ATTR_GW_BANDWIDTH_DOWN]; atomic_set(&bat_priv->gw.bandwidth_down, nla_get_u32(attr)); batadv_gw_tvlv_container_update(bat_priv); } if (info->attrs[BATADV_ATTR_GW_BANDWIDTH_UP]) { attr = info->attrs[BATADV_ATTR_GW_BANDWIDTH_UP]; atomic_set(&bat_priv->gw.bandwidth_up, nla_get_u32(attr)); batadv_gw_tvlv_container_update(bat_priv); } if (info->attrs[BATADV_ATTR_GW_MODE]) { u8 gw_mode; attr = info->attrs[BATADV_ATTR_GW_MODE]; gw_mode = nla_get_u8(attr); if (gw_mode <= BATADV_GW_MODE_SERVER) { /* Invoking batadv_gw_reselect() is not enough to really * de-select the current GW. It will only instruct the * gateway client code to perform a re-election the next * time that this is needed. * * When gw client mode is being switched off the current * GW must be de-selected explicitly otherwise no GW_ADD * uevent is thrown on client mode re-activation. This * is operation is performed in * batadv_gw_check_client_stop(). */ batadv_gw_reselect(bat_priv); /* always call batadv_gw_check_client_stop() before * changing the gateway state */ batadv_gw_check_client_stop(bat_priv); atomic_set(&bat_priv->gw.mode, gw_mode); batadv_gw_tvlv_container_update(bat_priv); } } if (info->attrs[BATADV_ATTR_GW_SEL_CLASS] && bat_priv->algo_ops->gw.get_best_gw_node && bat_priv->algo_ops->gw.is_eligible) { /* setting the GW selection class is allowed only if the routing * algorithm in use implements the GW API */ u32 sel_class_max = bat_priv->algo_ops->gw.sel_class_max; u32 sel_class; attr = info->attrs[BATADV_ATTR_GW_SEL_CLASS]; sel_class = nla_get_u32(attr); if (sel_class >= 1 && sel_class <= sel_class_max) { atomic_set(&bat_priv->gw.sel_class, sel_class); batadv_gw_reselect(bat_priv); } } if (info->attrs[BATADV_ATTR_HOP_PENALTY]) { attr = info->attrs[BATADV_ATTR_HOP_PENALTY]; atomic_set(&bat_priv->hop_penalty, nla_get_u8(attr)); } #ifdef CONFIG_BATMAN_ADV_DEBUG if (info->attrs[BATADV_ATTR_LOG_LEVEL]) { attr = info->attrs[BATADV_ATTR_LOG_LEVEL]; atomic_set(&bat_priv->log_level, nla_get_u32(attr) & BATADV_DBG_ALL); } #endif /* CONFIG_BATMAN_ADV_DEBUG */ #ifdef CONFIG_BATMAN_ADV_MCAST if (info->attrs[BATADV_ATTR_MULTICAST_FORCEFLOOD_ENABLED]) { attr = info->attrs[BATADV_ATTR_MULTICAST_FORCEFLOOD_ENABLED]; atomic_set(&bat_priv->multicast_mode, !nla_get_u8(attr)); } if (info->attrs[BATADV_ATTR_MULTICAST_FANOUT]) { attr = info->attrs[BATADV_ATTR_MULTICAST_FANOUT]; atomic_set(&bat_priv->multicast_fanout, nla_get_u32(attr)); } #endif /* CONFIG_BATMAN_ADV_MCAST */ #ifdef CONFIG_BATMAN_ADV_NC if (info->attrs[BATADV_ATTR_NETWORK_CODING_ENABLED]) { attr = info->attrs[BATADV_ATTR_NETWORK_CODING_ENABLED]; atomic_set(&bat_priv->network_coding, !!nla_get_u8(attr)); batadv_nc_status_update(bat_priv->soft_iface); } #endif /* CONFIG_BATMAN_ADV_NC */ if (info->attrs[BATADV_ATTR_ORIG_INTERVAL]) { u32 orig_interval; attr = info->attrs[BATADV_ATTR_ORIG_INTERVAL]; orig_interval = nla_get_u32(attr); orig_interval = min_t(u32, orig_interval, INT_MAX); orig_interval = max_t(u32, orig_interval, 2 * BATADV_JITTER); atomic_set(&bat_priv->orig_interval, orig_interval); } batadv_netlink_notify_mesh(bat_priv); return 0; } /** * batadv_netlink_tp_meter_put() - Fill information of started tp_meter session * @msg: netlink message to be sent back * @cookie: tp meter session cookie * * Return: 0 on success, < 0 on error */ static int batadv_netlink_tp_meter_put(struct sk_buff *msg, u32 cookie) { if (nla_put_u32(msg, BATADV_ATTR_TPMETER_COOKIE, cookie)) return -ENOBUFS; return 0; } /** * batadv_netlink_tpmeter_notify() - send tp_meter result via netlink to client * @bat_priv: the bat priv with all the soft interface information * @dst: destination of tp_meter session * @result: reason for tp meter session stop * @test_time: total time of the tp_meter session * @total_bytes: bytes acked to the receiver * @cookie: cookie of tp_meter session * * Return: 0 on success, < 0 on error */ int batadv_netlink_tpmeter_notify(struct batadv_priv *bat_priv, const u8 *dst, u8 result, u32 test_time, u64 total_bytes, u32 cookie) { struct sk_buff *msg; void *hdr; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; hdr = genlmsg_put(msg, 0, 0, &batadv_netlink_family, 0, BATADV_CMD_TP_METER); if (!hdr) { ret = -ENOBUFS; goto err_genlmsg; } if (nla_put_u32(msg, BATADV_ATTR_TPMETER_COOKIE, cookie)) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_TPMETER_TEST_TIME, test_time)) goto nla_put_failure; if (nla_put_u64_64bit(msg, BATADV_ATTR_TPMETER_BYTES, total_bytes, BATADV_ATTR_PAD)) goto nla_put_failure; if (nla_put_u8(msg, BATADV_ATTR_TPMETER_RESULT, result)) goto nla_put_failure; if (nla_put(msg, BATADV_ATTR_ORIG_ADDRESS, ETH_ALEN, dst)) goto nla_put_failure; genlmsg_end(msg, hdr); genlmsg_multicast_netns(&batadv_netlink_family, dev_net(bat_priv->soft_iface), msg, 0, BATADV_NL_MCGRP_TPMETER, GFP_KERNEL); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); ret = -EMSGSIZE; err_genlmsg: nlmsg_free(msg); return ret; } /** * batadv_netlink_tp_meter_start() - Start a new tp_meter session * @skb: received netlink message * @info: receiver information * * Return: 0 on success, < 0 on error */ static int batadv_netlink_tp_meter_start(struct sk_buff *skb, struct genl_info *info) { struct batadv_priv *bat_priv = info->user_ptr[0]; struct sk_buff *msg = NULL; u32 test_length; void *msg_head; u32 cookie; u8 *dst; int ret; if (!info->attrs[BATADV_ATTR_ORIG_ADDRESS]) return -EINVAL; if (!info->attrs[BATADV_ATTR_TPMETER_TEST_TIME]) return -EINVAL; dst = nla_data(info->attrs[BATADV_ATTR_ORIG_ADDRESS]); test_length = nla_get_u32(info->attrs[BATADV_ATTR_TPMETER_TEST_TIME]); msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) { ret = -ENOMEM; goto out; } msg_head = genlmsg_put(msg, info->snd_portid, info->snd_seq, &batadv_netlink_family, 0, BATADV_CMD_TP_METER); if (!msg_head) { ret = -ENOBUFS; goto out; } batadv_tp_start(bat_priv, dst, test_length, &cookie); ret = batadv_netlink_tp_meter_put(msg, cookie); out: if (ret) { if (msg) nlmsg_free(msg); return ret; } genlmsg_end(msg, msg_head); return genlmsg_reply(msg, info); } /** * batadv_netlink_tp_meter_cancel() - Cancel a running tp_meter session * @skb: received netlink message * @info: receiver information * * Return: 0 on success, < 0 on error */ static int batadv_netlink_tp_meter_cancel(struct sk_buff *skb, struct genl_info *info) { struct batadv_priv *bat_priv = info->user_ptr[0]; u8 *dst; int ret = 0; if (!info->attrs[BATADV_ATTR_ORIG_ADDRESS]) return -EINVAL; dst = nla_data(info->attrs[BATADV_ATTR_ORIG_ADDRESS]); batadv_tp_stop(bat_priv, dst, BATADV_TP_REASON_CANCEL); return ret; } /** * batadv_netlink_hardif_fill() - Fill message with hardif attributes * @msg: Netlink message to dump into * @bat_priv: the bat priv with all the soft interface information * @hard_iface: hard interface which was modified * @cmd: type of message to generate * @portid: Port making netlink request * @seq: sequence number for message * @flags: Additional flags for message * @cb: Control block containing additional options * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_hardif_fill(struct sk_buff *msg, struct batadv_priv *bat_priv, struct batadv_hard_iface *hard_iface, enum batadv_nl_commands cmd, u32 portid, u32 seq, int flags, struct netlink_callback *cb) { struct net_device *net_dev = hard_iface->net_dev; void *hdr; hdr = genlmsg_put(msg, portid, seq, &batadv_netlink_family, flags, cmd); if (!hdr) return -ENOBUFS; if (cb) genl_dump_check_consistent(cb, hdr); if (nla_put_u32(msg, BATADV_ATTR_MESH_IFINDEX, bat_priv->soft_iface->ifindex)) goto nla_put_failure; if (nla_put_string(msg, BATADV_ATTR_MESH_IFNAME, bat_priv->soft_iface->name)) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_HARD_IFINDEX, net_dev->ifindex) || nla_put_string(msg, BATADV_ATTR_HARD_IFNAME, net_dev->name) || nla_put(msg, BATADV_ATTR_HARD_ADDRESS, ETH_ALEN, net_dev->dev_addr)) goto nla_put_failure; if (hard_iface->if_status == BATADV_IF_ACTIVE) { if (nla_put_flag(msg, BATADV_ATTR_ACTIVE)) goto nla_put_failure; } if (nla_put_u8(msg, BATADV_ATTR_HOP_PENALTY, atomic_read(&hard_iface->hop_penalty))) goto nla_put_failure; #ifdef CONFIG_BATMAN_ADV_BATMAN_V if (nla_put_u32(msg, BATADV_ATTR_ELP_INTERVAL, atomic_read(&hard_iface->bat_v.elp_interval))) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_THROUGHPUT_OVERRIDE, atomic_read(&hard_iface->bat_v.throughput_override))) goto nla_put_failure; #endif /* CONFIG_BATMAN_ADV_BATMAN_V */ genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } /** * batadv_netlink_notify_hardif() - send hardif attributes to listener * @bat_priv: the bat priv with all the soft interface information * @hard_iface: hard interface which was modified * * Return: 0 on success, < 0 on error */ static int batadv_netlink_notify_hardif(struct batadv_priv *bat_priv, struct batadv_hard_iface *hard_iface) { struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; ret = batadv_netlink_hardif_fill(msg, bat_priv, hard_iface, BATADV_CMD_SET_HARDIF, 0, 0, 0, NULL); if (ret < 0) { nlmsg_free(msg); return ret; } genlmsg_multicast_netns(&batadv_netlink_family, dev_net(bat_priv->soft_iface), msg, 0, BATADV_NL_MCGRP_CONFIG, GFP_KERNEL); return 0; } /** * batadv_netlink_get_hardif() - Get hardif attributes * @skb: Netlink message with request data * @info: receiver information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_get_hardif(struct sk_buff *skb, struct genl_info *info) { struct batadv_hard_iface *hard_iface = info->user_ptr[1]; struct batadv_priv *bat_priv = info->user_ptr[0]; struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; ret = batadv_netlink_hardif_fill(msg, bat_priv, hard_iface, BATADV_CMD_GET_HARDIF, info->snd_portid, info->snd_seq, 0, NULL); if (ret < 0) { nlmsg_free(msg); return ret; } ret = genlmsg_reply(msg, info); return ret; } /** * batadv_netlink_set_hardif() - Set hardif attributes * @skb: Netlink message with request data * @info: receiver information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_set_hardif(struct sk_buff *skb, struct genl_info *info) { struct batadv_hard_iface *hard_iface = info->user_ptr[1]; struct batadv_priv *bat_priv = info->user_ptr[0]; struct nlattr *attr; if (info->attrs[BATADV_ATTR_HOP_PENALTY]) { attr = info->attrs[BATADV_ATTR_HOP_PENALTY]; atomic_set(&hard_iface->hop_penalty, nla_get_u8(attr)); } #ifdef CONFIG_BATMAN_ADV_BATMAN_V if (info->attrs[BATADV_ATTR_ELP_INTERVAL]) { attr = info->attrs[BATADV_ATTR_ELP_INTERVAL]; atomic_set(&hard_iface->bat_v.elp_interval, nla_get_u32(attr)); } if (info->attrs[BATADV_ATTR_THROUGHPUT_OVERRIDE]) { attr = info->attrs[BATADV_ATTR_THROUGHPUT_OVERRIDE]; atomic_set(&hard_iface->bat_v.throughput_override, nla_get_u32(attr)); } #endif /* CONFIG_BATMAN_ADV_BATMAN_V */ batadv_netlink_notify_hardif(bat_priv, hard_iface); return 0; } /** * batadv_netlink_dump_hardif() - Dump all hard interface into a messages * @msg: Netlink message to dump into * @cb: Parameters from query * * Return: error code, or length of reply message on success */ static int batadv_netlink_dump_hardif(struct sk_buff *msg, struct netlink_callback *cb) { struct net *net = sock_net(cb->skb->sk); struct net_device *soft_iface; struct batadv_hard_iface *hard_iface; struct batadv_priv *bat_priv; int ifindex; int portid = NETLINK_CB(cb->skb).portid; int skip = cb->args[0]; int i = 0; ifindex = batadv_netlink_get_ifindex(cb->nlh, BATADV_ATTR_MESH_IFINDEX); if (!ifindex) return -EINVAL; soft_iface = dev_get_by_index(net, ifindex); if (!soft_iface) return -ENODEV; if (!batadv_softif_is_valid(soft_iface)) { dev_put(soft_iface); return -ENODEV; } bat_priv = netdev_priv(soft_iface); rtnl_lock(); cb->seq = batadv_hardif_generation << 1 | 1; list_for_each_entry(hard_iface, &batadv_hardif_list, list) { if (hard_iface->soft_iface != soft_iface) continue; if (i++ < skip) continue; if (batadv_netlink_hardif_fill(msg, bat_priv, hard_iface, BATADV_CMD_GET_HARDIF, portid, cb->nlh->nlmsg_seq, NLM_F_MULTI, cb)) { i--; break; } } rtnl_unlock(); dev_put(soft_iface); cb->args[0] = i; return msg->len; } /** * batadv_netlink_vlan_fill() - Fill message with vlan attributes * @msg: Netlink message to dump into * @bat_priv: the bat priv with all the soft interface information * @vlan: vlan which was modified * @cmd: type of message to generate * @portid: Port making netlink request * @seq: sequence number for message * @flags: Additional flags for message * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_vlan_fill(struct sk_buff *msg, struct batadv_priv *bat_priv, struct batadv_softif_vlan *vlan, enum batadv_nl_commands cmd, u32 portid, u32 seq, int flags) { void *hdr; hdr = genlmsg_put(msg, portid, seq, &batadv_netlink_family, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, BATADV_ATTR_MESH_IFINDEX, bat_priv->soft_iface->ifindex)) goto nla_put_failure; if (nla_put_string(msg, BATADV_ATTR_MESH_IFNAME, bat_priv->soft_iface->name)) goto nla_put_failure; if (nla_put_u32(msg, BATADV_ATTR_VLANID, vlan->vid & VLAN_VID_MASK)) goto nla_put_failure; if (nla_put_u8(msg, BATADV_ATTR_AP_ISOLATION_ENABLED, !!atomic_read(&vlan->ap_isolation))) goto nla_put_failure; genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } /** * batadv_netlink_notify_vlan() - send vlan attributes to listener * @bat_priv: the bat priv with all the soft interface information * @vlan: vlan which was modified * * Return: 0 on success, < 0 on error */ static int batadv_netlink_notify_vlan(struct batadv_priv *bat_priv, struct batadv_softif_vlan *vlan) { struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; ret = batadv_netlink_vlan_fill(msg, bat_priv, vlan, BATADV_CMD_SET_VLAN, 0, 0, 0); if (ret < 0) { nlmsg_free(msg); return ret; } genlmsg_multicast_netns(&batadv_netlink_family, dev_net(bat_priv->soft_iface), msg, 0, BATADV_NL_MCGRP_CONFIG, GFP_KERNEL); return 0; } /** * batadv_netlink_get_vlan() - Get vlan attributes * @skb: Netlink message with request data * @info: receiver information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_get_vlan(struct sk_buff *skb, struct genl_info *info) { struct batadv_softif_vlan *vlan = info->user_ptr[1]; struct batadv_priv *bat_priv = info->user_ptr[0]; struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; ret = batadv_netlink_vlan_fill(msg, bat_priv, vlan, BATADV_CMD_GET_VLAN, info->snd_portid, info->snd_seq, 0); if (ret < 0) { nlmsg_free(msg); return ret; } ret = genlmsg_reply(msg, info); return ret; } /** * batadv_netlink_set_vlan() - Get vlan attributes * @skb: Netlink message with request data * @info: receiver information * * Return: 0 on success or negative error number in case of failure */ static int batadv_netlink_set_vlan(struct sk_buff *skb, struct genl_info *info) { struct batadv_softif_vlan *vlan = info->user_ptr[1]; struct batadv_priv *bat_priv = info->user_ptr[0]; struct nlattr *attr; if (info->attrs[BATADV_ATTR_AP_ISOLATION_ENABLED]) { attr = info->attrs[BATADV_ATTR_AP_ISOLATION_ENABLED]; atomic_set(&vlan->ap_isolation, !!nla_get_u8(attr)); } batadv_netlink_notify_vlan(bat_priv, vlan); return 0; } /** * batadv_get_softif_from_info() - Retrieve soft interface from genl attributes * @net: the applicable net namespace * @info: receiver information * * Return: Pointer to soft interface (with increased refcnt) on success, error * pointer on error */ static struct net_device * batadv_get_softif_from_info(struct net *net, struct genl_info *info) { struct net_device *soft_iface; int ifindex; if (!info->attrs[BATADV_ATTR_MESH_IFINDEX]) return ERR_PTR(-EINVAL); ifindex = nla_get_u32(info->attrs[BATADV_ATTR_MESH_IFINDEX]); soft_iface = dev_get_by_index(net, ifindex); if (!soft_iface) return ERR_PTR(-ENODEV); if (!batadv_softif_is_valid(soft_iface)) goto err_put_softif; return soft_iface; err_put_softif: dev_put(soft_iface); return ERR_PTR(-EINVAL); } /** * batadv_get_hardif_from_info() - Retrieve hardif from genl attributes * @bat_priv: the bat priv with all the soft interface information * @net: the applicable net namespace * @info: receiver information * * Return: Pointer to hard interface (with increased refcnt) on success, error * pointer on error */ static struct batadv_hard_iface * batadv_get_hardif_from_info(struct batadv_priv *bat_priv, struct net *net, struct genl_info *info) { struct batadv_hard_iface *hard_iface; struct net_device *hard_dev; unsigned int hardif_index; if (!info->attrs[BATADV_ATTR_HARD_IFINDEX]) return ERR_PTR(-EINVAL); hardif_index = nla_get_u32(info->attrs[BATADV_ATTR_HARD_IFINDEX]); hard_dev = dev_get_by_index(net, hardif_index); if (!hard_dev) return ERR_PTR(-ENODEV); hard_iface = batadv_hardif_get_by_netdev(hard_dev); if (!hard_iface) goto err_put_harddev; if (hard_iface->soft_iface != bat_priv->soft_iface) goto err_put_hardif; /* hard_dev is referenced by hard_iface and not needed here */ dev_put(hard_dev); return hard_iface; err_put_hardif: batadv_hardif_put(hard_iface); err_put_harddev: dev_put(hard_dev); return ERR_PTR(-EINVAL); } /** * batadv_get_vlan_from_info() - Retrieve vlan from genl attributes * @bat_priv: the bat priv with all the soft interface information * @net: the applicable net namespace * @info: receiver information * * Return: Pointer to vlan on success (with increased refcnt), error pointer * on error */ static struct batadv_softif_vlan * batadv_get_vlan_from_info(struct batadv_priv *bat_priv, struct net *net, struct genl_info *info) { struct batadv_softif_vlan *vlan; u16 vid; if (!info->attrs[BATADV_ATTR_VLANID]) return ERR_PTR(-EINVAL); vid = nla_get_u16(info->attrs[BATADV_ATTR_VLANID]); vlan = batadv_softif_vlan_get(bat_priv, vid | BATADV_VLAN_HAS_TAG); if (!vlan) return ERR_PTR(-ENOENT); return vlan; } /** * batadv_pre_doit() - Prepare batman-adv genl doit request * @ops: requested netlink operation * @skb: Netlink message with request data * @info: receiver information * * Return: 0 on success or negative error number in case of failure */ static int batadv_pre_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info) { struct net *net = genl_info_net(info); struct batadv_hard_iface *hard_iface; struct batadv_priv *bat_priv = NULL; struct batadv_softif_vlan *vlan; struct net_device *soft_iface; u8 user_ptr1_flags; u8 mesh_dep_flags; int ret; user_ptr1_flags = BATADV_FLAG_NEED_HARDIF | BATADV_FLAG_NEED_VLAN; if (WARN_ON(hweight8(ops->internal_flags & user_ptr1_flags) > 1)) return -EINVAL; mesh_dep_flags = BATADV_FLAG_NEED_HARDIF | BATADV_FLAG_NEED_VLAN; if (WARN_ON((ops->internal_flags & mesh_dep_flags) && (~ops->internal_flags & BATADV_FLAG_NEED_MESH))) return -EINVAL; if (ops->internal_flags & BATADV_FLAG_NEED_MESH) { soft_iface = batadv_get_softif_from_info(net, info); if (IS_ERR(soft_iface)) return PTR_ERR(soft_iface); bat_priv = netdev_priv(soft_iface); info->user_ptr[0] = bat_priv; } if (ops->internal_flags & BATADV_FLAG_NEED_HARDIF) { hard_iface = batadv_get_hardif_from_info(bat_priv, net, info); if (IS_ERR(hard_iface)) { ret = PTR_ERR(hard_iface); goto err_put_softif; } info->user_ptr[1] = hard_iface; } if (ops->internal_flags & BATADV_FLAG_NEED_VLAN) { vlan = batadv_get_vlan_from_info(bat_priv, net, info); if (IS_ERR(vlan)) { ret = PTR_ERR(vlan); goto err_put_softif; } info->user_ptr[1] = vlan; } return 0; err_put_softif: if (bat_priv) dev_put(bat_priv->soft_iface); return ret; } /** * batadv_post_doit() - End batman-adv genl doit request * @ops: requested netlink operation * @skb: Netlink message with request data * @info: receiver information */ static void batadv_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info) { struct batadv_hard_iface *hard_iface; struct batadv_softif_vlan *vlan; struct batadv_priv *bat_priv; if (ops->internal_flags & BATADV_FLAG_NEED_HARDIF && info->user_ptr[1]) { hard_iface = info->user_ptr[1]; batadv_hardif_put(hard_iface); } if (ops->internal_flags & BATADV_FLAG_NEED_VLAN && info->user_ptr[1]) { vlan = info->user_ptr[1]; batadv_softif_vlan_put(vlan); } if (ops->internal_flags & BATADV_FLAG_NEED_MESH && info->user_ptr[0]) { bat_priv = info->user_ptr[0]; dev_put(bat_priv->soft_iface); } } static const struct genl_small_ops batadv_netlink_ops[] = { { .cmd = BATADV_CMD_GET_MESH, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, /* can be retrieved by unprivileged users */ .doit = batadv_netlink_get_mesh, .internal_flags = BATADV_FLAG_NEED_MESH, }, { .cmd = BATADV_CMD_TP_METER, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .doit = batadv_netlink_tp_meter_start, .internal_flags = BATADV_FLAG_NEED_MESH, }, { .cmd = BATADV_CMD_TP_METER_CANCEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .doit = batadv_netlink_tp_meter_cancel, .internal_flags = BATADV_FLAG_NEED_MESH, }, { .cmd = BATADV_CMD_GET_ROUTING_ALGOS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_algo_dump, }, { .cmd = BATADV_CMD_GET_HARDIF, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, /* can be retrieved by unprivileged users */ .dumpit = batadv_netlink_dump_hardif, .doit = batadv_netlink_get_hardif, .internal_flags = BATADV_FLAG_NEED_MESH | BATADV_FLAG_NEED_HARDIF, }, { .cmd = BATADV_CMD_GET_TRANSTABLE_LOCAL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_tt_local_dump, }, { .cmd = BATADV_CMD_GET_TRANSTABLE_GLOBAL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_tt_global_dump, }, { .cmd = BATADV_CMD_GET_ORIGINATORS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_orig_dump, }, { .cmd = BATADV_CMD_GET_NEIGHBORS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_hardif_neigh_dump, }, { .cmd = BATADV_CMD_GET_GATEWAYS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_gw_dump, }, { .cmd = BATADV_CMD_GET_BLA_CLAIM, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_bla_claim_dump, }, { .cmd = BATADV_CMD_GET_BLA_BACKBONE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_bla_backbone_dump, }, { .cmd = BATADV_CMD_GET_DAT_CACHE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_dat_cache_dump, }, { .cmd = BATADV_CMD_GET_MCAST_FLAGS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .dumpit = batadv_mcast_flags_dump, }, { .cmd = BATADV_CMD_SET_MESH, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .doit = batadv_netlink_set_mesh, .internal_flags = BATADV_FLAG_NEED_MESH, }, { .cmd = BATADV_CMD_SET_HARDIF, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .doit = batadv_netlink_set_hardif, .internal_flags = BATADV_FLAG_NEED_MESH | BATADV_FLAG_NEED_HARDIF, }, { .cmd = BATADV_CMD_GET_VLAN, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, /* can be retrieved by unprivileged users */ .doit = batadv_netlink_get_vlan, .internal_flags = BATADV_FLAG_NEED_MESH | BATADV_FLAG_NEED_VLAN, }, { .cmd = BATADV_CMD_SET_VLAN, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .flags = GENL_UNS_ADMIN_PERM, .doit = batadv_netlink_set_vlan, .internal_flags = BATADV_FLAG_NEED_MESH | BATADV_FLAG_NEED_VLAN, }, }; struct genl_family batadv_netlink_family __ro_after_init = { .hdrsize = 0, .name = BATADV_NL_NAME, .version = 1, .maxattr = BATADV_ATTR_MAX, .policy = batadv_netlink_policy, .netnsok = true, .pre_doit = batadv_pre_doit, .post_doit = batadv_post_doit, .module = THIS_MODULE, .small_ops = batadv_netlink_ops, .n_small_ops = ARRAY_SIZE(batadv_netlink_ops), .resv_start_op = BATADV_CMD_SET_VLAN + 1, .mcgrps = batadv_netlink_mcgrps, .n_mcgrps = ARRAY_SIZE(batadv_netlink_mcgrps), }; /** * batadv_netlink_register() - register batadv genl netlink family */ void __init batadv_netlink_register(void) { int ret; ret = genl_register_family(&batadv_netlink_family); if (ret) pr_warn("unable to register netlink family"); } /** * batadv_netlink_unregister() - unregister batadv genl netlink family */ void batadv_netlink_unregister(void) { genl_unregister_family(&batadv_netlink_family); } |
34 32 34 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | // SPDX-License-Identifier: GPL-2.0-or-later /* mpihelp-mul_1.c - MPI helper functions * Copyright (C) 1994, 1996, 1997, 1998, 2001 Free Software Foundation, Inc. * * This file is part of GnuPG. * * Note: This code is heavily based on the GNU MP Library. * Actually it's the same code with only minor changes in the * way the data is stored; this is to support the abstraction * of an optional secure memory allocation which may be used * to avoid revealing of sensitive data due to paging etc. * The GNU MP Library itself is published under the LGPL; * however I decided to publish this code under the plain GPL. */ #include "mpi-internal.h" #include "longlong.h" mpi_limb_t mpihelp_mul_1(mpi_ptr_t res_ptr, mpi_ptr_t s1_ptr, mpi_size_t s1_size, mpi_limb_t s2_limb) { mpi_limb_t cy_limb; mpi_size_t j; mpi_limb_t prod_high, prod_low; /* The loop counter and index J goes from -S1_SIZE to -1. This way * the loop becomes faster. */ j = -s1_size; /* Offset the base pointers to compensate for the negative indices. */ s1_ptr -= j; res_ptr -= j; cy_limb = 0; do { umul_ppmm(prod_high, prod_low, s1_ptr[j], s2_limb); prod_low += cy_limb; cy_limb = (prod_low < cy_limb ? 1 : 0) + prod_high; res_ptr[j] = prod_low; } while (++j); return cy_limb; } |
3 3 1 1 3 3 9 9 5 4 5 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 | // SPDX-License-Identifier: GPL-2.0-only /* * scsi_pm.c Copyright (C) 2010 Alan Stern * * SCSI dynamic Power Management * Initial version: Alan Stern <stern@rowland.harvard.edu> */ #include <linux/pm_runtime.h> #include <linux/export.h> #include <linux/blk-pm.h> #include <scsi/scsi.h> #include <scsi/scsi_device.h> #include <scsi/scsi_driver.h> #include <scsi/scsi_host.h> #include "scsi_priv.h" #ifdef CONFIG_PM_SLEEP static int do_scsi_suspend(struct device *dev, const struct dev_pm_ops *pm) { return pm && pm->suspend ? pm->suspend(dev) : 0; } static int do_scsi_freeze(struct device *dev, const struct dev_pm_ops *pm) { return pm && pm->freeze ? pm->freeze(dev) : 0; } static int do_scsi_poweroff(struct device *dev, const struct dev_pm_ops *pm) { return pm && pm->poweroff ? pm->poweroff(dev) : 0; } static int do_scsi_resume(struct device *dev, const struct dev_pm_ops *pm) { return pm && pm->resume ? pm->resume(dev) : 0; } static int do_scsi_thaw(struct device *dev, const struct dev_pm_ops *pm) { return pm && pm->thaw ? pm->thaw(dev) : 0; } static int do_scsi_restore(struct device *dev, const struct dev_pm_ops *pm) { return pm && pm->restore ? pm->restore(dev) : 0; } static int scsi_dev_type_suspend(struct device *dev, int (*cb)(struct device *, const struct dev_pm_ops *)) { const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; int err; err = scsi_device_quiesce(to_scsi_device(dev)); if (err == 0) { err = cb(dev, pm); if (err) scsi_device_resume(to_scsi_device(dev)); } dev_dbg(dev, "scsi suspend: %d\n", err); return err; } static int scsi_bus_suspend_common(struct device *dev, int (*cb)(struct device *, const struct dev_pm_ops *)) { if (!scsi_is_sdev_device(dev)) return 0; return scsi_dev_type_suspend(dev, cb); } static int scsi_bus_resume_common(struct device *dev, int (*cb)(struct device *, const struct dev_pm_ops *)) { const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; int err; if (!scsi_is_sdev_device(dev)) return 0; err = cb(dev, pm); scsi_device_resume(to_scsi_device(dev)); dev_dbg(dev, "scsi resume: %d\n", err); return err; } static int scsi_bus_prepare(struct device *dev) { if (scsi_is_host_device(dev)) { /* Wait until async scanning is finished */ scsi_complete_async_scans(); } return 0; } static int scsi_bus_suspend(struct device *dev) { return scsi_bus_suspend_common(dev, do_scsi_suspend); } static int scsi_bus_resume(struct device *dev) { return scsi_bus_resume_common(dev, do_scsi_resume); } static int scsi_bus_freeze(struct device *dev) { return scsi_bus_suspend_common(dev, do_scsi_freeze); } static int scsi_bus_thaw(struct device *dev) { return scsi_bus_resume_common(dev, do_scsi_thaw); } static int scsi_bus_poweroff(struct device *dev) { return scsi_bus_suspend_common(dev, do_scsi_poweroff); } static int scsi_bus_restore(struct device *dev) { return scsi_bus_resume_common(dev, do_scsi_restore); } #else /* CONFIG_PM_SLEEP */ #define scsi_bus_prepare NULL #define scsi_bus_suspend NULL #define scsi_bus_resume NULL #define scsi_bus_freeze NULL #define scsi_bus_thaw NULL #define scsi_bus_poweroff NULL #define scsi_bus_restore NULL #endif /* CONFIG_PM_SLEEP */ static int sdev_runtime_suspend(struct device *dev) { const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; struct scsi_device *sdev = to_scsi_device(dev); int err = 0; err = blk_pre_runtime_suspend(sdev->request_queue); if (err) return err; if (pm && pm->runtime_suspend) err = pm->runtime_suspend(dev); blk_post_runtime_suspend(sdev->request_queue, err); return err; } static int scsi_runtime_suspend(struct device *dev) { int err = 0; dev_dbg(dev, "scsi_runtime_suspend\n"); if (scsi_is_sdev_device(dev)) err = sdev_runtime_suspend(dev); /* Insert hooks here for targets, hosts, and transport classes */ return err; } static int sdev_runtime_resume(struct device *dev) { struct scsi_device *sdev = to_scsi_device(dev); const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; int err = 0; blk_pre_runtime_resume(sdev->request_queue); if (pm && pm->runtime_resume) err = pm->runtime_resume(dev); blk_post_runtime_resume(sdev->request_queue); return err; } static int scsi_runtime_resume(struct device *dev) { int err = 0; dev_dbg(dev, "scsi_runtime_resume\n"); if (scsi_is_sdev_device(dev)) err = sdev_runtime_resume(dev); /* Insert hooks here for targets, hosts, and transport classes */ return err; } static int scsi_runtime_idle(struct device *dev) { dev_dbg(dev, "scsi_runtime_idle\n"); /* Insert hooks here for targets, hosts, and transport classes */ if (scsi_is_sdev_device(dev)) { pm_runtime_mark_last_busy(dev); pm_runtime_autosuspend(dev); return -EBUSY; } return 0; } int scsi_autopm_get_device(struct scsi_device *sdev) { int err; err = pm_runtime_get_sync(&sdev->sdev_gendev); if (err < 0 && err !=-EACCES) pm_runtime_put_sync(&sdev->sdev_gendev); else err = 0; return err; } EXPORT_SYMBOL_GPL(scsi_autopm_get_device); void scsi_autopm_put_device(struct scsi_device *sdev) { pm_runtime_put_sync(&sdev->sdev_gendev); } EXPORT_SYMBOL_GPL(scsi_autopm_put_device); void scsi_autopm_get_target(struct scsi_target *starget) { pm_runtime_get_sync(&starget->dev); } void scsi_autopm_put_target(struct scsi_target *starget) { pm_runtime_put_sync(&starget->dev); } int scsi_autopm_get_host(struct Scsi_Host *shost) { int err; err = pm_runtime_get_sync(&shost->shost_gendev); if (err < 0 && err !=-EACCES) pm_runtime_put_sync(&shost->shost_gendev); else err = 0; return err; } void scsi_autopm_put_host(struct Scsi_Host *shost) { pm_runtime_put_sync(&shost->shost_gendev); } const struct dev_pm_ops scsi_bus_pm_ops = { .prepare = scsi_bus_prepare, .suspend = scsi_bus_suspend, .resume = scsi_bus_resume, .freeze = scsi_bus_freeze, .thaw = scsi_bus_thaw, .poweroff = scsi_bus_poweroff, .restore = scsi_bus_restore, .runtime_suspend = scsi_runtime_suspend, .runtime_resume = scsi_runtime_resume, .runtime_idle = scsi_runtime_idle, }; |
69 60 8 4 72 3 4 3 3 3 1 4 3 3 1 2 3 1 1 1 1 2 1 4 2 8 1 2 1 7 9 3 2 5 1 1 9 2 3 8 35 35 1 15 7 8 7 8 7 8 4 24 5 4 4 1 4 1 1 2 10 1 10 33 1 3 2 4 27 34 8 2 2 20 3 6 1 10 1 4 12 2 14 14 7 4 3 2 5 13 12 9 9 1 11 14 4 11 1 7 2 1 1 1 2 3 2 1 1 1 1 1 1 1 3 3 3 11 11 3 8 1 8 8 11 1 4 5 11 6 6 10 11 8 11 6 6 4 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 | // SPDX-License-Identifier: GPL-2.0-or-later /* * INET An implementation of the TCP/IP protocol suite for the LINUX * operating system. INET is implemented using the BSD Socket * interface as the means of communication with the user level. * * "Ping" sockets * * Based on ipv4/udp.c code. * * Authors: Vasiliy Kulikov / Openwall (for Linux 2.6), * Pavel Kankovsky (for Linux 2.4.32) * * Pavel gave all rights to bugs to Vasiliy, * none of the bugs are Pavel's now. */ #include <linux/uaccess.h> #include <linux/types.h> #include <linux/fcntl.h> #include <linux/socket.h> #include <linux/sockios.h> #include <linux/in.h> #include <linux/errno.h> #include <linux/timer.h> #include <linux/mm.h> #include <linux/inet.h> #include <linux/netdevice.h> #include <net/snmp.h> #include <net/ip.h> #include <net/icmp.h> #include <net/protocol.h> #include <linux/skbuff.h> #include <linux/proc_fs.h> #include <linux/export.h> #include <linux/bpf-cgroup.h> #include <net/sock.h> #include <net/ping.h> #include <net/udp.h> #include <net/route.h> #include <net/inet_common.h> #include <net/checksum.h> #if IS_ENABLED(CONFIG_IPV6) #include <linux/in6.h> #include <linux/icmpv6.h> #include <net/addrconf.h> #include <net/ipv6.h> #include <net/transp_v6.h> #endif struct ping_table { struct hlist_head hash[PING_HTABLE_SIZE]; spinlock_t lock; }; static struct ping_table ping_table; struct pingv6_ops pingv6_ops; EXPORT_SYMBOL_GPL(pingv6_ops); static u16 ping_port_rover; static inline u32 ping_hashfn(const struct net *net, u32 num, u32 mask) { u32 res = (num + net_hash_mix(net)) & mask; pr_debug("hash(%u) = %u\n", num, res); return res; } EXPORT_SYMBOL_GPL(ping_hash); static inline struct hlist_head *ping_hashslot(struct ping_table *table, struct net *net, unsigned int num) { return &table->hash[ping_hashfn(net, num, PING_HTABLE_MASK)]; } int ping_get_port(struct sock *sk, unsigned short ident) { struct inet_sock *isk, *isk2; struct hlist_head *hlist; struct sock *sk2 = NULL; isk = inet_sk(sk); spin_lock(&ping_table.lock); if (ident == 0) { u32 i; u16 result = ping_port_rover + 1; for (i = 0; i < (1L << 16); i++, result++) { if (!result) result++; /* avoid zero */ hlist = ping_hashslot(&ping_table, sock_net(sk), result); sk_for_each(sk2, hlist) { isk2 = inet_sk(sk2); if (isk2->inet_num == result) goto next_port; } /* found */ ping_port_rover = ident = result; break; next_port: ; } if (i >= (1L << 16)) goto fail; } else { hlist = ping_hashslot(&ping_table, sock_net(sk), ident); sk_for_each(sk2, hlist) { isk2 = inet_sk(sk2); /* BUG? Why is this reuse and not reuseaddr? ping.c * doesn't turn off SO_REUSEADDR, and it doesn't expect * that other ping processes can steal its packets. */ if ((isk2->inet_num == ident) && (sk2 != sk) && (!sk2->sk_reuse || !sk->sk_reuse)) goto fail; } } pr_debug("found port/ident = %d\n", ident); isk->inet_num = ident; if (sk_unhashed(sk)) { pr_debug("was not hashed\n"); sk_add_node_rcu(sk, hlist); sock_set_flag(sk, SOCK_RCU_FREE); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); } spin_unlock(&ping_table.lock); return 0; fail: spin_unlock(&ping_table.lock); return -EADDRINUSE; } EXPORT_SYMBOL_GPL(ping_get_port); int ping_hash(struct sock *sk) { pr_debug("ping_hash(sk->port=%u)\n", inet_sk(sk)->inet_num); BUG(); /* "Please do not press this button again." */ return 0; } void ping_unhash(struct sock *sk) { struct inet_sock *isk = inet_sk(sk); pr_debug("ping_unhash(isk=%p,isk->num=%u)\n", isk, isk->inet_num); spin_lock(&ping_table.lock); if (sk_del_node_init_rcu(sk)) { isk->inet_num = 0; isk->inet_sport = 0; sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); } spin_unlock(&ping_table.lock); } EXPORT_SYMBOL_GPL(ping_unhash); /* Called under rcu_read_lock() */ static struct sock *ping_lookup(struct net *net, struct sk_buff *skb, u16 ident) { struct hlist_head *hslot = ping_hashslot(&ping_table, net, ident); struct sock *sk = NULL; struct inet_sock *isk; int dif, sdif; if (skb->protocol == htons(ETH_P_IP)) { dif = inet_iif(skb); sdif = inet_sdif(skb); pr_debug("try to find: num = %d, daddr = %pI4, dif = %d\n", (int)ident, &ip_hdr(skb)->daddr, dif); #if IS_ENABLED(CONFIG_IPV6) } else if (skb->protocol == htons(ETH_P_IPV6)) { dif = inet6_iif(skb); sdif = inet6_sdif(skb); pr_debug("try to find: num = %d, daddr = %pI6c, dif = %d\n", (int)ident, &ipv6_hdr(skb)->daddr, dif); #endif } else { return NULL; } sk_for_each_rcu(sk, hslot) { isk = inet_sk(sk); pr_debug("iterate\n"); if (isk->inet_num != ident) continue; if (skb->protocol == htons(ETH_P_IP) && sk->sk_family == AF_INET) { pr_debug("found: %p: num=%d, daddr=%pI4, dif=%d\n", sk, (int) isk->inet_num, &isk->inet_rcv_saddr, sk->sk_bound_dev_if); if (isk->inet_rcv_saddr && isk->inet_rcv_saddr != ip_hdr(skb)->daddr) continue; #if IS_ENABLED(CONFIG_IPV6) } else if (skb->protocol == htons(ETH_P_IPV6) && sk->sk_family == AF_INET6) { pr_debug("found: %p: num=%d, daddr=%pI6c, dif=%d\n", sk, (int) isk->inet_num, &sk->sk_v6_rcv_saddr, sk->sk_bound_dev_if); if (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) && !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, &ipv6_hdr(skb)->daddr)) continue; #endif } else { continue; } if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif && sk->sk_bound_dev_if != sdif) continue; goto exit; } sk = NULL; exit: return sk; } static void inet_get_ping_group_range_net(struct net *net, kgid_t *low, kgid_t *high) { kgid_t *data = net->ipv4.ping_group_range.range; unsigned int seq; do { seq = read_seqbegin(&net->ipv4.ping_group_range.lock); *low = data[0]; *high = data[1]; } while (read_seqretry(&net->ipv4.ping_group_range.lock, seq)); } int ping_init_sock(struct sock *sk) { struct net *net = sock_net(sk); kgid_t group = current_egid(); struct group_info *group_info; int i; kgid_t low, high; int ret = 0; if (sk->sk_family == AF_INET6) sk->sk_ipv6only = 1; inet_get_ping_group_range_net(net, &low, &high); if (gid_lte(low, group) && gid_lte(group, high)) return 0; group_info = get_current_groups(); for (i = 0; i < group_info->ngroups; i++) { kgid_t gid = group_info->gid[i]; if (gid_lte(low, gid) && gid_lte(gid, high)) goto out_release_group; } ret = -EACCES; out_release_group: put_group_info(group_info); return ret; } EXPORT_SYMBOL_GPL(ping_init_sock); void ping_close(struct sock *sk, long timeout) { pr_debug("ping_close(sk=%p,sk->num=%u)\n", inet_sk(sk), inet_sk(sk)->inet_num); pr_debug("isk->refcnt = %d\n", refcount_read(&sk->sk_refcnt)); sk_common_release(sk); } EXPORT_SYMBOL_GPL(ping_close); static int ping_pre_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { /* This check is replicated from __ip4_datagram_connect() and * intended to prevent BPF program called below from accessing bytes * that are out of the bound specified by user in addr_len. */ if (addr_len < sizeof(struct sockaddr_in)) return -EINVAL; return BPF_CGROUP_RUN_PROG_INET4_CONNECT_LOCK(sk, uaddr, &addr_len); } /* Checks the bind address and possibly modifies sk->sk_bound_dev_if. */ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk, struct sockaddr *uaddr, int addr_len) { struct net *net = sock_net(sk); if (sk->sk_family == AF_INET) { struct sockaddr_in *addr = (struct sockaddr_in *) uaddr; u32 tb_id = RT_TABLE_LOCAL; int chk_addr_ret; if (addr_len < sizeof(*addr)) return -EINVAL; if (addr->sin_family != AF_INET && !(addr->sin_family == AF_UNSPEC && addr->sin_addr.s_addr == htonl(INADDR_ANY))) return -EAFNOSUPPORT; pr_debug("ping_check_bind_addr(sk=%p,addr=%pI4,port=%d)\n", sk, &addr->sin_addr.s_addr, ntohs(addr->sin_port)); if (addr->sin_addr.s_addr == htonl(INADDR_ANY)) return 0; tb_id = l3mdev_fib_table_by_index(net, sk->sk_bound_dev_if) ? : tb_id; chk_addr_ret = inet_addr_type_table(net, addr->sin_addr.s_addr, tb_id); if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST || (chk_addr_ret != RTN_LOCAL && !inet_can_nonlocal_bind(net, isk))) return -EADDRNOTAVAIL; #if IS_ENABLED(CONFIG_IPV6) } else if (sk->sk_family == AF_INET6) { struct sockaddr_in6 *addr = (struct sockaddr_in6 *) uaddr; int addr_type, scoped, has_addr; struct net_device *dev = NULL; if (addr_len < sizeof(*addr)) return -EINVAL; if (addr->sin6_family != AF_INET6) return -EAFNOSUPPORT; pr_debug("ping_check_bind_addr(sk=%p,addr=%pI6c,port=%d)\n", sk, addr->sin6_addr.s6_addr, ntohs(addr->sin6_port)); addr_type = ipv6_addr_type(&addr->sin6_addr); scoped = __ipv6_addr_needs_scope_id(addr_type); if ((addr_type != IPV6_ADDR_ANY && !(addr_type & IPV6_ADDR_UNICAST)) || (scoped && !addr->sin6_scope_id)) return -EINVAL; rcu_read_lock(); if (addr->sin6_scope_id) { dev = dev_get_by_index_rcu(net, addr->sin6_scope_id); if (!dev) { rcu_read_unlock(); return -ENODEV; } } if (!dev && sk->sk_bound_dev_if) { dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); if (!dev) { rcu_read_unlock(); return -ENODEV; } } has_addr = pingv6_ops.ipv6_chk_addr(net, &addr->sin6_addr, dev, scoped); rcu_read_unlock(); if (!(ipv6_can_nonlocal_bind(net, isk) || has_addr || addr_type == IPV6_ADDR_ANY)) return -EADDRNOTAVAIL; if (scoped) sk->sk_bound_dev_if = addr->sin6_scope_id; #endif } else { return -EAFNOSUPPORT; } return 0; } static void ping_set_saddr(struct sock *sk, struct sockaddr *saddr) { if (saddr->sa_family == AF_INET) { struct inet_sock *isk = inet_sk(sk); struct sockaddr_in *addr = (struct sockaddr_in *) saddr; isk->inet_rcv_saddr = isk->inet_saddr = addr->sin_addr.s_addr; #if IS_ENABLED(CONFIG_IPV6) } else if (saddr->sa_family == AF_INET6) { struct sockaddr_in6 *addr = (struct sockaddr_in6 *) saddr; struct ipv6_pinfo *np = inet6_sk(sk); sk->sk_v6_rcv_saddr = np->saddr = addr->sin6_addr; #endif } } /* * We need our own bind because there are no privileged id's == local ports. * Moreover, we don't allow binding to multi- and broadcast addresses. */ int ping_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) { struct inet_sock *isk = inet_sk(sk); unsigned short snum; int err; int dif = sk->sk_bound_dev_if; err = ping_check_bind_addr(sk, isk, uaddr, addr_len); if (err) return err; lock_sock(sk); err = -EINVAL; if (isk->inet_num != 0) goto out; err = -EADDRINUSE; snum = ntohs(((struct sockaddr_in *)uaddr)->sin_port); if (ping_get_port(sk, snum) != 0) { /* Restore possibly modified sk->sk_bound_dev_if by ping_check_bind_addr(). */ sk->sk_bound_dev_if = dif; goto out; } ping_set_saddr(sk, uaddr); pr_debug("after bind(): num = %hu, dif = %d\n", isk->inet_num, sk->sk_bound_dev_if); err = 0; if (sk->sk_family == AF_INET && isk->inet_rcv_saddr) sk->sk_userlocks |= SOCK_BINDADDR_LOCK; #if IS_ENABLED(CONFIG_IPV6) if (sk->sk_family == AF_INET6 && !ipv6_addr_any(&sk->sk_v6_rcv_saddr)) sk->sk_userlocks |= SOCK_BINDADDR_LOCK; #endif if (snum) sk->sk_userlocks |= SOCK_BINDPORT_LOCK; isk->inet_sport = htons(isk->inet_num); isk->inet_daddr = 0; isk->inet_dport = 0; #if IS_ENABLED(CONFIG_IPV6) if (sk->sk_family == AF_INET6) memset(&sk->sk_v6_daddr, 0, sizeof(sk->sk_v6_daddr)); #endif sk_dst_reset(sk); out: release_sock(sk); pr_debug("ping_v4_bind -> %d\n", err); return err; } EXPORT_SYMBOL_GPL(ping_bind); /* * Is this a supported type of ICMP message? */ static inline int ping_supported(int family, int type, int code) { return (family == AF_INET && type == ICMP_ECHO && code == 0) || (family == AF_INET && type == ICMP_EXT_ECHO && code == 0) || (family == AF_INET6 && type == ICMPV6_ECHO_REQUEST && code == 0) || (family == AF_INET6 && type == ICMPV6_EXT_ECHO_REQUEST && code == 0); } /* * This routine is called by the ICMP module when it gets some * sort of error condition. */ void ping_err(struct sk_buff *skb, int offset, u32 info) { int family; struct icmphdr *icmph; struct inet_sock *inet_sock; int type; int code; struct net *net = dev_net(skb->dev); struct sock *sk; int harderr; int err; if (skb->protocol == htons(ETH_P_IP)) { family = AF_INET; type = icmp_hdr(skb)->type; code = icmp_hdr(skb)->code; icmph = (struct icmphdr *)(skb->data + offset); } else if (skb->protocol == htons(ETH_P_IPV6)) { family = AF_INET6; type = icmp6_hdr(skb)->icmp6_type; code = icmp6_hdr(skb)->icmp6_code; icmph = (struct icmphdr *) (skb->data + offset); } else { BUG(); } /* We assume the packet has already been checked by icmp_unreach */ if (!ping_supported(family, icmph->type, icmph->code)) return; pr_debug("ping_err(proto=0x%x,type=%d,code=%d,id=%04x,seq=%04x)\n", skb->protocol, type, code, ntohs(icmph->un.echo.id), ntohs(icmph->un.echo.sequence)); sk = ping_lookup(net, skb, ntohs(icmph->un.echo.id)); if (!sk) { pr_debug("no socket, dropping\n"); return; /* No socket for error */ } pr_debug("err on socket %p\n", sk); err = 0; harderr = 0; inet_sock = inet_sk(sk); if (skb->protocol == htons(ETH_P_IP)) { switch (type) { default: case ICMP_TIME_EXCEEDED: err = EHOSTUNREACH; break; case ICMP_SOURCE_QUENCH: /* This is not a real error but ping wants to see it. * Report it with some fake errno. */ err = EREMOTEIO; break; case ICMP_PARAMETERPROB: err = EPROTO; harderr = 1; break; case ICMP_DEST_UNREACH: if (code == ICMP_FRAG_NEEDED) { /* Path MTU discovery */ ipv4_sk_update_pmtu(skb, sk, info); if (READ_ONCE(inet_sock->pmtudisc) != IP_PMTUDISC_DONT) { err = EMSGSIZE; harderr = 1; break; } goto out; } err = EHOSTUNREACH; if (code <= NR_ICMP_UNREACH) { harderr = icmp_err_convert[code].fatal; err = icmp_err_convert[code].errno; } break; case ICMP_REDIRECT: /* See ICMP_SOURCE_QUENCH */ ipv4_sk_redirect(skb, sk); err = EREMOTEIO; break; } #if IS_ENABLED(CONFIG_IPV6) } else if (skb->protocol == htons(ETH_P_IPV6)) { harderr = pingv6_ops.icmpv6_err_convert(type, code, &err); #endif } /* * RFC1122: OK. Passes ICMP errors back to application, as per * 4.1.3.3. */ if ((family == AF_INET && !inet_test_bit(RECVERR, sk)) || (family == AF_INET6 && !inet6_test_bit(RECVERR6, sk))) { if (!harderr || sk->sk_state != TCP_ESTABLISHED) goto out; } else { if (family == AF_INET) { ip_icmp_error(sk, skb, err, 0 /* no remote port */, info, (u8 *)icmph); #if IS_ENABLED(CONFIG_IPV6) } else if (family == AF_INET6) { pingv6_ops.ipv6_icmp_error(sk, skb, err, 0, info, (u8 *)icmph); #endif } } sk->sk_err = err; sk_error_report(sk); out: return; } EXPORT_SYMBOL_GPL(ping_err); /* * Copy and checksum an ICMP Echo packet from user space into a buffer * starting from the payload. */ int ping_getfrag(void *from, char *to, int offset, int fraglen, int odd, struct sk_buff *skb) { struct pingfakehdr *pfh = from; if (!csum_and_copy_from_iter_full(to, fraglen, &pfh->wcheck, &pfh->msg->msg_iter)) return -EFAULT; #if IS_ENABLED(CONFIG_IPV6) /* For IPv6, checksum each skb as we go along, as expected by * icmpv6_push_pending_frames. For IPv4, accumulate the checksum in * wcheck, it will be finalized in ping_v4_push_pending_frames. */ if (pfh->family == AF_INET6) { skb->csum = csum_block_add(skb->csum, pfh->wcheck, odd); skb->ip_summed = CHECKSUM_NONE; pfh->wcheck = 0; } #endif return 0; } EXPORT_SYMBOL_GPL(ping_getfrag); static int ping_v4_push_pending_frames(struct sock *sk, struct pingfakehdr *pfh, struct flowi4 *fl4) { struct sk_buff *skb = skb_peek(&sk->sk_write_queue); if (!skb) return 0; pfh->wcheck = csum_partial((char *)&pfh->icmph, sizeof(struct icmphdr), pfh->wcheck); pfh->icmph.checksum = csum_fold(pfh->wcheck); memcpy(icmp_hdr(skb), &pfh->icmph, sizeof(struct icmphdr)); skb->ip_summed = CHECKSUM_NONE; return ip_push_pending_frames(sk, fl4); } int ping_common_sendmsg(int family, struct msghdr *msg, size_t len, void *user_icmph, size_t icmph_len) { u8 type, code; if (len > 0xFFFF) return -EMSGSIZE; /* Must have at least a full ICMP header. */ if (len < icmph_len) return -EINVAL; /* * Check the flags. */ /* Mirror BSD error message compatibility */ if (msg->msg_flags & MSG_OOB) return -EOPNOTSUPP; /* * Fetch the ICMP header provided by the userland. * iovec is modified! The ICMP header is consumed. */ if (memcpy_from_msg(user_icmph, msg, icmph_len)) return -EFAULT; if (family == AF_INET) { type = ((struct icmphdr *) user_icmph)->type; code = ((struct icmphdr *) user_icmph)->code; #if IS_ENABLED(CONFIG_IPV6) } else if (family == AF_INET6) { type = ((struct icmp6hdr *) user_icmph)->icmp6_type; code = ((struct icmp6hdr *) user_icmph)->icmp6_code; #endif } else { BUG(); } if (!ping_supported(family, type, code)) return -EINVAL; return 0; } EXPORT_SYMBOL_GPL(ping_common_sendmsg); static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) { struct net *net = sock_net(sk); struct flowi4 fl4; struct inet_sock *inet = inet_sk(sk); struct ipcm_cookie ipc; struct icmphdr user_icmph; struct pingfakehdr pfh; struct rtable *rt = NULL; struct ip_options_data opt_copy; int free = 0; __be32 saddr, daddr, faddr; u8 tos, scope; int err; pr_debug("ping_v4_sendmsg(sk=%p,sk->num=%u)\n", inet, inet->inet_num); err = ping_common_sendmsg(AF_INET, msg, len, &user_icmph, sizeof(user_icmph)); if (err) return err; /* * Get and verify the address. */ if (msg->msg_name) { DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name); if (msg->msg_namelen < sizeof(*usin)) return -EINVAL; if (usin->sin_family != AF_INET) return -EAFNOSUPPORT; daddr = usin->sin_addr.s_addr; /* no remote port */ } else { if (sk->sk_state != TCP_ESTABLISHED) return -EDESTADDRREQ; daddr = inet->inet_daddr; /* no remote port */ } ipcm_init_sk(&ipc, inet); if (msg->msg_controllen) { err = ip_cmsg_send(sk, msg, &ipc, false); if (unlikely(err)) { kfree(ipc.opt); return err; } if (ipc.opt) free = 1; } if (!ipc.opt) { struct ip_options_rcu *inet_opt; rcu_read_lock(); inet_opt = rcu_dereference(inet->inet_opt); if (inet_opt) { memcpy(&opt_copy, inet_opt, sizeof(*inet_opt) + inet_opt->opt.optlen); ipc.opt = &opt_copy.opt; } rcu_read_unlock(); } saddr = ipc.addr; ipc.addr = faddr = daddr; if (ipc.opt && ipc.opt->opt.srr) { if (!daddr) { err = -EINVAL; goto out_free; } faddr = ipc.opt->opt.faddr; } tos = get_rttos(&ipc, inet); scope = ip_sendmsg_scope(inet, &ipc, msg); if (ipv4_is_multicast(daddr)) { if (!ipc.oif || netif_index_is_l3_master(sock_net(sk), ipc.oif)) ipc.oif = READ_ONCE(inet->mc_index); if (!saddr) saddr = READ_ONCE(inet->mc_addr); } else if (!ipc.oif) ipc.oif = READ_ONCE(inet->uc_index); flowi4_init_output(&fl4, ipc.oif, ipc.sockc.mark, tos, scope, sk->sk_protocol, inet_sk_flowi_flags(sk), faddr, saddr, 0, 0, sk->sk_uid); fl4.fl4_icmp_type = user_icmph.type; fl4.fl4_icmp_code = user_icmph.code; security_sk_classify_flow(sk, flowi4_to_flowi_common(&fl4)); rt = ip_route_output_flow(net, &fl4, sk); if (IS_ERR(rt)) { err = PTR_ERR(rt); rt = NULL; if (err == -ENETUNREACH) IP_INC_STATS(net, IPSTATS_MIB_OUTNOROUTES); goto out; } err = -EACCES; if ((rt->rt_flags & RTCF_BROADCAST) && !sock_flag(sk, SOCK_BROADCAST)) goto out; if (msg->msg_flags & MSG_CONFIRM) goto do_confirm; back_from_confirm: if (!ipc.addr) ipc.addr = fl4.daddr; lock_sock(sk); pfh.icmph.type = user_icmph.type; /* already checked */ pfh.icmph.code = user_icmph.code; /* ditto */ pfh.icmph.checksum = 0; pfh.icmph.un.echo.id = inet->inet_sport; pfh.icmph.un.echo.sequence = user_icmph.un.echo.sequence; pfh.msg = msg; pfh.wcheck = 0; pfh.family = AF_INET; err = ip_append_data(sk, &fl4, ping_getfrag, &pfh, len, sizeof(struct icmphdr), &ipc, &rt, msg->msg_flags); if (err) ip_flush_pending_frames(sk); else err = ping_v4_push_pending_frames(sk, &pfh, &fl4); release_sock(sk); out: ip_rt_put(rt); out_free: if (free) kfree(ipc.opt); if (!err) { icmp_out_count(sock_net(sk), user_icmph.type); return len; } return err; do_confirm: if (msg->msg_flags & MSG_PROBE) dst_confirm_neigh(&rt->dst, &fl4.daddr); if (!(msg->msg_flags & MSG_PROBE) || len) goto back_from_confirm; err = 0; goto out; } int ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len) { struct inet_sock *isk = inet_sk(sk); int family = sk->sk_family; struct sk_buff *skb; int copied, err; pr_debug("ping_recvmsg(sk=%p,sk->num=%u)\n", isk, isk->inet_num); err = -EOPNOTSUPP; if (flags & MSG_OOB) goto out; if (flags & MSG_ERRQUEUE) return inet_recv_error(sk, msg, len, addr_len); skb = skb_recv_datagram(sk, flags, &err); if (!skb) goto out; copied = skb->len; if (copied > len) { msg->msg_flags |= MSG_TRUNC; copied = len; } /* Don't bother checking the checksum */ err = skb_copy_datagram_msg(skb, 0, msg, copied); if (err) goto done; sock_recv_timestamp(msg, sk, skb); /* Copy the address and add cmsg data. */ if (family == AF_INET) { DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name); if (sin) { sin->sin_family = AF_INET; sin->sin_port = 0 /* skb->h.uh->source */; sin->sin_addr.s_addr = ip_hdr(skb)->saddr; memset(sin->sin_zero, 0, sizeof(sin->sin_zero)); *addr_len = sizeof(*sin); } if (inet_cmsg_flags(isk)) ip_cmsg_recv(msg, skb); #if IS_ENABLED(CONFIG_IPV6) } else if (family == AF_INET6) { struct ipv6hdr *ip6 = ipv6_hdr(skb); DECLARE_SOCKADDR(struct sockaddr_in6 *, sin6, msg->msg_name); if (sin6) { sin6->sin6_family = AF_INET6; sin6->sin6_port = 0; sin6->sin6_addr = ip6->saddr; sin6->sin6_flowinfo = 0; if (inet6_test_bit(SNDFLOW, sk)) sin6->sin6_flowinfo = ip6_flowinfo(ip6); sin6->sin6_scope_id = ipv6_iface_scope_id(&sin6->sin6_addr, inet6_iif(skb)); *addr_len = sizeof(*sin6); } if (inet6_sk(sk)->rxopt.all) pingv6_ops.ip6_datagram_recv_common_ctl(sk, msg, skb); if (skb->protocol == htons(ETH_P_IPV6) && inet6_sk(sk)->rxopt.all) pingv6_ops.ip6_datagram_recv_specific_ctl(sk, msg, skb); else if (skb->protocol == htons(ETH_P_IP) && inet_cmsg_flags(isk)) ip_cmsg_recv(msg, skb); #endif } else { BUG(); } err = copied; done: skb_free_datagram(sk, skb); out: pr_debug("ping_recvmsg -> %d\n", err); return err; } EXPORT_SYMBOL_GPL(ping_recvmsg); static enum skb_drop_reason __ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) { enum skb_drop_reason reason; pr_debug("ping_queue_rcv_skb(sk=%p,sk->num=%d,skb=%p)\n", inet_sk(sk), inet_sk(sk)->inet_num, skb); if (sock_queue_rcv_skb_reason(sk, skb, &reason) < 0) { kfree_skb_reason(skb, reason); pr_debug("ping_queue_rcv_skb -> failed\n"); return reason; } return SKB_NOT_DROPPED_YET; } int ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) { return __ping_queue_rcv_skb(sk, skb) ? -1 : 0; } EXPORT_SYMBOL_GPL(ping_queue_rcv_skb); /* * All we need to do is get the socket. */ enum skb_drop_reason ping_rcv(struct sk_buff *skb) { enum skb_drop_reason reason = SKB_DROP_REASON_NO_SOCKET; struct sock *sk; struct net *net = dev_net(skb->dev); struct icmphdr *icmph = icmp_hdr(skb); /* We assume the packet has already been checked by icmp_rcv */ pr_debug("ping_rcv(skb=%p,id=%04x,seq=%04x)\n", skb, ntohs(icmph->un.echo.id), ntohs(icmph->un.echo.sequence)); /* Push ICMP header back */ skb_push(skb, skb->data - (u8 *)icmph); sk = ping_lookup(net, skb, ntohs(icmph->un.echo.id)); if (sk) { struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC); pr_debug("rcv on socket %p\n", sk); if (skb2) reason = __ping_queue_rcv_skb(sk, skb2); else reason = SKB_DROP_REASON_NOMEM; } if (reason) pr_debug("no socket, dropping\n"); return reason; } EXPORT_SYMBOL_GPL(ping_rcv); struct proto ping_prot = { .name = "PING", .owner = THIS_MODULE, .init = ping_init_sock, .close = ping_close, .pre_connect = ping_pre_connect, .connect = ip4_datagram_connect, .disconnect = __udp_disconnect, .setsockopt = ip_setsockopt, .getsockopt = ip_getsockopt, .sendmsg = ping_v4_sendmsg, .recvmsg = ping_recvmsg, .bind = ping_bind, .backlog_rcv = ping_queue_rcv_skb, .release_cb = ip4_datagram_release_cb, .hash = ping_hash, .unhash = ping_unhash, .get_port = ping_get_port, .put_port = ping_unhash, .obj_size = sizeof(struct inet_sock), }; EXPORT_SYMBOL(ping_prot); #ifdef CONFIG_PROC_FS static struct sock *ping_get_first(struct seq_file *seq, int start) { struct sock *sk; struct ping_iter_state *state = seq->private; struct net *net = seq_file_net(seq); for (state->bucket = start; state->bucket < PING_HTABLE_SIZE; ++state->bucket) { struct hlist_head *hslot; hslot = &ping_table.hash[state->bucket]; if (hlist_empty(hslot)) continue; sk_for_each(sk, hslot) { if (net_eq(sock_net(sk), net) && sk->sk_family == state->family) goto found; } } sk = NULL; found: return sk; } static struct sock *ping_get_next(struct seq_file *seq, struct sock *sk) { struct ping_iter_state *state = seq->private; struct net *net = seq_file_net(seq); do { sk = sk_next(sk); } while (sk && (!net_eq(sock_net(sk), net))); if (!sk) return ping_get_first(seq, state->bucket + 1); return sk; } static struct sock *ping_get_idx(struct seq_file *seq, loff_t pos) { struct sock *sk = ping_get_first(seq, 0); if (sk) while (pos && (sk = ping_get_next(seq, sk)) != NULL) --pos; return pos ? NULL : sk; } void *ping_seq_start(struct seq_file *seq, loff_t *pos, sa_family_t family) __acquires(ping_table.lock) { struct ping_iter_state *state = seq->private; state->bucket = 0; state->family = family; spin_lock(&ping_table.lock); return *pos ? ping_get_idx(seq, *pos-1) : SEQ_START_TOKEN; } EXPORT_SYMBOL_GPL(ping_seq_start); static void *ping_v4_seq_start(struct seq_file *seq, loff_t *pos) { return ping_seq_start(seq, pos, AF_INET); } void *ping_seq_next(struct seq_file *seq, void *v, loff_t *pos) { struct sock *sk; if (v == SEQ_START_TOKEN) sk = ping_get_idx(seq, 0); else sk = ping_get_next(seq, v); ++*pos; return sk; } EXPORT_SYMBOL_GPL(ping_seq_next); void ping_seq_stop(struct seq_file *seq, void *v) __releases(ping_table.lock) { spin_unlock(&ping_table.lock); } EXPORT_SYMBOL_GPL(ping_seq_stop); static void ping_v4_format_sock(struct sock *sp, struct seq_file *f, int bucket) { struct inet_sock *inet = inet_sk(sp); __be32 dest = inet->inet_daddr; __be32 src = inet->inet_rcv_saddr; __u16 destp = ntohs(inet->inet_dport); __u16 srcp = ntohs(inet->inet_sport); seq_printf(f, "%5d: %08X:%04X %08X:%04X" " %02X %08X:%08X %02X:%08lX %08X %5u %8d %lu %d %pK %u", bucket, src, srcp, dest, destp, sp->sk_state, sk_wmem_alloc_get(sp), sk_rmem_alloc_get(sp), 0, 0L, 0, from_kuid_munged(seq_user_ns(f), sock_i_uid(sp)), 0, sock_i_ino(sp), refcount_read(&sp->sk_refcnt), sp, atomic_read(&sp->sk_drops)); } static int ping_v4_seq_show(struct seq_file *seq, void *v) { seq_setwidth(seq, 127); if (v == SEQ_START_TOKEN) seq_puts(seq, " sl local_address rem_address st tx_queue " "rx_queue tr tm->when retrnsmt uid timeout " "inode ref pointer drops"); else { struct ping_iter_state *state = seq->private; ping_v4_format_sock(v, seq, state->bucket); } seq_pad(seq, '\n'); return 0; } static const struct seq_operations ping_v4_seq_ops = { .start = ping_v4_seq_start, .show = ping_v4_seq_show, .next = ping_seq_next, .stop = ping_seq_stop, }; static int __net_init ping_v4_proc_init_net(struct net *net) { if (!proc_create_net("icmp", 0444, net->proc_net, &ping_v4_seq_ops, sizeof(struct ping_iter_state))) return -ENOMEM; return 0; } static void __net_exit ping_v4_proc_exit_net(struct net *net) { remove_proc_entry("icmp", net->proc_net); } static struct pernet_operations ping_v4_net_ops = { .init = ping_v4_proc_init_net, .exit = ping_v4_proc_exit_net, }; int __init ping_proc_init(void) { return register_pernet_subsys(&ping_v4_net_ops); } void ping_proc_exit(void) { unregister_pernet_subsys(&ping_v4_net_ops); } #endif void __init ping_init(void) { int i; for (i = 0; i < PING_HTABLE_SIZE; i++) INIT_HLIST_HEAD(&ping_table.hash[i]); spin_lock_init(&ping_table.lock); } |
18 6 7804 208 963 404 3 7 5 2 733 60 15 15 213 8 95 25 122 429 225 109 25 158 190 2 250 1 24 9 2 24 21 13 56 8 60 56 1 8198 9295 47 371 476 39 9295 37 120 8197 11 8251 101 8100 4 4 4 4 77 77 90 7538 128 33 433 243 234 210 210 3 422 40 18 23 15 7651 29 20 8 64 5 7676 87 1 108 87 279 45 238 5 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 5106 5107 5108 5109 5110 5111 5112 5113 5114 5115 5116 5117 5118 5119 5120 5121 5122 5123 5124 5125 5126 5127 5128 5129 5130 5131 5132 5133 5134 5135 5136 5137 5138 5139 5140 5141 5142 5143 5144 5145 5146 5147 5148 5149 5150 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 | /* SPDX-License-Identifier: GPL-2.0-or-later */ /* * INET An implementation of the TCP/IP protocol suite for the LINUX * operating system. INET is implemented using the BSD Socket * interface as the means of communication with the user level. * * Definitions for the Interfaces handler. * * Version: @(#)dev.h 1.0.10 08/12/93 * * Authors: Ross Biro * Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG> * Corey Minyard <wf-rch!minyard@relay.EU.net> * Donald J. Becker, <becker@cesdis.gsfc.nasa.gov> * Alan Cox, <alan@lxorguk.ukuu.org.uk> * Bjorn Ekwall. <bj0rn@blox.se> * Pekka Riikonen <priikone@poseidon.pspt.fi> * * Moved to /usr/include/linux for NET3 */ #ifndef _LINUX_NETDEVICE_H #define _LINUX_NETDEVICE_H #include <linux/timer.h> #include <linux/bug.h> #include <linux/delay.h> #include <linux/atomic.h> #include <linux/prefetch.h> #include <asm/cache.h> #include <asm/byteorder.h> #include <asm/local.h> #include <linux/percpu.h> #include <linux/rculist.h> #include <linux/workqueue.h> #include <linux/dynamic_queue_limits.h> #include <net/net_namespace.h> #ifdef CONFIG_DCB #include <net/dcbnl.h> #endif #include <net/netprio_cgroup.h> #include <linux/netdev_features.h> #include <linux/neighbour.h> #include <uapi/linux/netdevice.h> #include <uapi/linux/if_bonding.h> #include <uapi/linux/pkt_cls.h> #include <uapi/linux/netdev.h> #include <linux/hashtable.h> #include <linux/rbtree.h> #include <net/net_trackers.h> #include <net/net_debug.h> #include <net/dropreason-core.h> struct netpoll_info; struct device; struct ethtool_ops; struct kernel_hwtstamp_config; struct phy_device; struct dsa_port; struct ip_tunnel_parm; struct macsec_context; struct macsec_ops; struct netdev_name_node; struct sd_flow_limit; struct sfp_bus; /* 802.11 specific */ struct wireless_dev; /* 802.15.4 specific */ struct wpan_dev; struct mpls_dev; /* UDP Tunnel offloads */ struct udp_tunnel_info; struct udp_tunnel_nic_info; struct udp_tunnel_nic; struct bpf_prog; struct xdp_buff; struct xdp_frame; struct xdp_metadata_ops; struct xdp_md; typedef u32 xdp_features_t; void synchronize_net(void); void netdev_set_default_ethtool_ops(struct net_device *dev, const struct ethtool_ops *ops); void netdev_sw_irq_coalesce_default_on(struct net_device *dev); /* Backlog congestion levels */ #define NET_RX_SUCCESS 0 /* keep 'em coming, baby */ #define NET_RX_DROP 1 /* packet dropped */ #define MAX_NEST_DEV 8 /* * Transmit return codes: transmit return codes originate from three different * namespaces: * * - qdisc return codes * - driver transmit return codes * - errno values * * Drivers are allowed to return any one of those in their hard_start_xmit() * function. Real network devices commonly used with qdiscs should only return * the driver transmit return codes though - when qdiscs are used, the actual * transmission happens asynchronously, so the value is not propagated to * higher layers. Virtual network devices transmit synchronously; in this case * the driver transmit return codes are consumed by dev_queue_xmit(), and all * others are propagated to higher layers. */ /* qdisc ->enqueue() return codes. */ #define NET_XMIT_SUCCESS 0x00 #define NET_XMIT_DROP 0x01 /* skb dropped */ #define NET_XMIT_CN 0x02 /* congestion notification */ #define NET_XMIT_MASK 0x0f /* qdisc flags in net/sch_generic.h */ /* NET_XMIT_CN is special. It does not guarantee that this packet is lost. It * indicates that the device will soon be dropping packets, or already drops * some packets of the same priority; prompting us to send less aggressively. */ #define net_xmit_eval(e) ((e) == NET_XMIT_CN ? 0 : (e)) #define net_xmit_errno(e) ((e) != NET_XMIT_CN ? -ENOBUFS : 0) /* Driver transmit return codes */ #define NETDEV_TX_MASK 0xf0 enum netdev_tx { __NETDEV_TX_MIN = INT_MIN, /* make sure enum is signed */ NETDEV_TX_OK = 0x00, /* driver took care of packet */ NETDEV_TX_BUSY = 0x10, /* driver tx path was busy*/ }; typedef enum netdev_tx netdev_tx_t; /* * Current order: NETDEV_TX_MASK > NET_XMIT_MASK >= 0 is significant; * hard_start_xmit() return < NET_XMIT_MASK means skb was consumed. */ static inline bool dev_xmit_complete(int rc) { /* * Positive cases with an skb consumed by a driver: * - successful transmission (rc == NETDEV_TX_OK) * - error while transmitting (rc < 0) * - error while queueing to a different device (rc & NET_XMIT_MASK) */ if (likely(rc < NET_XMIT_MASK)) return true; return false; } /* * Compute the worst-case header length according to the protocols * used. */ #if defined(CONFIG_HYPERV_NET) # define LL_MAX_HEADER 128 #elif defined(CONFIG_WLAN) || IS_ENABLED(CONFIG_AX25) # if defined(CONFIG_MAC80211_MESH) # define LL_MAX_HEADER 128 # else # define LL_MAX_HEADER 96 # endif #else # define LL_MAX_HEADER 32 #endif #if !IS_ENABLED(CONFIG_NET_IPIP) && !IS_ENABLED(CONFIG_NET_IPGRE) && \ !IS_ENABLED(CONFIG_IPV6_SIT) && !IS_ENABLED(CONFIG_IPV6_TUNNEL) #define MAX_HEADER LL_MAX_HEADER #else #define MAX_HEADER (LL_MAX_HEADER + 48) #endif /* * Old network device statistics. Fields are native words * (unsigned long) so they can be read and written atomically. */ #define NET_DEV_STAT(FIELD) \ union { \ unsigned long FIELD; \ atomic_long_t __##FIELD; \ } struct net_device_stats { NET_DEV_STAT(rx_packets); NET_DEV_STAT(tx_packets); NET_DEV_STAT(rx_bytes); NET_DEV_STAT(tx_bytes); NET_DEV_STAT(rx_errors); NET_DEV_STAT(tx_errors); NET_DEV_STAT(rx_dropped); NET_DEV_STAT(tx_dropped); NET_DEV_STAT(multicast); NET_DEV_STAT(collisions); NET_DEV_STAT(rx_length_errors); NET_DEV_STAT(rx_over_errors); NET_DEV_STAT(rx_crc_errors); NET_DEV_STAT(rx_frame_errors); NET_DEV_STAT(rx_fifo_errors); NET_DEV_STAT(rx_missed_errors); NET_DEV_STAT(tx_aborted_errors); NET_DEV_STAT(tx_carrier_errors); NET_DEV_STAT(tx_fifo_errors); NET_DEV_STAT(tx_heartbeat_errors); NET_DEV_STAT(tx_window_errors); NET_DEV_STAT(rx_compressed); NET_DEV_STAT(tx_compressed); }; #undef NET_DEV_STAT /* per-cpu stats, allocated on demand. * Try to fit them in a single cache line, for dev_get_stats() sake. */ struct net_device_core_stats { unsigned long rx_dropped; unsigned long tx_dropped; unsigned long rx_nohandler; unsigned long rx_otherhost_dropped; } __aligned(4 * sizeof(unsigned long)); #include <linux/cache.h> #include <linux/skbuff.h> struct neighbour; struct neigh_parms; struct sk_buff; struct netdev_hw_addr { struct list_head list; struct rb_node node; unsigned char addr[MAX_ADDR_LEN]; unsigned char type; #define NETDEV_HW_ADDR_T_LAN 1 #define NETDEV_HW_ADDR_T_SAN 2 #define NETDEV_HW_ADDR_T_UNICAST 3 #define NETDEV_HW_ADDR_T_MULTICAST 4 bool global_use; int sync_cnt; int refcount; int synced; struct rcu_head rcu_head; }; struct netdev_hw_addr_list { struct list_head list; int count; /* Auxiliary tree for faster lookup on addition and deletion */ struct rb_root tree; }; #define netdev_hw_addr_list_count(l) ((l)->count) #define netdev_hw_addr_list_empty(l) (netdev_hw_addr_list_count(l) == 0) #define netdev_hw_addr_list_for_each(ha, l) \ list_for_each_entry(ha, &(l)->list, list) #define netdev_uc_count(dev) netdev_hw_addr_list_count(&(dev)->uc) #define netdev_uc_empty(dev) netdev_hw_addr_list_empty(&(dev)->uc) #define netdev_for_each_uc_addr(ha, dev) \ netdev_hw_addr_list_for_each(ha, &(dev)->uc) #define netdev_for_each_synced_uc_addr(_ha, _dev) \ netdev_for_each_uc_addr((_ha), (_dev)) \ if ((_ha)->sync_cnt) #define netdev_mc_count(dev) netdev_hw_addr_list_count(&(dev)->mc) #define netdev_mc_empty(dev) netdev_hw_addr_list_empty(&(dev)->mc) #define netdev_for_each_mc_addr(ha, dev) \ netdev_hw_addr_list_for_each(ha, &(dev)->mc) #define netdev_for_each_synced_mc_addr(_ha, _dev) \ netdev_for_each_mc_addr((_ha), (_dev)) \ if ((_ha)->sync_cnt) struct hh_cache { unsigned int hh_len; seqlock_t hh_lock; /* cached hardware header; allow for machine alignment needs. */ #define HH_DATA_MOD 16 #define HH_DATA_OFF(__len) \ (HH_DATA_MOD - (((__len - 1) & (HH_DATA_MOD - 1)) + 1)) #define HH_DATA_ALIGN(__len) \ (((__len)+(HH_DATA_MOD-1))&~(HH_DATA_MOD - 1)) unsigned long hh_data[HH_DATA_ALIGN(LL_MAX_HEADER) / sizeof(long)]; }; /* Reserve HH_DATA_MOD byte-aligned hard_header_len, but at least that much. * Alternative is: * dev->hard_header_len ? (dev->hard_header_len + * (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1) : 0 * * We could use other alignment values, but we must maintain the * relationship HH alignment <= LL alignment. */ #define LL_RESERVED_SPACE(dev) \ ((((dev)->hard_header_len + READ_ONCE((dev)->needed_headroom)) \ & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD) #define LL_RESERVED_SPACE_EXTRA(dev,extra) \ ((((dev)->hard_header_len + READ_ONCE((dev)->needed_headroom) + (extra)) \ & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD) struct header_ops { int (*create) (struct sk_buff *skb, struct net_device *dev, unsigned short type, const void *daddr, const void *saddr, unsigned int len); int (*parse)(const struct sk_buff *skb, unsigned char *haddr); int (*cache)(const struct neighbour *neigh, struct hh_cache *hh, __be16 type); void (*cache_update)(struct hh_cache *hh, const struct net_device *dev, const unsigned char *haddr); bool (*validate)(const char *ll_header, unsigned int len); __be16 (*parse_protocol)(const struct sk_buff *skb); }; /* These flag bits are private to the generic network queueing * layer; they may not be explicitly referenced by any other * code. */ enum netdev_state_t { __LINK_STATE_START, __LINK_STATE_PRESENT, __LINK_STATE_NOCARRIER, __LINK_STATE_LINKWATCH_PENDING, __LINK_STATE_DORMANT, __LINK_STATE_TESTING, }; struct gro_list { struct list_head list; int count; }; /* * size of gro hash buckets, must less than bit number of * napi_struct::gro_bitmask */ #define GRO_HASH_BUCKETS 8 /* * Structure for NAPI scheduling similar to tasklet but with weighting */ struct napi_struct { /* The poll_list must only be managed by the entity which * changes the state of the NAPI_STATE_SCHED bit. This means * whoever atomically sets that bit can add this napi_struct * to the per-CPU poll_list, and whoever clears that bit * can remove from the list right before clearing the bit. */ struct list_head poll_list; unsigned long state; int weight; int defer_hard_irqs_count; unsigned long gro_bitmask; int (*poll)(struct napi_struct *, int); #ifdef CONFIG_NETPOLL /* CPU actively polling if netpoll is configured */ int poll_owner; #endif /* CPU on which NAPI has been scheduled for processing */ int list_owner; struct net_device *dev; struct gro_list gro_hash[GRO_HASH_BUCKETS]; struct sk_buff *skb; struct list_head rx_list; /* Pending GRO_NORMAL skbs */ int rx_count; /* length of rx_list */ unsigned int napi_id; struct hrtimer timer; struct task_struct *thread; /* control-path-only fields follow */ struct list_head dev_list; struct hlist_node napi_hash_node; int irq; }; enum { NAPI_STATE_SCHED, /* Poll is scheduled */ NAPI_STATE_MISSED, /* reschedule a napi */ NAPI_STATE_DISABLE, /* Disable pending */ NAPI_STATE_NPSVC, /* Netpoll - don't dequeue from poll_list */ NAPI_STATE_LISTED, /* NAPI added to system lists */ NAPI_STATE_NO_BUSY_POLL, /* Do not add in napi_hash, no busy polling */ NAPI_STATE_IN_BUSY_POLL, /* sk_busy_loop() owns this NAPI */ NAPI_STATE_PREFER_BUSY_POLL, /* prefer busy-polling over softirq processing*/ NAPI_STATE_THREADED, /* The poll is performed inside its own thread*/ NAPI_STATE_SCHED_THREADED, /* Napi is currently scheduled in threaded mode */ }; enum { NAPIF_STATE_SCHED = BIT(NAPI_STATE_SCHED), NAPIF_STATE_MISSED = BIT(NAPI_STATE_MISSED), NAPIF_STATE_DISABLE = BIT(NAPI_STATE_DISABLE), NAPIF_STATE_NPSVC = BIT(NAPI_STATE_NPSVC), NAPIF_STATE_LISTED = BIT(NAPI_STATE_LISTED), NAPIF_STATE_NO_BUSY_POLL = BIT(NAPI_STATE_NO_BUSY_POLL), NAPIF_STATE_IN_BUSY_POLL = BIT(NAPI_STATE_IN_BUSY_POLL), NAPIF_STATE_PREFER_BUSY_POLL = BIT(NAPI_STATE_PREFER_BUSY_POLL), NAPIF_STATE_THREADED = BIT(NAPI_STATE_THREADED), NAPIF_STATE_SCHED_THREADED = BIT(NAPI_STATE_SCHED_THREADED), }; enum gro_result { GRO_MERGED, GRO_MERGED_FREE, GRO_HELD, GRO_NORMAL, GRO_CONSUMED, }; typedef enum gro_result gro_result_t; /* * enum rx_handler_result - Possible return values for rx_handlers. * @RX_HANDLER_CONSUMED: skb was consumed by rx_handler, do not process it * further. * @RX_HANDLER_ANOTHER: Do another round in receive path. This is indicated in * case skb->dev was changed by rx_handler. * @RX_HANDLER_EXACT: Force exact delivery, no wildcard. * @RX_HANDLER_PASS: Do nothing, pass the skb as if no rx_handler was called. * * rx_handlers are functions called from inside __netif_receive_skb(), to do * special processing of the skb, prior to delivery to protocol handlers. * * Currently, a net_device can only have a single rx_handler registered. Trying * to register a second rx_handler will return -EBUSY. * * To register a rx_handler on a net_device, use netdev_rx_handler_register(). * To unregister a rx_handler on a net_device, use * netdev_rx_handler_unregister(). * * Upon return, rx_handler is expected to tell __netif_receive_skb() what to * do with the skb. * * If the rx_handler consumed the skb in some way, it should return * RX_HANDLER_CONSUMED. This is appropriate when the rx_handler arranged for * the skb to be delivered in some other way. * * If the rx_handler changed skb->dev, to divert the skb to another * net_device, it should return RX_HANDLER_ANOTHER. The rx_handler for the * new device will be called if it exists. * * If the rx_handler decides the skb should be ignored, it should return * RX_HANDLER_EXACT. The skb will only be delivered to protocol handlers that * are registered on exact device (ptype->dev == skb->dev). * * If the rx_handler didn't change skb->dev, but wants the skb to be normally * delivered, it should return RX_HANDLER_PASS. * * A device without a registered rx_handler will behave as if rx_handler * returned RX_HANDLER_PASS. */ enum rx_handler_result { RX_HANDLER_CONSUMED, RX_HANDLER_ANOTHER, RX_HANDLER_EXACT, RX_HANDLER_PASS, }; typedef enum rx_handler_result rx_handler_result_t; typedef rx_handler_result_t rx_handler_func_t(struct sk_buff **pskb); void __napi_schedule(struct napi_struct *n); void __napi_schedule_irqoff(struct napi_struct *n); static inline bool napi_disable_pending(struct napi_struct *n) { return test_bit(NAPI_STATE_DISABLE, &n->state); } static inline bool napi_prefer_busy_poll(struct napi_struct *n) { return test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state); } /** * napi_is_scheduled - test if NAPI is scheduled * @n: NAPI context * * This check is "best-effort". With no locking implemented, * a NAPI can be scheduled or terminate right after this check * and produce not precise results. * * NAPI_STATE_SCHED is an internal state, napi_is_scheduled * should not be used normally and napi_schedule should be * used instead. * * Use only if the driver really needs to check if a NAPI * is scheduled for example in the context of delayed timer * that can be skipped if a NAPI is already scheduled. * * Return True if NAPI is scheduled, False otherwise. */ static inline bool napi_is_scheduled(struct napi_struct *n) { return test_bit(NAPI_STATE_SCHED, &n->state); } bool napi_schedule_prep(struct napi_struct *n); /** * napi_schedule - schedule NAPI poll * @n: NAPI context * * Schedule NAPI poll routine to be called if it is not already * running. * Return true if we schedule a NAPI or false if not. * Refer to napi_schedule_prep() for additional reason on why * a NAPI might not be scheduled. */ static inline bool napi_schedule(struct napi_struct *n) { if (napi_schedule_prep(n)) { __napi_schedule(n); return true; } return false; } /** * napi_schedule_irqoff - schedule NAPI poll * @n: NAPI context * * Variant of napi_schedule(), assuming hard irqs are masked. */ static inline void napi_schedule_irqoff(struct napi_struct *n) { if (napi_schedule_prep(n)) __napi_schedule_irqoff(n); } /** * napi_complete_done - NAPI processing complete * @n: NAPI context * @work_done: number of packets processed * * Mark NAPI processing as complete. Should only be called if poll budget * has not been completely consumed. * Prefer over napi_complete(). * Return false if device should avoid rearming interrupts. */ bool napi_complete_done(struct napi_struct *n, int work_done); static inline bool napi_complete(struct napi_struct *n) { return napi_complete_done(n, 0); } int dev_set_threaded(struct net_device *dev, bool threaded); /** * napi_disable - prevent NAPI from scheduling * @n: NAPI context * * Stop NAPI from being scheduled on this context. * Waits till any outstanding processing completes. */ void napi_disable(struct napi_struct *n); void napi_enable(struct napi_struct *n); /** * napi_synchronize - wait until NAPI is not running * @n: NAPI context * * Wait until NAPI is done being scheduled on this context. * Waits till any outstanding processing completes but * does not disable future activations. */ static inline void napi_synchronize(const struct napi_struct *n) { if (IS_ENABLED(CONFIG_SMP)) while (test_bit(NAPI_STATE_SCHED, &n->state)) msleep(1); else barrier(); } /** * napi_if_scheduled_mark_missed - if napi is running, set the * NAPIF_STATE_MISSED * @n: NAPI context * * If napi is running, set the NAPIF_STATE_MISSED, and return true if * NAPI is scheduled. **/ static inline bool napi_if_scheduled_mark_missed(struct napi_struct *n) { unsigned long val, new; val = READ_ONCE(n->state); do { if (val & NAPIF_STATE_DISABLE) return true; if (!(val & NAPIF_STATE_SCHED)) return false; new = val | NAPIF_STATE_MISSED; } while (!try_cmpxchg(&n->state, &val, new)); return true; } enum netdev_queue_state_t { __QUEUE_STATE_DRV_XOFF, __QUEUE_STATE_STACK_XOFF, __QUEUE_STATE_FROZEN, }; #define QUEUE_STATE_DRV_XOFF (1 << __QUEUE_STATE_DRV_XOFF) #define QUEUE_STATE_STACK_XOFF (1 << __QUEUE_STATE_STACK_XOFF) #define QUEUE_STATE_FROZEN (1 << __QUEUE_STATE_FROZEN) #define QUEUE_STATE_ANY_XOFF (QUEUE_STATE_DRV_XOFF | QUEUE_STATE_STACK_XOFF) #define QUEUE_STATE_ANY_XOFF_OR_FROZEN (QUEUE_STATE_ANY_XOFF | \ QUEUE_STATE_FROZEN) #define QUEUE_STATE_DRV_XOFF_OR_FROZEN (QUEUE_STATE_DRV_XOFF | \ QUEUE_STATE_FROZEN) /* * __QUEUE_STATE_DRV_XOFF is used by drivers to stop the transmit queue. The * netif_tx_* functions below are used to manipulate this flag. The * __QUEUE_STATE_STACK_XOFF flag is used by the stack to stop the transmit * queue independently. The netif_xmit_*stopped functions below are called * to check if the queue has been stopped by the driver or stack (either * of the XOFF bits are set in the state). Drivers should not need to call * netif_xmit*stopped functions, they should only be using netif_tx_*. */ struct netdev_queue { /* * read-mostly part */ struct net_device *dev; netdevice_tracker dev_tracker; struct Qdisc __rcu *qdisc; struct Qdisc __rcu *qdisc_sleeping; #ifdef CONFIG_SYSFS struct kobject kobj; #endif #if defined(CONFIG_XPS) && defined(CONFIG_NUMA) int numa_node; #endif unsigned long tx_maxrate; /* * Number of TX timeouts for this queue * (/sys/class/net/DEV/Q/trans_timeout) */ atomic_long_t trans_timeout; /* Subordinate device that the queue has been assigned to */ struct net_device *sb_dev; #ifdef CONFIG_XDP_SOCKETS struct xsk_buff_pool *pool; #endif /* NAPI instance for the queue * Readers and writers must hold RTNL */ struct napi_struct *napi; /* * write-mostly part */ spinlock_t _xmit_lock ____cacheline_aligned_in_smp; int xmit_lock_owner; /* * Time (in jiffies) of last Tx */ unsigned long trans_start; unsigned long state; #ifdef CONFIG_BQL struct dql dql; #endif } ____cacheline_aligned_in_smp; extern int sysctl_fb_tunnels_only_for_init_net; extern int sysctl_devconf_inherit_init_net; /* * sysctl_fb_tunnels_only_for_init_net == 0 : For all netns * == 1 : For initns only * == 2 : For none. */ static inline bool net_has_fallback_tunnels(const struct net *net) { #if IS_ENABLED(CONFIG_SYSCTL) int fb_tunnels_only_for_init_net = READ_ONCE(sysctl_fb_tunnels_only_for_init_net); return !fb_tunnels_only_for_init_net || (net_eq(net, &init_net) && fb_tunnels_only_for_init_net == 1); #else return true; #endif } static inline int net_inherit_devconf(void) { #if IS_ENABLED(CONFIG_SYSCTL) return READ_ONCE(sysctl_devconf_inherit_init_net); #else return 0; #endif } static inline int netdev_queue_numa_node_read(const struct netdev_queue *q) { #if defined(CONFIG_XPS) && defined(CONFIG_NUMA) return q->numa_node; #else return NUMA_NO_NODE; #endif } static inline void netdev_queue_numa_node_write(struct netdev_queue *q, int node) { #if defined(CONFIG_XPS) && defined(CONFIG_NUMA) q->numa_node = node; #endif } #ifdef CONFIG_RFS_ACCEL bool rps_may_expire_flow(struct net_device *dev, u16 rxq_index, u32 flow_id, u16 filter_id); #endif /* XPS map type and offset of the xps map within net_device->xps_maps[]. */ enum xps_map_type { XPS_CPUS = 0, XPS_RXQS, XPS_MAPS_MAX, }; #ifdef CONFIG_XPS /* * This structure holds an XPS map which can be of variable length. The * map is an array of queues. */ struct xps_map { unsigned int len; unsigned int alloc_len; struct rcu_head rcu; u16 queues[]; }; #define XPS_MAP_SIZE(_num) (sizeof(struct xps_map) + ((_num) * sizeof(u16))) #define XPS_MIN_MAP_ALLOC ((L1_CACHE_ALIGN(offsetof(struct xps_map, queues[1])) \ - sizeof(struct xps_map)) / sizeof(u16)) /* * This structure holds all XPS maps for device. Maps are indexed by CPU. * * We keep track of the number of cpus/rxqs used when the struct is allocated, * in nr_ids. This will help not accessing out-of-bound memory. * * We keep track of the number of traffic classes used when the struct is * allocated, in num_tc. This will be used to navigate the maps, to ensure we're * not crossing its upper bound, as the original dev->num_tc can be updated in * the meantime. */ struct xps_dev_maps { struct rcu_head rcu; unsigned int nr_ids; s16 num_tc; struct xps_map __rcu *attr_map[]; /* Either CPUs map or RXQs map */ }; #define XPS_CPU_DEV_MAPS_SIZE(_tcs) (sizeof(struct xps_dev_maps) + \ (nr_cpu_ids * (_tcs) * sizeof(struct xps_map *))) #define XPS_RXQ_DEV_MAPS_SIZE(_tcs, _rxqs) (sizeof(struct xps_dev_maps) +\ (_rxqs * (_tcs) * sizeof(struct xps_map *))) #endif /* CONFIG_XPS */ #define TC_MAX_QUEUE 16 #define TC_BITMASK 15 /* HW offloaded queuing disciplines txq count and offset maps */ struct netdev_tc_txq { u16 count; u16 offset; }; #if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE) /* * This structure is to hold information about the device * configured to run FCoE protocol stack. */ struct netdev_fcoe_hbainfo { char manufacturer[64]; char serial_number[64]; char hardware_version[64]; char driver_version[64]; char optionrom_version[64]; char firmware_version[64]; char model[256]; char model_description[256]; }; #endif #define MAX_PHYS_ITEM_ID_LEN 32 /* This structure holds a unique identifier to identify some * physical item (port for example) used by a netdevice. */ struct netdev_phys_item_id { unsigned char id[MAX_PHYS_ITEM_ID_LEN]; unsigned char id_len; }; static inline bool netdev_phys_item_id_same(struct netdev_phys_item_id *a, struct netdev_phys_item_id *b) { return a->id_len == b->id_len && memcmp(a->id, b->id, a->id_len) == 0; } typedef u16 (*select_queue_fallback_t)(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); enum net_device_path_type { DEV_PATH_ETHERNET = 0, DEV_PATH_VLAN, DEV_PATH_BRIDGE, DEV_PATH_PPPOE, DEV_PATH_DSA, DEV_PATH_MTK_WDMA, }; struct net_device_path { enum net_device_path_type type; const struct net_device *dev; union { struct { u16 id; __be16 proto; u8 h_dest[ETH_ALEN]; } encap; struct { enum { DEV_PATH_BR_VLAN_KEEP, DEV_PATH_BR_VLAN_TAG, DEV_PATH_BR_VLAN_UNTAG, DEV_PATH_BR_VLAN_UNTAG_HW, } vlan_mode; u16 vlan_id; __be16 vlan_proto; } bridge; struct { int port; u16 proto; } dsa; struct { u8 wdma_idx; u8 queue; u16 wcid; u8 bss; u8 amsdu; } mtk_wdma; }; }; #define NET_DEVICE_PATH_STACK_MAX 5 #define NET_DEVICE_PATH_VLAN_MAX 2 struct net_device_path_stack { int num_paths; struct net_device_path path[NET_DEVICE_PATH_STACK_MAX]; }; struct net_device_path_ctx { const struct net_device *dev; u8 daddr[ETH_ALEN]; int num_vlans; struct { u16 id; __be16 proto; } vlan[NET_DEVICE_PATH_VLAN_MAX]; }; enum tc_setup_type { TC_QUERY_CAPS, TC_SETUP_QDISC_MQPRIO, TC_SETUP_CLSU32, TC_SETUP_CLSFLOWER, TC_SETUP_CLSMATCHALL, TC_SETUP_CLSBPF, TC_SETUP_BLOCK, TC_SETUP_QDISC_CBS, TC_SETUP_QDISC_RED, TC_SETUP_QDISC_PRIO, TC_SETUP_QDISC_MQ, TC_SETUP_QDISC_ETF, TC_SETUP_ROOT_QDISC, TC_SETUP_QDISC_GRED, TC_SETUP_QDISC_TAPRIO, TC_SETUP_FT, TC_SETUP_QDISC_ETS, TC_SETUP_QDISC_TBF, TC_SETUP_QDISC_FIFO, TC_SETUP_QDISC_HTB, TC_SETUP_ACT, }; /* These structures hold the attributes of bpf state that are being passed * to the netdevice through the bpf op. */ enum bpf_netdev_command { /* Set or clear a bpf program used in the earliest stages of packet * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee * is responsible for calling bpf_prog_put on any old progs that are * stored. In case of error, the callee need not release the new prog * reference, but on success it takes ownership and must bpf_prog_put * when it is no longer used. */ XDP_SETUP_PROG, XDP_SETUP_PROG_HW, /* BPF program for offload callbacks, invoked at program load time. */ BPF_OFFLOAD_MAP_ALLOC, BPF_OFFLOAD_MAP_FREE, XDP_SETUP_XSK_POOL, }; struct bpf_prog_offload_ops; struct netlink_ext_ack; struct xdp_umem; struct xdp_dev_bulk_queue; struct bpf_xdp_link; enum bpf_xdp_mode { XDP_MODE_SKB = 0, XDP_MODE_DRV = 1, XDP_MODE_HW = 2, __MAX_XDP_MODE }; struct bpf_xdp_entity { struct bpf_prog *prog; struct bpf_xdp_link *link; }; struct netdev_bpf { enum bpf_netdev_command command; union { /* XDP_SETUP_PROG */ struct { u32 flags; struct bpf_prog *prog; struct netlink_ext_ack *extack; }; /* BPF_OFFLOAD_MAP_ALLOC, BPF_OFFLOAD_MAP_FREE */ struct { struct bpf_offloaded_map *offmap; }; /* XDP_SETUP_XSK_POOL */ struct { struct xsk_buff_pool *pool; u16 queue_id; } xsk; }; }; /* Flags for ndo_xsk_wakeup. */ #define XDP_WAKEUP_RX (1 << 0) #define XDP_WAKEUP_TX (1 << 1) #ifdef CONFIG_XFRM_OFFLOAD struct xfrmdev_ops { int (*xdo_dev_state_add) (struct xfrm_state *x, struct netlink_ext_ack *extack); void (*xdo_dev_state_delete) (struct xfrm_state *x); void (*xdo_dev_state_free) (struct xfrm_state *x); bool (*xdo_dev_offload_ok) (struct sk_buff *skb, struct xfrm_state *x); void (*xdo_dev_state_advance_esn) (struct xfrm_state *x); void (*xdo_dev_state_update_stats) (struct xfrm_state *x); int (*xdo_dev_policy_add) (struct xfrm_policy *x, struct netlink_ext_ack *extack); void (*xdo_dev_policy_delete) (struct xfrm_policy *x); void (*xdo_dev_policy_free) (struct xfrm_policy *x); }; #endif struct dev_ifalias { struct rcu_head rcuhead; char ifalias[]; }; struct devlink; struct tlsdev_ops; struct netdev_net_notifier { struct list_head list; struct notifier_block *nb; }; /* * This structure defines the management hooks for network devices. * The following hooks can be defined; unless noted otherwise, they are * optional and can be filled with a null pointer. * * int (*ndo_init)(struct net_device *dev); * This function is called once when a network device is registered. * The network device can use this for any late stage initialization * or semantic validation. It can fail with an error code which will * be propagated back to register_netdev. * * void (*ndo_uninit)(struct net_device *dev); * This function is called when device is unregistered or when registration * fails. It is not called if init fails. * * int (*ndo_open)(struct net_device *dev); * This function is called when a network device transitions to the up * state. * * int (*ndo_stop)(struct net_device *dev); * This function is called when a network device transitions to the down * state. * * netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, * struct net_device *dev); * Called when a packet needs to be transmitted. * Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop * the queue before that can happen; it's for obsolete devices and weird * corner cases, but the stack really does a non-trivial amount * of useless work if you return NETDEV_TX_BUSY. * Required; cannot be NULL. * * netdev_features_t (*ndo_features_check)(struct sk_buff *skb, * struct net_device *dev * netdev_features_t features); * Called by core transmit path to determine if device is capable of * performing offload operations on a given packet. This is to give * the device an opportunity to implement any restrictions that cannot * be otherwise expressed by feature flags. The check is called with * the set of features that the stack has calculated and it returns * those the driver believes to be appropriate. * * u16 (*ndo_select_queue)(struct net_device *dev, struct sk_buff *skb, * struct net_device *sb_dev); * Called to decide which queue to use when device supports multiple * transmit queues. * * void (*ndo_change_rx_flags)(struct net_device *dev, int flags); * This function is called to allow device receiver to make * changes to configuration when multicast or promiscuous is enabled. * * void (*ndo_set_rx_mode)(struct net_device *dev); * This function is called device changes address list filtering. * If driver handles unicast address filtering, it should set * IFF_UNICAST_FLT in its priv_flags. * * int (*ndo_set_mac_address)(struct net_device *dev, void *addr); * This function is called when the Media Access Control address * needs to be changed. If this interface is not defined, the * MAC address can not be changed. * * int (*ndo_validate_addr)(struct net_device *dev); * Test if Media Access Control address is valid for the device. * * int (*ndo_do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); * Old-style ioctl entry point. This is used internally by the * appletalk and ieee802154 subsystems but is no longer called by * the device ioctl handler. * * int (*ndo_siocbond)(struct net_device *dev, struct ifreq *ifr, int cmd); * Used by the bonding driver for its device specific ioctls: * SIOCBONDENSLAVE, SIOCBONDRELEASE, SIOCBONDSETHWADDR, SIOCBONDCHANGEACTIVE, * SIOCBONDSLAVEINFOQUERY, and SIOCBONDINFOQUERY * * * int (*ndo_eth_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); * Called for ethernet specific ioctls: SIOCGMIIPHY, SIOCGMIIREG, * SIOCSMIIREG, SIOCSHWTSTAMP and SIOCGHWTSTAMP. * * int (*ndo_set_config)(struct net_device *dev, struct ifmap *map); * Used to set network devices bus interface parameters. This interface * is retained for legacy reasons; new devices should use the bus * interface (PCI) for low level management. * * int (*ndo_change_mtu)(struct net_device *dev, int new_mtu); * Called when a user wants to change the Maximum Transfer Unit * of a device. * * void (*ndo_tx_timeout)(struct net_device *dev, unsigned int txqueue); * Callback used when the transmitter has not made any progress * for dev->watchdog ticks. * * void (*ndo_get_stats64)(struct net_device *dev, * struct rtnl_link_stats64 *storage); * struct net_device_stats* (*ndo_get_stats)(struct net_device *dev); * Called when a user wants to get the network device usage * statistics. Drivers must do one of the following: * 1. Define @ndo_get_stats64 to fill in a zero-initialised * rtnl_link_stats64 structure passed by the caller. * 2. Define @ndo_get_stats to update a net_device_stats structure * (which should normally be dev->stats) and return a pointer to * it. The structure may be changed asynchronously only if each * field is written atomically. * 3. Update dev->stats asynchronously and atomically, and define * neither operation. * * bool (*ndo_has_offload_stats)(const struct net_device *dev, int attr_id) * Return true if this device supports offload stats of this attr_id. * * int (*ndo_get_offload_stats)(int attr_id, const struct net_device *dev, * void *attr_data) * Get statistics for offload operations by attr_id. Write it into the * attr_data pointer. * * int (*ndo_vlan_rx_add_vid)(struct net_device *dev, __be16 proto, u16 vid); * If device supports VLAN filtering this function is called when a * VLAN id is registered. * * int (*ndo_vlan_rx_kill_vid)(struct net_device *dev, __be16 proto, u16 vid); * If device supports VLAN filtering this function is called when a * VLAN id is unregistered. * * void (*ndo_poll_controller)(struct net_device *dev); * * SR-IOV management functions. * int (*ndo_set_vf_mac)(struct net_device *dev, int vf, u8* mac); * int (*ndo_set_vf_vlan)(struct net_device *dev, int vf, u16 vlan, * u8 qos, __be16 proto); * int (*ndo_set_vf_rate)(struct net_device *dev, int vf, int min_tx_rate, * int max_tx_rate); * int (*ndo_set_vf_spoofchk)(struct net_device *dev, int vf, bool setting); * int (*ndo_set_vf_trust)(struct net_device *dev, int vf, bool setting); * int (*ndo_get_vf_config)(struct net_device *dev, * int vf, struct ifla_vf_info *ivf); * int (*ndo_set_vf_link_state)(struct net_device *dev, int vf, int link_state); * int (*ndo_set_vf_port)(struct net_device *dev, int vf, * struct nlattr *port[]); * * Enable or disable the VF ability to query its RSS Redirection Table and * Hash Key. This is needed since on some devices VF share this information * with PF and querying it may introduce a theoretical security risk. * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool setting); * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb); * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type, * void *type_data); * Called to setup any 'tc' scheduler, classifier or action on @dev. * This is always called from the stack with the rtnl lock held and netif * tx queues stopped. This allows the netdevice to perform queue * management safely. * * Fiber Channel over Ethernet (FCoE) offload functions. * int (*ndo_fcoe_enable)(struct net_device *dev); * Called when the FCoE protocol stack wants to start using LLD for FCoE * so the underlying device can perform whatever needed configuration or * initialization to support acceleration of FCoE traffic. * * int (*ndo_fcoe_disable)(struct net_device *dev); * Called when the FCoE protocol stack wants to stop using LLD for FCoE * so the underlying device can perform whatever needed clean-ups to * stop supporting acceleration of FCoE traffic. * * int (*ndo_fcoe_ddp_setup)(struct net_device *dev, u16 xid, * struct scatterlist *sgl, unsigned int sgc); * Called when the FCoE Initiator wants to initialize an I/O that * is a possible candidate for Direct Data Placement (DDP). The LLD can * perform necessary setup and returns 1 to indicate the device is set up * successfully to perform DDP on this I/O, otherwise this returns 0. * * int (*ndo_fcoe_ddp_done)(struct net_device *dev, u16 xid); * Called when the FCoE Initiator/Target is done with the DDPed I/O as * indicated by the FC exchange id 'xid', so the underlying device can * clean up and reuse resources for later DDP requests. * * int (*ndo_fcoe_ddp_target)(struct net_device *dev, u16 xid, * struct scatterlist *sgl, unsigned int sgc); * Called when the FCoE Target wants to initialize an I/O that * is a possible candidate for Direct Data Placement (DDP). The LLD can * perform necessary setup and returns 1 to indicate the device is set up * successfully to perform DDP on this I/O, otherwise this returns 0. * * int (*ndo_fcoe_get_hbainfo)(struct net_device *dev, * struct netdev_fcoe_hbainfo *hbainfo); * Called when the FCoE Protocol stack wants information on the underlying * device. This information is utilized by the FCoE protocol stack to * register attributes with Fiber Channel management service as per the * FC-GS Fabric Device Management Information(FDMI) specification. * * int (*ndo_fcoe_get_wwn)(struct net_device *dev, u64 *wwn, int type); * Called when the underlying device wants to override default World Wide * Name (WWN) generation mechanism in FCoE protocol stack to pass its own * World Wide Port Name (WWPN) or World Wide Node Name (WWNN) to the FCoE * protocol stack to use. * * RFS acceleration. * int (*ndo_rx_flow_steer)(struct net_device *dev, const struct sk_buff *skb, * u16 rxq_index, u32 flow_id); * Set hardware filter for RFS. rxq_index is the target queue index; * flow_id is a flow ID to be passed to rps_may_expire_flow() later. * Return the filter ID on success, or a negative error code. * * Slave management functions (for bridge, bonding, etc). * int (*ndo_add_slave)(struct net_device *dev, struct net_device *slave_dev); * Called to make another netdev an underling. * * int (*ndo_del_slave)(struct net_device *dev, struct net_device *slave_dev); * Called to release previously enslaved netdev. * * struct net_device *(*ndo_get_xmit_slave)(struct net_device *dev, * struct sk_buff *skb, * bool all_slaves); * Get the xmit slave of master device. If all_slaves is true, function * assume all the slaves can transmit. * * Feature/offload setting functions. * netdev_features_t (*ndo_fix_features)(struct net_device *dev, * netdev_features_t features); * Adjusts the requested feature flags according to device-specific * constraints, and returns the resulting flags. Must not modify * the device state. * * int (*ndo_set_features)(struct net_device *dev, netdev_features_t features); * Called to update device configuration to new features. Passed * feature set might be less than what was returned by ndo_fix_features()). * Must return >0 or -errno if it changed dev->features itself. * * int (*ndo_fdb_add)(struct ndmsg *ndm, struct nlattr *tb[], * struct net_device *dev, * const unsigned char *addr, u16 vid, u16 flags, * struct netlink_ext_ack *extack); * Adds an FDB entry to dev for addr. * int (*ndo_fdb_del)(struct ndmsg *ndm, struct nlattr *tb[], * struct net_device *dev, * const unsigned char *addr, u16 vid) * Deletes the FDB entry from dev coresponding to addr. * int (*ndo_fdb_del_bulk)(struct nlmsghdr *nlh, struct net_device *dev, * struct netlink_ext_ack *extack); * int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb, * struct net_device *dev, struct net_device *filter_dev, * int *idx) * Used to add FDB entries to dump requests. Implementers should add * entries to skb and update idx with the number of entries. * * int (*ndo_mdb_add)(struct net_device *dev, struct nlattr *tb[], * u16 nlmsg_flags, struct netlink_ext_ack *extack); * Adds an MDB entry to dev. * int (*ndo_mdb_del)(struct net_device *dev, struct nlattr *tb[], * struct netlink_ext_ack *extack); * Deletes the MDB entry from dev. * int (*ndo_mdb_del_bulk)(struct net_device *dev, struct nlattr *tb[], * struct netlink_ext_ack *extack); * Bulk deletes MDB entries from dev. * int (*ndo_mdb_dump)(struct net_device *dev, struct sk_buff *skb, * struct netlink_callback *cb); * Dumps MDB entries from dev. The first argument (marker) in the netlink * callback is used by core rtnetlink code. * * int (*ndo_bridge_setlink)(struct net_device *dev, struct nlmsghdr *nlh, * u16 flags, struct netlink_ext_ack *extack) * int (*ndo_bridge_getlink)(struct sk_buff *skb, u32 pid, u32 seq, * struct net_device *dev, u32 filter_mask, * int nlflags) * int (*ndo_bridge_dellink)(struct net_device *dev, struct nlmsghdr *nlh, * u16 flags); * * int (*ndo_change_carrier)(struct net_device *dev, bool new_carrier); * Called to change device carrier. Soft-devices (like dummy, team, etc) * which do not represent real hardware may define this to allow their * userspace components to manage their virtual carrier state. Devices * that determine carrier state from physical hardware properties (eg * network cables) or protocol-dependent mechanisms (eg * USB_CDC_NOTIFY_NETWORK_CONNECTION) should NOT implement this function. * * int (*ndo_get_phys_port_id)(struct net_device *dev, * struct netdev_phys_item_id *ppid); * Called to get ID of physical port of this device. If driver does * not implement this, it is assumed that the hw is not able to have * multiple net devices on single physical port. * * int (*ndo_get_port_parent_id)(struct net_device *dev, * struct netdev_phys_item_id *ppid) * Called to get the parent ID of the physical port of this device. * * void* (*ndo_dfwd_add_station)(struct net_device *pdev, * struct net_device *dev) * Called by upper layer devices to accelerate switching or other * station functionality into hardware. 'pdev is the lowerdev * to use for the offload and 'dev' is the net device that will * back the offload. Returns a pointer to the private structure * the upper layer will maintain. * void (*ndo_dfwd_del_station)(struct net_device *pdev, void *priv) * Called by upper layer device to delete the station created * by 'ndo_dfwd_add_station'. 'pdev' is the net device backing * the station and priv is the structure returned by the add * operation. * int (*ndo_set_tx_maxrate)(struct net_device *dev, * int queue_index, u32 maxrate); * Called when a user wants to set a max-rate limitation of specific * TX queue. * int (*ndo_get_iflink)(const struct net_device *dev); * Called to get the iflink value of this device. * int (*ndo_fill_metadata_dst)(struct net_device *dev, struct sk_buff *skb); * This function is used to get egress tunnel information for given skb. * This is useful for retrieving outer tunnel header parameters while * sampling packet. * void (*ndo_set_rx_headroom)(struct net_device *dev, int needed_headroom); * This function is used to specify the headroom that the skb must * consider when allocation skb during packet reception. Setting * appropriate rx headroom value allows avoiding skb head copy on * forward. Setting a negative value resets the rx headroom to the * default value. * int (*ndo_bpf)(struct net_device *dev, struct netdev_bpf *bpf); * This function is used to set or query state related to XDP on the * netdevice and manage BPF offload. See definition of * enum bpf_netdev_command for details. * int (*ndo_xdp_xmit)(struct net_device *dev, int n, struct xdp_frame **xdp, * u32 flags); * This function is used to submit @n XDP packets for transmit on a * netdevice. Returns number of frames successfully transmitted, frames * that got dropped are freed/returned via xdp_return_frame(). * Returns negative number, means general error invoking ndo, meaning * no frames were xmit'ed and core-caller will free all frames. * struct net_device *(*ndo_xdp_get_xmit_slave)(struct net_device *dev, * struct xdp_buff *xdp); * Get the xmit slave of master device based on the xdp_buff. * int (*ndo_xsk_wakeup)(struct net_device *dev, u32 queue_id, u32 flags); * This function is used to wake up the softirq, ksoftirqd or kthread * responsible for sending and/or receiving packets on a specific * queue id bound to an AF_XDP socket. The flags field specifies if * only RX, only Tx, or both should be woken up using the flags * XDP_WAKEUP_RX and XDP_WAKEUP_TX. * int (*ndo_tunnel_ctl)(struct net_device *dev, struct ip_tunnel_parm *p, * int cmd); * Add, change, delete or get information on an IPv4 tunnel. * struct net_device *(*ndo_get_peer_dev)(struct net_device *dev); * If a device is paired with a peer device, return the peer instance. * The caller must be under RCU read context. * int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path); * Get the forwarding path to reach the real device from the HW destination address * ktime_t (*ndo_get_tstamp)(struct net_device *dev, * const struct skb_shared_hwtstamps *hwtstamps, * bool cycles); * Get hardware timestamp based on normal/adjustable time or free running * cycle counter. This function is required if physical clock supports a * free running cycle counter. * * int (*ndo_hwtstamp_get)(struct net_device *dev, * struct kernel_hwtstamp_config *kernel_config); * Get the currently configured hardware timestamping parameters for the * NIC device. * * int (*ndo_hwtstamp_set)(struct net_device *dev, * struct kernel_hwtstamp_config *kernel_config, * struct netlink_ext_ack *extack); * Change the hardware timestamping parameters for NIC device. */ struct net_device_ops { int (*ndo_init)(struct net_device *dev); void (*ndo_uninit)(struct net_device *dev); int (*ndo_open)(struct net_device *dev); int (*ndo_stop)(struct net_device *dev); netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev); netdev_features_t (*ndo_features_check)(struct sk_buff *skb, struct net_device *dev, netdev_features_t features); u16 (*ndo_select_queue)(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); void (*ndo_change_rx_flags)(struct net_device *dev, int flags); void (*ndo_set_rx_mode)(struct net_device *dev); int (*ndo_set_mac_address)(struct net_device *dev, void *addr); int (*ndo_validate_addr)(struct net_device *dev); int (*ndo_do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); int (*ndo_eth_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); int (*ndo_siocbond)(struct net_device *dev, struct ifreq *ifr, int cmd); int (*ndo_siocwandev)(struct net_device *dev, struct if_settings *ifs); int (*ndo_siocdevprivate)(struct net_device *dev, struct ifreq *ifr, void __user *data, int cmd); int (*ndo_set_config)(struct net_device *dev, struct ifmap *map); int (*ndo_change_mtu)(struct net_device *dev, int new_mtu); int (*ndo_neigh_setup)(struct net_device *dev, struct neigh_parms *); void (*ndo_tx_timeout) (struct net_device *dev, unsigned int txqueue); void (*ndo_get_stats64)(struct net_device *dev, struct rtnl_link_stats64 *storage); bool (*ndo_has_offload_stats)(const struct net_device *dev, int attr_id); int (*ndo_get_offload_stats)(int attr_id, const struct net_device *dev, void *attr_data); struct net_device_stats* (*ndo_get_stats)(struct net_device *dev); int (*ndo_vlan_rx_add_vid)(struct net_device *dev, __be16 proto, u16 vid); int (*ndo_vlan_rx_kill_vid)(struct net_device *dev, __be16 proto, u16 vid); #ifdef CONFIG_NET_POLL_CONTROLLER void (*ndo_poll_controller)(struct net_device *dev); int (*ndo_netpoll_setup)(struct net_device *dev, struct netpoll_info *info); void (*ndo_netpoll_cleanup)(struct net_device *dev); #endif int (*ndo_set_vf_mac)(struct net_device *dev, int queue, u8 *mac); int (*ndo_set_vf_vlan)(struct net_device *dev, int queue, u16 vlan, u8 qos, __be16 proto); int (*ndo_set_vf_rate)(struct net_device *dev, int vf, int min_tx_rate, int max_tx_rate); int (*ndo_set_vf_spoofchk)(struct net_device *dev, int vf, bool setting); int (*ndo_set_vf_trust)(struct net_device *dev, int vf, bool setting); int (*ndo_get_vf_config)(struct net_device *dev, int vf, struct ifla_vf_info *ivf); int (*ndo_set_vf_link_state)(struct net_device *dev, int vf, int link_state); int (*ndo_get_vf_stats)(struct net_device *dev, int vf, struct ifla_vf_stats *vf_stats); int (*ndo_set_vf_port)(struct net_device *dev, int vf, struct nlattr *port[]); int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb); int (*ndo_get_vf_guid)(struct net_device *dev, int vf, struct ifla_vf_guid *node_guid, struct ifla_vf_guid *port_guid); int (*ndo_set_vf_guid)(struct net_device *dev, int vf, u64 guid, int guid_type); int (*ndo_set_vf_rss_query_en)( struct net_device *dev, int vf, bool setting); int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type, void *type_data); #if IS_ENABLED(CONFIG_FCOE) int (*ndo_fcoe_enable)(struct net_device *dev); int (*ndo_fcoe_disable)(struct net_device *dev); int (*ndo_fcoe_ddp_setup)(struct net_device *dev, u16 xid, struct scatterlist *sgl, unsigned int sgc); int (*ndo_fcoe_ddp_done)(struct net_device *dev, u16 xid); int (*ndo_fcoe_ddp_target)(struct net_device *dev, u16 xid, struct scatterlist *sgl, unsigned int sgc); int (*ndo_fcoe_get_hbainfo)(struct net_device *dev, struct netdev_fcoe_hbainfo *hbainfo); #endif #if IS_ENABLED(CONFIG_LIBFCOE) #define NETDEV_FCOE_WWNN 0 #define NETDEV_FCOE_WWPN 1 int (*ndo_fcoe_get_wwn)(struct net_device *dev, u64 *wwn, int type); #endif #ifdef CONFIG_RFS_ACCEL int (*ndo_rx_flow_steer)(struct net_device *dev, const struct sk_buff *skb, u16 rxq_index, u32 flow_id); #endif int (*ndo_add_slave)(struct net_device *dev, struct net_device *slave_dev, struct netlink_ext_ack *extack); int (*ndo_del_slave)(struct net_device *dev, struct net_device *slave_dev); struct net_device* (*ndo_get_xmit_slave)(struct net_device *dev, struct sk_buff *skb, bool all_slaves); struct net_device* (*ndo_sk_get_lower_dev)(struct net_device *dev, struct sock *sk); netdev_features_t (*ndo_fix_features)(struct net_device *dev, netdev_features_t features); int (*ndo_set_features)(struct net_device *dev, netdev_features_t features); int (*ndo_neigh_construct)(struct net_device *dev, struct neighbour *n); void (*ndo_neigh_destroy)(struct net_device *dev, struct neighbour *n); int (*ndo_fdb_add)(struct ndmsg *ndm, struct nlattr *tb[], struct net_device *dev, const unsigned char *addr, u16 vid, u16 flags, struct netlink_ext_ack *extack); int (*ndo_fdb_del)(struct ndmsg *ndm, struct nlattr *tb[], struct net_device *dev, const unsigned char *addr, u16 vid, struct netlink_ext_ack *extack); int (*ndo_fdb_del_bulk)(struct nlmsghdr *nlh, struct net_device *dev, struct netlink_ext_ack *extack); int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb, struct net_device *dev, struct net_device *filter_dev, int *idx); int (*ndo_fdb_get)(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev, const unsigned char *addr, u16 vid, u32 portid, u32 seq, struct netlink_ext_ack *extack); int (*ndo_mdb_add)(struct net_device *dev, struct nlattr *tb[], u16 nlmsg_flags, struct netlink_ext_ack *extack); int (*ndo_mdb_del)(struct net_device *dev, struct nlattr *tb[], struct netlink_ext_ack *extack); int (*ndo_mdb_del_bulk)(struct net_device *dev, struct nlattr *tb[], struct netlink_ext_ack *extack); int (*ndo_mdb_dump)(struct net_device *dev, struct sk_buff *skb, struct netlink_callback *cb); int (*ndo_mdb_get)(struct net_device *dev, struct nlattr *tb[], u32 portid, u32 seq, struct netlink_ext_ack *extack); int (*ndo_bridge_setlink)(struct net_device *dev, struct nlmsghdr *nlh, u16 flags, struct netlink_ext_ack *extack); int (*ndo_bridge_getlink)(struct sk_buff *skb, u32 pid, u32 seq, struct net_device *dev, u32 filter_mask, int nlflags); int (*ndo_bridge_dellink)(struct net_device *dev, struct nlmsghdr *nlh, u16 flags); int (*ndo_change_carrier)(struct net_device *dev, bool new_carrier); int (*ndo_get_phys_port_id)(struct net_device *dev, struct netdev_phys_item_id *ppid); int (*ndo_get_port_parent_id)(struct net_device *dev, struct netdev_phys_item_id *ppid); int (*ndo_get_phys_port_name)(struct net_device *dev, char *name, size_t len); void* (*ndo_dfwd_add_station)(struct net_device *pdev, struct net_device *dev); void (*ndo_dfwd_del_station)(struct net_device *pdev, void *priv); int (*ndo_set_tx_maxrate)(struct net_device *dev, int queue_index, u32 maxrate); int (*ndo_get_iflink)(const struct net_device *dev); int (*ndo_fill_metadata_dst)(struct net_device *dev, struct sk_buff *skb); void (*ndo_set_rx_headroom)(struct net_device *dev, int needed_headroom); int (*ndo_bpf)(struct net_device *dev, struct netdev_bpf *bpf); int (*ndo_xdp_xmit)(struct net_device *dev, int n, struct xdp_frame **xdp, u32 flags); struct net_device * (*ndo_xdp_get_xmit_slave)(struct net_device *dev, struct xdp_buff *xdp); int (*ndo_xsk_wakeup)(struct net_device *dev, u32 queue_id, u32 flags); int (*ndo_tunnel_ctl)(struct net_device *dev, struct ip_tunnel_parm *p, int cmd); struct net_device * (*ndo_get_peer_dev)(struct net_device *dev); int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path); ktime_t (*ndo_get_tstamp)(struct net_device *dev, const struct skb_shared_hwtstamps *hwtstamps, bool cycles); int (*ndo_hwtstamp_get)(struct net_device *dev, struct kernel_hwtstamp_config *kernel_config); int (*ndo_hwtstamp_set)(struct net_device *dev, struct kernel_hwtstamp_config *kernel_config, struct netlink_ext_ack *extack); }; /** * enum netdev_priv_flags - &struct net_device priv_flags * * These are the &struct net_device, they are only set internally * by drivers and used in the kernel. These flags are invisible to * userspace; this means that the order of these flags can change * during any kernel release. * * You should have a pretty good reason to be extending these flags. * * @IFF_802_1Q_VLAN: 802.1Q VLAN device * @IFF_EBRIDGE: Ethernet bridging device * @IFF_BONDING: bonding master or slave * @IFF_ISATAP: ISATAP interface (RFC4214) * @IFF_WAN_HDLC: WAN HDLC device * @IFF_XMIT_DST_RELEASE: dev_hard_start_xmit() is allowed to * release skb->dst * @IFF_DONT_BRIDGE: disallow bridging this ether dev * @IFF_DISABLE_NETPOLL: disable netpoll at run-time * @IFF_MACVLAN_PORT: device used as macvlan port * @IFF_BRIDGE_PORT: device used as bridge port * @IFF_OVS_DATAPATH: device used as Open vSwitch datapath port * @IFF_TX_SKB_SHARING: The interface supports sharing skbs on transmit * @IFF_UNICAST_FLT: Supports unicast filtering * @IFF_TEAM_PORT: device used as team port * @IFF_SUPP_NOFCS: device supports sending custom FCS * @IFF_LIVE_ADDR_CHANGE: device supports hardware address * change when it's running * @IFF_MACVLAN: Macvlan device * @IFF_XMIT_DST_RELEASE_PERM: IFF_XMIT_DST_RELEASE not taking into account * underlying stacked devices * @IFF_L3MDEV_MASTER: device is an L3 master device * @IFF_NO_QUEUE: device can run without qdisc attached * @IFF_OPENVSWITCH: device is a Open vSwitch master * @IFF_L3MDEV_SLAVE: device is enslaved to an L3 master device * @IFF_TEAM: device is a team device * @IFF_RXFH_CONFIGURED: device has had Rx Flow indirection table configured * @IFF_PHONY_HEADROOM: the headroom value is controlled by an external * entity (i.e. the master device for bridged veth) * @IFF_MACSEC: device is a MACsec device * @IFF_NO_RX_HANDLER: device doesn't support the rx_handler hook * @IFF_FAILOVER: device is a failover master device * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device * @IFF_NO_ADDRCONF: prevent ipv6 addrconf * @IFF_TX_SKB_NO_LINEAR: device/driver is capable of xmitting frames with * skb_headlen(skb) == 0 (data starts from frag0) * @IFF_CHANGE_PROTO_DOWN: device supports setting carrier via IFLA_PROTO_DOWN * @IFF_SEE_ALL_HWTSTAMP_REQUESTS: device wants to see calls to * ndo_hwtstamp_set() for all timestamp requests regardless of source, * even if those aren't HWTSTAMP_SOURCE_NETDEV. */ enum netdev_priv_flags { IFF_802_1Q_VLAN = 1<<0, IFF_EBRIDGE = 1<<1, IFF_BONDING = 1<<2, IFF_ISATAP = 1<<3, IFF_WAN_HDLC = 1<<4, IFF_XMIT_DST_RELEASE = 1<<5, IFF_DONT_BRIDGE = 1<<6, IFF_DISABLE_NETPOLL = 1<<7, IFF_MACVLAN_PORT = 1<<8, IFF_BRIDGE_PORT = 1<<9, IFF_OVS_DATAPATH = 1<<10, IFF_TX_SKB_SHARING = 1<<11, IFF_UNICAST_FLT = 1<<12, IFF_TEAM_PORT = 1<<13, IFF_SUPP_NOFCS = 1<<14, IFF_LIVE_ADDR_CHANGE = 1<<15, IFF_MACVLAN = 1<<16, IFF_XMIT_DST_RELEASE_PERM = 1<<17, IFF_L3MDEV_MASTER = 1<<18, IFF_NO_QUEUE = 1<<19, IFF_OPENVSWITCH = 1<<20, IFF_L3MDEV_SLAVE = 1<<21, IFF_TEAM = 1<<22, IFF_RXFH_CONFIGURED = 1<<23, IFF_PHONY_HEADROOM = 1<<24, IFF_MACSEC = 1<<25, IFF_NO_RX_HANDLER = 1<<26, IFF_FAILOVER = 1<<27, IFF_FAILOVER_SLAVE = 1<<28, IFF_L3MDEV_RX_HANDLER = 1<<29, IFF_NO_ADDRCONF = BIT_ULL(30), IFF_TX_SKB_NO_LINEAR = BIT_ULL(31), IFF_CHANGE_PROTO_DOWN = BIT_ULL(32), IFF_SEE_ALL_HWTSTAMP_REQUESTS = BIT_ULL(33), }; #define IFF_802_1Q_VLAN IFF_802_1Q_VLAN #define IFF_EBRIDGE IFF_EBRIDGE #define IFF_BONDING IFF_BONDING #define IFF_ISATAP IFF_ISATAP #define IFF_WAN_HDLC IFF_WAN_HDLC #define IFF_XMIT_DST_RELEASE IFF_XMIT_DST_RELEASE #define IFF_DONT_BRIDGE IFF_DONT_BRIDGE #define IFF_DISABLE_NETPOLL IFF_DISABLE_NETPOLL #define IFF_MACVLAN_PORT IFF_MACVLAN_PORT #define IFF_BRIDGE_PORT IFF_BRIDGE_PORT #define IFF_OVS_DATAPATH IFF_OVS_DATAPATH #define IFF_TX_SKB_SHARING IFF_TX_SKB_SHARING #define IFF_UNICAST_FLT IFF_UNICAST_FLT #define IFF_TEAM_PORT IFF_TEAM_PORT #define IFF_SUPP_NOFCS IFF_SUPP_NOFCS #define IFF_LIVE_ADDR_CHANGE IFF_LIVE_ADDR_CHANGE #define IFF_MACVLAN IFF_MACVLAN #define IFF_XMIT_DST_RELEASE_PERM IFF_XMIT_DST_RELEASE_PERM #define IFF_L3MDEV_MASTER IFF_L3MDEV_MASTER #define IFF_NO_QUEUE IFF_NO_QUEUE #define IFF_OPENVSWITCH IFF_OPENVSWITCH #define IFF_L3MDEV_SLAVE IFF_L3MDEV_SLAVE #define IFF_TEAM IFF_TEAM #define IFF_RXFH_CONFIGURED IFF_RXFH_CONFIGURED #define IFF_PHONY_HEADROOM IFF_PHONY_HEADROOM #define IFF_MACSEC IFF_MACSEC #define IFF_NO_RX_HANDLER IFF_NO_RX_HANDLER #define IFF_FAILOVER IFF_FAILOVER #define IFF_FAILOVER_SLAVE IFF_FAILOVER_SLAVE #define IFF_L3MDEV_RX_HANDLER IFF_L3MDEV_RX_HANDLER #define IFF_TX_SKB_NO_LINEAR IFF_TX_SKB_NO_LINEAR /* Specifies the type of the struct net_device::ml_priv pointer */ enum netdev_ml_priv_type { ML_PRIV_NONE, ML_PRIV_CAN, }; enum netdev_stat_type { NETDEV_PCPU_STAT_NONE, NETDEV_PCPU_STAT_LSTATS, /* struct pcpu_lstats */ NETDEV_PCPU_STAT_TSTATS, /* struct pcpu_sw_netstats */ NETDEV_PCPU_STAT_DSTATS, /* struct pcpu_dstats */ }; enum netdev_reg_state { NETREG_UNINITIALIZED = 0, NETREG_REGISTERED, /* completed register_netdevice */ NETREG_UNREGISTERING, /* called unregister_netdevice */ NETREG_UNREGISTERED, /* completed unregister todo */ NETREG_RELEASED, /* called free_netdev */ NETREG_DUMMY, /* dummy device for NAPI poll */ }; /** * struct net_device - The DEVICE structure. * * Actually, this whole structure is a big mistake. It mixes I/O * data with strictly "high-level" data, and it has to know about * almost every data structure used in the INET module. * * @name: This is the first field of the "visible" part of this structure * (i.e. as seen by users in the "Space.c" file). It is the name * of the interface. * * @name_node: Name hashlist node * @ifalias: SNMP alias * @mem_end: Shared memory end * @mem_start: Shared memory start * @base_addr: Device I/O address * @irq: Device IRQ number * * @state: Generic network queuing layer state, see netdev_state_t * @dev_list: The global list of network devices * @napi_list: List entry used for polling NAPI devices * @unreg_list: List entry when we are unregistering the * device; see the function unregister_netdev * @close_list: List entry used when we are closing the device * @ptype_all: Device-specific packet handlers for all protocols * @ptype_specific: Device-specific, protocol-specific packet handlers * * @adj_list: Directly linked devices, like slaves for bonding * @features: Currently active device features * @hw_features: User-changeable features * * @wanted_features: User-requested features * @vlan_features: Mask of features inheritable by VLAN devices * * @hw_enc_features: Mask of features inherited by encapsulating devices * This field indicates what encapsulation * offloads the hardware is capable of doing, * and drivers will need to set them appropriately. * * @mpls_features: Mask of features inheritable by MPLS * @gso_partial_features: value(s) from NETIF_F_GSO\* * * @ifindex: interface index * @group: The group the device belongs to * * @stats: Statistics struct, which was left as a legacy, use * rtnl_link_stats64 instead * * @core_stats: core networking counters, * do not use this in drivers * @carrier_up_count: Number of times the carrier has been up * @carrier_down_count: Number of times the carrier has been down * * @wireless_handlers: List of functions to handle Wireless Extensions, * instead of ioctl, * see <net/iw_handler.h> for details. * @wireless_data: Instance data managed by the core of wireless extensions * * @netdev_ops: Includes several pointers to callbacks, * if one wants to override the ndo_*() functions * @xdp_metadata_ops: Includes pointers to XDP metadata callbacks. * @xsk_tx_metadata_ops: Includes pointers to AF_XDP TX metadata callbacks. * @ethtool_ops: Management operations * @l3mdev_ops: Layer 3 master device operations * @ndisc_ops: Includes callbacks for different IPv6 neighbour * discovery handling. Necessary for e.g. 6LoWPAN. * @xfrmdev_ops: Transformation offload operations * @tlsdev_ops: Transport Layer Security offload operations * @header_ops: Includes callbacks for creating,parsing,caching,etc * of Layer 2 headers. * * @flags: Interface flags (a la BSD) * @xdp_features: XDP capability supported by the device * @priv_flags: Like 'flags' but invisible to userspace, * see if.h for the definitions * @gflags: Global flags ( kept as legacy ) * @padded: How much padding added by alloc_netdev() * @operstate: RFC2863 operstate * @link_mode: Mapping policy to operstate * @if_port: Selectable AUI, TP, ... * @dma: DMA channel * @mtu: Interface MTU value * @min_mtu: Interface Minimum MTU value * @max_mtu: Interface Maximum MTU value * @type: Interface hardware type * @hard_header_len: Maximum hardware header length. * @min_header_len: Minimum hardware header length * * @needed_headroom: Extra headroom the hardware may need, but not in all * cases can this be guaranteed * @needed_tailroom: Extra tailroom the hardware may need, but not in all * cases can this be guaranteed. Some cases also use * LL_MAX_HEADER instead to allocate the skb * * interface address info: * * @perm_addr: Permanent hw address * @addr_assign_type: Hw address assignment type * @addr_len: Hardware address length * @upper_level: Maximum depth level of upper devices. * @lower_level: Maximum depth level of lower devices. * @neigh_priv_len: Used in neigh_alloc() * @dev_id: Used to differentiate devices that share * the same link layer address * @dev_port: Used to differentiate devices that share * the same function * @addr_list_lock: XXX: need comments on this one * @name_assign_type: network interface name assignment type * @uc_promisc: Counter that indicates promiscuous mode * has been enabled due to the need to listen to * additional unicast addresses in a device that * does not implement ndo_set_rx_mode() * @uc: unicast mac addresses * @mc: multicast mac addresses * @dev_addrs: list of device hw addresses * @queues_kset: Group of all Kobjects in the Tx and RX queues * @promiscuity: Number of times the NIC is told to work in * promiscuous mode; if it becomes 0 the NIC will * exit promiscuous mode * @allmulti: Counter, enables or disables allmulticast mode * * @vlan_info: VLAN info * @dsa_ptr: dsa specific data * @tipc_ptr: TIPC specific data * @atalk_ptr: AppleTalk link * @ip_ptr: IPv4 specific data * @ip6_ptr: IPv6 specific data * @ax25_ptr: AX.25 specific data * @ieee80211_ptr: IEEE 802.11 specific data, assign before registering * @ieee802154_ptr: IEEE 802.15.4 low-rate Wireless Personal Area Network * device struct * @mpls_ptr: mpls_dev struct pointer * @mctp_ptr: MCTP specific data * * @dev_addr: Hw address (before bcast, * because most packets are unicast) * * @_rx: Array of RX queues * @num_rx_queues: Number of RX queues * allocated at register_netdev() time * @real_num_rx_queues: Number of RX queues currently active in device * @xdp_prog: XDP sockets filter program pointer * @gro_flush_timeout: timeout for GRO layer in NAPI * @napi_defer_hard_irqs: If not zero, provides a counter that would * allow to avoid NIC hard IRQ, on busy queues. * * @rx_handler: handler for received packets * @rx_handler_data: XXX: need comments on this one * @tcx_ingress: BPF & clsact qdisc specific data for ingress processing * @ingress_queue: XXX: need comments on this one * @nf_hooks_ingress: netfilter hooks executed for ingress packets * @broadcast: hw bcast address * * @rx_cpu_rmap: CPU reverse-mapping for RX completion interrupts, * indexed by RX queue number. Assigned by driver. * This must only be set if the ndo_rx_flow_steer * operation is defined * @index_hlist: Device index hash chain * * @_tx: Array of TX queues * @num_tx_queues: Number of TX queues allocated at alloc_netdev_mq() time * @real_num_tx_queues: Number of TX queues currently active in device * @qdisc: Root qdisc from userspace point of view * @tx_queue_len: Max frames per queue allowed * @tx_global_lock: XXX: need comments on this one * @xdp_bulkq: XDP device bulk queue * @xps_maps: all CPUs/RXQs maps for XPS device * * @xps_maps: XXX: need comments on this one * @tcx_egress: BPF & clsact qdisc specific data for egress processing * @nf_hooks_egress: netfilter hooks executed for egress packets * @qdisc_hash: qdisc hash table * @watchdog_timeo: Represents the timeout that is used by * the watchdog (see dev_watchdog()) * @watchdog_timer: List of timers * * @proto_down_reason: reason a netdev interface is held down * @pcpu_refcnt: Number of references to this device * @dev_refcnt: Number of references to this device * @refcnt_tracker: Tracker directory for tracked references to this device * @todo_list: Delayed register/unregister * @link_watch_list: XXX: need comments on this one * * @reg_state: Register/unregister state machine * @dismantle: Device is going to be freed * @rtnl_link_state: This enum represents the phases of creating * a new link * * @needs_free_netdev: Should unregister perform free_netdev? * @priv_destructor: Called from unregister * @npinfo: XXX: need comments on this one * @nd_net: Network namespace this network device is inside * * @ml_priv: Mid-layer private * @ml_priv_type: Mid-layer private type * * @pcpu_stat_type: Type of device statistics which the core should * allocate/free: none, lstats, tstats, dstats. none * means the driver is handling statistics allocation/ * freeing internally. * @lstats: Loopback statistics: packets, bytes * @tstats: Tunnel statistics: RX/TX packets, RX/TX bytes * @dstats: Dummy statistics: RX/TX/drop packets, RX/TX bytes * * @garp_port: GARP * @mrp_port: MRP * * @dm_private: Drop monitor private * * @dev: Class/net/name entry * @sysfs_groups: Space for optional device, statistics and wireless * sysfs groups * * @sysfs_rx_queue_group: Space for optional per-rx queue attributes * @rtnl_link_ops: Rtnl_link_ops * @stat_ops: Optional ops for queue-aware statistics * * @gso_max_size: Maximum size of generic segmentation offload * @tso_max_size: Device (as in HW) limit on the max TSO request size * @gso_max_segs: Maximum number of segments that can be passed to the * NIC for GSO * @tso_max_segs: Device (as in HW) limit on the max TSO segment count * @gso_ipv4_max_size: Maximum size of generic segmentation offload, * for IPv4. * * @dcbnl_ops: Data Center Bridging netlink ops * @num_tc: Number of traffic classes in the net device * @tc_to_txq: XXX: need comments on this one * @prio_tc_map: XXX: need comments on this one * * @fcoe_ddp_xid: Max exchange id for FCoE LRO by ddp * * @priomap: XXX: need comments on this one * @phydev: Physical device may attach itself * for hardware timestamping * @sfp_bus: attached &struct sfp_bus structure. * * @qdisc_tx_busylock: lockdep class annotating Qdisc->busylock spinlock * * @proto_down: protocol port state information can be sent to the * switch driver and used to set the phys state of the * switch port. * * @wol_enabled: Wake-on-LAN is enabled * * @threaded: napi threaded mode is enabled * * @net_notifier_list: List of per-net netdev notifier block * that follow this device when it is moved * to another network namespace. * * @macsec_ops: MACsec offloading ops * * @udp_tunnel_nic_info: static structure describing the UDP tunnel * offload capabilities of the device * @udp_tunnel_nic: UDP tunnel offload state * @xdp_state: stores info on attached XDP BPF programs * * @nested_level: Used as a parameter of spin_lock_nested() of * dev->addr_list_lock. * @unlink_list: As netif_addr_lock() can be called recursively, * keep a list of interfaces to be deleted. * @gro_max_size: Maximum size of aggregated packet in generic * receive offload (GRO) * @gro_ipv4_max_size: Maximum size of aggregated packet in generic * receive offload (GRO), for IPv4. * @xdp_zc_max_segs: Maximum number of segments supported by AF_XDP * zero copy driver * * @dev_addr_shadow: Copy of @dev_addr to catch direct writes. * @linkwatch_dev_tracker: refcount tracker used by linkwatch. * @watchdog_dev_tracker: refcount tracker used by watchdog. * @dev_registered_tracker: tracker for reference held while * registered * @offload_xstats_l3: L3 HW stats for this netdevice. * * @devlink_port: Pointer to related devlink port structure. * Assigned by a driver before netdev registration using * SET_NETDEV_DEVLINK_PORT macro. This pointer is static * during the time netdevice is registered. * * @dpll_pin: Pointer to the SyncE source pin of a DPLL subsystem, * where the clock is recovered. * * FIXME: cleanup struct net_device such that network protocol info * moves out. */ struct net_device { /* Cacheline organization can be found documented in * Documentation/networking/net_cachelines/net_device.rst. * Please update the document when adding new fields. */ /* TX read-mostly hotpath */ __cacheline_group_begin(net_device_read_tx); unsigned long long priv_flags; const struct net_device_ops *netdev_ops; const struct header_ops *header_ops; struct netdev_queue *_tx; netdev_features_t gso_partial_features; unsigned int real_num_tx_queues; unsigned int gso_max_size; unsigned int gso_ipv4_max_size; u16 gso_max_segs; s16 num_tc; /* Note : dev->mtu is often read without holding a lock. * Writers usually hold RTNL. * It is recommended to use READ_ONCE() to annotate the reads, * and to use WRITE_ONCE() to annotate the writes. */ unsigned int mtu; unsigned short needed_headroom; struct netdev_tc_txq tc_to_txq[TC_MAX_QUEUE]; #ifdef CONFIG_XPS struct xps_dev_maps __rcu *xps_maps[XPS_MAPS_MAX]; #endif #ifdef CONFIG_NETFILTER_EGRESS struct nf_hook_entries __rcu *nf_hooks_egress; #endif #ifdef CONFIG_NET_XGRESS struct bpf_mprog_entry __rcu *tcx_egress; #endif __cacheline_group_end(net_device_read_tx); /* TXRX read-mostly hotpath */ __cacheline_group_begin(net_device_read_txrx); union { struct pcpu_lstats __percpu *lstats; struct pcpu_sw_netstats __percpu *tstats; struct pcpu_dstats __percpu *dstats; }; unsigned int flags; unsigned short hard_header_len; netdev_features_t features; struct inet6_dev __rcu *ip6_ptr; __cacheline_group_end(net_device_read_txrx); /* RX read-mostly hotpath */ __cacheline_group_begin(net_device_read_rx); struct bpf_prog __rcu *xdp_prog; struct list_head ptype_specific; int ifindex; unsigned int real_num_rx_queues; struct netdev_rx_queue *_rx; unsigned long gro_flush_timeout; int napi_defer_hard_irqs; unsigned int gro_max_size; unsigned int gro_ipv4_max_size; rx_handler_func_t __rcu *rx_handler; void __rcu *rx_handler_data; possible_net_t nd_net; #ifdef CONFIG_NETPOLL struct netpoll_info __rcu *npinfo; #endif #ifdef CONFIG_NET_XGRESS struct bpf_mprog_entry __rcu *tcx_ingress; #endif __cacheline_group_end(net_device_read_rx); char name[IFNAMSIZ]; struct netdev_name_node *name_node; struct dev_ifalias __rcu *ifalias; /* * I/O specific fields * FIXME: Merge these and struct ifmap into one */ unsigned long mem_end; unsigned long mem_start; unsigned long base_addr; /* * Some hardware also needs these fields (state,dev_list, * napi_list,unreg_list,close_list) but they are not * part of the usual set specified in Space.c. */ unsigned long state; struct list_head dev_list; struct list_head napi_list; struct list_head unreg_list; struct list_head close_list; struct list_head ptype_all; struct { struct list_head upper; struct list_head lower; } adj_list; /* Read-mostly cache-line for fast-path access */ xdp_features_t xdp_features; const struct xdp_metadata_ops *xdp_metadata_ops; const struct xsk_tx_metadata_ops *xsk_tx_metadata_ops; unsigned short gflags; unsigned short needed_tailroom; netdev_features_t hw_features; netdev_features_t wanted_features; netdev_features_t vlan_features; netdev_features_t hw_enc_features; netdev_features_t mpls_features; unsigned int min_mtu; unsigned int max_mtu; unsigned short type; unsigned char min_header_len; unsigned char name_assign_type; int group; struct net_device_stats stats; /* not used by modern drivers */ struct net_device_core_stats __percpu *core_stats; /* Stats to monitor link on/off, flapping */ atomic_t carrier_up_count; atomic_t carrier_down_count; #ifdef CONFIG_WIRELESS_EXT const struct iw_handler_def *wireless_handlers; struct iw_public_data *wireless_data; #endif const struct ethtool_ops *ethtool_ops; #ifdef CONFIG_NET_L3_MASTER_DEV const struct l3mdev_ops *l3mdev_ops; #endif #if IS_ENABLED(CONFIG_IPV6) const struct ndisc_ops *ndisc_ops; #endif #ifdef CONFIG_XFRM_OFFLOAD const struct xfrmdev_ops *xfrmdev_ops; #endif #if IS_ENABLED(CONFIG_TLS_DEVICE) const struct tlsdev_ops *tlsdev_ops; #endif unsigned int operstate; unsigned char link_mode; unsigned char if_port; unsigned char dma; /* Interface address info. */ unsigned char perm_addr[MAX_ADDR_LEN]; unsigned char addr_assign_type; unsigned char addr_len; unsigned char upper_level; unsigned char lower_level; unsigned short neigh_priv_len; unsigned short dev_id; unsigned short dev_port; unsigned short padded; spinlock_t addr_list_lock; int irq; struct netdev_hw_addr_list uc; struct netdev_hw_addr_list mc; struct netdev_hw_addr_list dev_addrs; #ifdef CONFIG_SYSFS struct kset *queues_kset; #endif #ifdef CONFIG_LOCKDEP struct list_head unlink_list; #endif unsigned int promiscuity; unsigned int allmulti; bool uc_promisc; #ifdef CONFIG_LOCKDEP unsigned char nested_level; #endif /* Protocol-specific pointers */ struct in_device __rcu *ip_ptr; #if IS_ENABLED(CONFIG_VLAN_8021Q) struct vlan_info __rcu *vlan_info; #endif #if IS_ENABLED(CONFIG_NET_DSA) struct dsa_port *dsa_ptr; #endif #if IS_ENABLED(CONFIG_TIPC) struct tipc_bearer __rcu *tipc_ptr; #endif #if IS_ENABLED(CONFIG_ATALK) void *atalk_ptr; #endif #if IS_ENABLED(CONFIG_AX25) void *ax25_ptr; #endif #if IS_ENABLED(CONFIG_CFG80211) struct wireless_dev *ieee80211_ptr; #endif #if IS_ENABLED(CONFIG_IEEE802154) || IS_ENABLED(CONFIG_6LOWPAN) struct wpan_dev *ieee802154_ptr; #endif #if IS_ENABLED(CONFIG_MPLS_ROUTING) struct mpls_dev __rcu *mpls_ptr; #endif #if IS_ENABLED(CONFIG_MCTP) struct mctp_dev __rcu *mctp_ptr; #endif /* * Cache lines mostly used on receive path (including eth_type_trans()) */ /* Interface address info used in eth_type_trans() */ const unsigned char *dev_addr; unsigned int num_rx_queues; #define GRO_LEGACY_MAX_SIZE 65536u /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE), * and shinfo->gso_segs is a 16bit field. */ #define GRO_MAX_SIZE (8 * 65535u) unsigned int xdp_zc_max_segs; struct netdev_queue __rcu *ingress_queue; #ifdef CONFIG_NETFILTER_INGRESS struct nf_hook_entries __rcu *nf_hooks_ingress; #endif unsigned char broadcast[MAX_ADDR_LEN]; #ifdef CONFIG_RFS_ACCEL struct cpu_rmap *rx_cpu_rmap; #endif struct hlist_node index_hlist; /* * Cache lines mostly used on transmit path */ unsigned int num_tx_queues; struct Qdisc __rcu *qdisc; unsigned int tx_queue_len; spinlock_t tx_global_lock; struct xdp_dev_bulk_queue __percpu *xdp_bulkq; #ifdef CONFIG_NET_SCHED DECLARE_HASHTABLE (qdisc_hash, 4); #endif /* These may be needed for future network-power-down code. */ struct timer_list watchdog_timer; int watchdog_timeo; u32 proto_down_reason; struct list_head todo_list; #ifdef CONFIG_PCPU_DEV_REFCNT int __percpu *pcpu_refcnt; #else refcount_t dev_refcnt; #endif struct ref_tracker_dir refcnt_tracker; struct list_head link_watch_list; u8 reg_state; bool dismantle; enum { RTNL_LINK_INITIALIZED, RTNL_LINK_INITIALIZING, } rtnl_link_state:16; bool needs_free_netdev; void (*priv_destructor)(struct net_device *dev); /* mid-layer private */ void *ml_priv; enum netdev_ml_priv_type ml_priv_type; enum netdev_stat_type pcpu_stat_type:8; #if IS_ENABLED(CONFIG_GARP) struct garp_port __rcu *garp_port; #endif #if IS_ENABLED(CONFIG_MRP) struct mrp_port __rcu *mrp_port; #endif #if IS_ENABLED(CONFIG_NET_DROP_MONITOR) struct dm_hw_stat_delta __rcu *dm_private; #endif struct device dev; const struct attribute_group *sysfs_groups[4]; const struct attribute_group *sysfs_rx_queue_group; const struct rtnl_link_ops *rtnl_link_ops; const struct netdev_stat_ops *stat_ops; /* for setting kernel sock attribute on TCP connection setup */ #define GSO_MAX_SEGS 65535u #define GSO_LEGACY_MAX_SIZE 65536u /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE), * and shinfo->gso_segs is a 16bit field. */ #define GSO_MAX_SIZE (8 * GSO_MAX_SEGS) #define TSO_LEGACY_MAX_SIZE 65536 #define TSO_MAX_SIZE UINT_MAX unsigned int tso_max_size; #define TSO_MAX_SEGS U16_MAX u16 tso_max_segs; #ifdef CONFIG_DCB const struct dcbnl_rtnl_ops *dcbnl_ops; #endif u8 prio_tc_map[TC_BITMASK + 1]; #if IS_ENABLED(CONFIG_FCOE) unsigned int fcoe_ddp_xid; #endif #if IS_ENABLED(CONFIG_CGROUP_NET_PRIO) struct netprio_map __rcu *priomap; #endif struct phy_device *phydev; struct sfp_bus *sfp_bus; struct lock_class_key *qdisc_tx_busylock; bool proto_down; unsigned wol_enabled:1; unsigned threaded:1; struct list_head net_notifier_list; #if IS_ENABLED(CONFIG_MACSEC) /* MACsec management functions */ const struct macsec_ops *macsec_ops; #endif const struct udp_tunnel_nic_info *udp_tunnel_nic_info; struct udp_tunnel_nic *udp_tunnel_nic; /* protected by rtnl_lock */ struct bpf_xdp_entity xdp_state[__MAX_XDP_MODE]; u8 dev_addr_shadow[MAX_ADDR_LEN]; netdevice_tracker linkwatch_dev_tracker; netdevice_tracker watchdog_dev_tracker; netdevice_tracker dev_registered_tracker; struct rtnl_hw_stats64 *offload_xstats_l3; struct devlink_port *devlink_port; #if IS_ENABLED(CONFIG_DPLL) struct dpll_pin __rcu *dpll_pin; #endif #if IS_ENABLED(CONFIG_PAGE_POOL) /** @page_pools: page pools created for this netdevice */ struct hlist_head page_pools; #endif }; #define to_net_dev(d) container_of(d, struct net_device, dev) /* * Driver should use this to assign devlink port instance to a netdevice * before it registers the netdevice. Therefore devlink_port is static * during the netdev lifetime after it is registered. */ #define SET_NETDEV_DEVLINK_PORT(dev, port) \ ({ \ WARN_ON((dev)->reg_state != NETREG_UNINITIALIZED); \ ((dev)->devlink_port = (port)); \ }) static inline bool netif_elide_gro(const struct net_device *dev) { if (!(dev->features & NETIF_F_GRO) || dev->xdp_prog) return true; return false; } #define NETDEV_ALIGN 32 static inline int netdev_get_prio_tc_map(const struct net_device *dev, u32 prio) { return dev->prio_tc_map[prio & TC_BITMASK]; } static inline int netdev_set_prio_tc_map(struct net_device *dev, u8 prio, u8 tc) { if (tc >= dev->num_tc) return -EINVAL; dev->prio_tc_map[prio & TC_BITMASK] = tc & TC_BITMASK; return 0; } int netdev_txq_to_tc(struct net_device *dev, unsigned int txq); void netdev_reset_tc(struct net_device *dev); int netdev_set_tc_queue(struct net_device *dev, u8 tc, u16 count, u16 offset); int netdev_set_num_tc(struct net_device *dev, u8 num_tc); static inline int netdev_get_num_tc(struct net_device *dev) { return dev->num_tc; } static inline void net_prefetch(void *p) { prefetch(p); #if L1_CACHE_BYTES < 128 prefetch((u8 *)p + L1_CACHE_BYTES); #endif } static inline void net_prefetchw(void *p) { prefetchw(p); #if L1_CACHE_BYTES < 128 prefetchw((u8 *)p + L1_CACHE_BYTES); #endif } void netdev_unbind_sb_channel(struct net_device *dev, struct net_device *sb_dev); int netdev_bind_sb_channel_queue(struct net_device *dev, struct net_device *sb_dev, u8 tc, u16 count, u16 offset); int netdev_set_sb_channel(struct net_device *dev, u16 channel); static inline int netdev_get_sb_channel(struct net_device *dev) { return max_t(int, -dev->num_tc, 0); } static inline struct netdev_queue *netdev_get_tx_queue(const struct net_device *dev, unsigned int index) { DEBUG_NET_WARN_ON_ONCE(index >= dev->num_tx_queues); return &dev->_tx[index]; } static inline struct netdev_queue *skb_get_tx_queue(const struct net_device *dev, const struct sk_buff *skb) { return netdev_get_tx_queue(dev, skb_get_queue_mapping(skb)); } static inline void netdev_for_each_tx_queue(struct net_device *dev, void (*f)(struct net_device *, struct netdev_queue *, void *), void *arg) { unsigned int i; for (i = 0; i < dev->num_tx_queues; i++) f(dev, &dev->_tx[i], arg); } #define netdev_lockdep_set_classes(dev) \ { \ static struct lock_class_key qdisc_tx_busylock_key; \ static struct lock_class_key qdisc_xmit_lock_key; \ static struct lock_class_key dev_addr_list_lock_key; \ unsigned int i; \ \ (dev)->qdisc_tx_busylock = &qdisc_tx_busylock_key; \ lockdep_set_class(&(dev)->addr_list_lock, \ &dev_addr_list_lock_key); \ for (i = 0; i < (dev)->num_tx_queues; i++) \ lockdep_set_class(&(dev)->_tx[i]._xmit_lock, \ &qdisc_xmit_lock_key); \ } u16 netdev_pick_tx(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); struct netdev_queue *netdev_core_pick_tx(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); /* returns the headroom that the master device needs to take in account * when forwarding to this dev */ static inline unsigned netdev_get_fwd_headroom(struct net_device *dev) { return dev->priv_flags & IFF_PHONY_HEADROOM ? 0 : dev->needed_headroom; } static inline void netdev_set_rx_headroom(struct net_device *dev, int new_hr) { if (dev->netdev_ops->ndo_set_rx_headroom) dev->netdev_ops->ndo_set_rx_headroom(dev, new_hr); } /* set the device rx headroom to the dev's default */ static inline void netdev_reset_rx_headroom(struct net_device *dev) { netdev_set_rx_headroom(dev, -1); } static inline void *netdev_get_ml_priv(struct net_device *dev, enum netdev_ml_priv_type type) { if (dev->ml_priv_type != type) return NULL; return dev->ml_priv; } static inline void netdev_set_ml_priv(struct net_device *dev, void *ml_priv, enum netdev_ml_priv_type type) { WARN(dev->ml_priv_type && dev->ml_priv_type != type, "Overwriting already set ml_priv_type (%u) with different ml_priv_type (%u)!\n", dev->ml_priv_type, type); WARN(!dev->ml_priv_type && dev->ml_priv, "Overwriting already set ml_priv and ml_priv_type is ML_PRIV_NONE!\n"); dev->ml_priv = ml_priv; dev->ml_priv_type = type; } /* * Net namespace inlines */ static inline struct net *dev_net(const struct net_device *dev) { return read_pnet(&dev->nd_net); } static inline void dev_net_set(struct net_device *dev, struct net *net) { write_pnet(&dev->nd_net, net); } /** * netdev_priv - access network device private data * @dev: network device * * Get network device private data */ static inline void *netdev_priv(const struct net_device *dev) { return (char *)dev + ALIGN(sizeof(struct net_device), NETDEV_ALIGN); } /* Set the sysfs physical device reference for the network logical device * if set prior to registration will cause a symlink during initialization. */ #define SET_NETDEV_DEV(net, pdev) ((net)->dev.parent = (pdev)) /* Set the sysfs device type for the network logical device to allow * fine-grained identification of different network device types. For * example Ethernet, Wireless LAN, Bluetooth, WiMAX etc. */ #define SET_NETDEV_DEVTYPE(net, devtype) ((net)->dev.type = (devtype)) void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index, enum netdev_queue_type type, struct napi_struct *napi); static inline void netif_napi_set_irq(struct napi_struct *napi, int irq) { napi->irq = irq; } /* Default NAPI poll() weight * Device drivers are strongly advised to not use bigger value */ #define NAPI_POLL_WEIGHT 64 void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int), int weight); /** * netif_napi_add() - initialize a NAPI context * @dev: network device * @napi: NAPI context * @poll: polling function * * netif_napi_add() must be used to initialize a NAPI context prior to calling * *any* of the other NAPI-related functions. */ static inline void netif_napi_add(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int)) { netif_napi_add_weight(dev, napi, poll, NAPI_POLL_WEIGHT); } static inline void netif_napi_add_tx_weight(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int), int weight) { set_bit(NAPI_STATE_NO_BUSY_POLL, &napi->state); netif_napi_add_weight(dev, napi, poll, weight); } /** * netif_napi_add_tx() - initialize a NAPI context to be used for Tx only * @dev: network device * @napi: NAPI context * @poll: polling function * * This variant of netif_napi_add() should be used from drivers using NAPI * to exclusively poll a TX queue. * This will avoid we add it into napi_hash[], thus polluting this hash table. */ static inline void netif_napi_add_tx(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int)) { netif_napi_add_tx_weight(dev, napi, poll, NAPI_POLL_WEIGHT); } /** * __netif_napi_del - remove a NAPI context * @napi: NAPI context * * Warning: caller must observe RCU grace period before freeing memory * containing @napi. Drivers might want to call this helper to combine * all the needed RCU grace periods into a single one. */ void __netif_napi_del(struct napi_struct *napi); /** * netif_napi_del - remove a NAPI context * @napi: NAPI context * * netif_napi_del() removes a NAPI context from the network device NAPI list */ static inline void netif_napi_del(struct napi_struct *napi) { __netif_napi_del(napi); synchronize_net(); } struct packet_type { __be16 type; /* This is really htons(ether_type). */ bool ignore_outgoing; struct net_device *dev; /* NULL is wildcarded here */ netdevice_tracker dev_tracker; int (*func) (struct sk_buff *, struct net_device *, struct packet_type *, struct net_device *); void (*list_func) (struct list_head *, struct packet_type *, struct net_device *); bool (*id_match)(struct packet_type *ptype, struct sock *sk); struct net *af_packet_net; void *af_packet_priv; struct list_head list; }; struct offload_callbacks { struct sk_buff *(*gso_segment)(struct sk_buff *skb, netdev_features_t features); struct sk_buff *(*gro_receive)(struct list_head *head, struct sk_buff *skb); int (*gro_complete)(struct sk_buff *skb, int nhoff); }; struct packet_offload { __be16 type; /* This is really htons(ether_type). */ u16 priority; struct offload_callbacks callbacks; struct list_head list; }; /* often modified stats are per-CPU, other are shared (netdev->stats) */ struct pcpu_sw_netstats { u64_stats_t rx_packets; u64_stats_t rx_bytes; u64_stats_t tx_packets; u64_stats_t tx_bytes; struct u64_stats_sync syncp; } __aligned(4 * sizeof(u64)); struct pcpu_dstats { u64 rx_packets; u64 rx_bytes; u64 rx_drops; u64 tx_packets; u64 tx_bytes; u64 tx_drops; struct u64_stats_sync syncp; } __aligned(8 * sizeof(u64)); struct pcpu_lstats { u64_stats_t packets; u64_stats_t bytes; struct u64_stats_sync syncp; } __aligned(2 * sizeof(u64)); void dev_lstats_read(struct net_device *dev, u64 *packets, u64 *bytes); static inline void dev_sw_netstats_rx_add(struct net_device *dev, unsigned int len) { struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats); u64_stats_update_begin(&tstats->syncp); u64_stats_add(&tstats->rx_bytes, len); u64_stats_inc(&tstats->rx_packets); u64_stats_update_end(&tstats->syncp); } static inline void dev_sw_netstats_tx_add(struct net_device *dev, unsigned int packets, unsigned int len) { struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats); u64_stats_update_begin(&tstats->syncp); u64_stats_add(&tstats->tx_bytes, len); u64_stats_add(&tstats->tx_packets, packets); u64_stats_update_end(&tstats->syncp); } static inline void dev_lstats_add(struct net_device *dev, unsigned int len) { struct pcpu_lstats *lstats = this_cpu_ptr(dev->lstats); u64_stats_update_begin(&lstats->syncp); u64_stats_add(&lstats->bytes, len); u64_stats_inc(&lstats->packets); u64_stats_update_end(&lstats->syncp); } #define __netdev_alloc_pcpu_stats(type, gfp) \ ({ \ typeof(type) __percpu *pcpu_stats = alloc_percpu_gfp(type, gfp);\ if (pcpu_stats) { \ int __cpu; \ for_each_possible_cpu(__cpu) { \ typeof(type) *stat; \ stat = per_cpu_ptr(pcpu_stats, __cpu); \ u64_stats_init(&stat->syncp); \ } \ } \ pcpu_stats; \ }) #define netdev_alloc_pcpu_stats(type) \ __netdev_alloc_pcpu_stats(type, GFP_KERNEL) #define devm_netdev_alloc_pcpu_stats(dev, type) \ ({ \ typeof(type) __percpu *pcpu_stats = devm_alloc_percpu(dev, type);\ if (pcpu_stats) { \ int __cpu; \ for_each_possible_cpu(__cpu) { \ typeof(type) *stat; \ stat = per_cpu_ptr(pcpu_stats, __cpu); \ u64_stats_init(&stat->syncp); \ } \ } \ pcpu_stats; \ }) enum netdev_lag_tx_type { NETDEV_LAG_TX_TYPE_UNKNOWN, NETDEV_LAG_TX_TYPE_RANDOM, NETDEV_LAG_TX_TYPE_BROADCAST, NETDEV_LAG_TX_TYPE_ROUNDROBIN, NETDEV_LAG_TX_TYPE_ACTIVEBACKUP, NETDEV_LAG_TX_TYPE_HASH, }; enum netdev_lag_hash { NETDEV_LAG_HASH_NONE, NETDEV_LAG_HASH_L2, NETDEV_LAG_HASH_L34, NETDEV_LAG_HASH_L23, NETDEV_LAG_HASH_E23, NETDEV_LAG_HASH_E34, NETDEV_LAG_HASH_VLAN_SRCMAC, NETDEV_LAG_HASH_UNKNOWN, }; struct netdev_lag_upper_info { enum netdev_lag_tx_type tx_type; enum netdev_lag_hash hash_type; }; struct netdev_lag_lower_state_info { u8 link_up : 1, tx_enabled : 1; }; #include <linux/notifier.h> /* netdevice notifier chain. Please remember to update netdev_cmd_to_name() * and the rtnetlink notification exclusion list in rtnetlink_event() when * adding new types. */ enum netdev_cmd { NETDEV_UP = 1, /* For now you can't veto a device up/down */ NETDEV_DOWN, NETDEV_REBOOT, /* Tell a protocol stack a network interface detected a hardware crash and restarted - we can use this eg to kick tcp sessions once done */ NETDEV_CHANGE, /* Notify device state change */ NETDEV_REGISTER, NETDEV_UNREGISTER, NETDEV_CHANGEMTU, /* notify after mtu change happened */ NETDEV_CHANGEADDR, /* notify after the address change */ NETDEV_PRE_CHANGEADDR, /* notify before the address change */ NETDEV_GOING_DOWN, NETDEV_CHANGENAME, NETDEV_FEAT_CHANGE, NETDEV_BONDING_FAILOVER, NETDEV_PRE_UP, NETDEV_PRE_TYPE_CHANGE, NETDEV_POST_TYPE_CHANGE, NETDEV_POST_INIT, NETDEV_PRE_UNINIT, NETDEV_RELEASE, NETDEV_NOTIFY_PEERS, NETDEV_JOIN, NETDEV_CHANGEUPPER, NETDEV_RESEND_IGMP, NETDEV_PRECHANGEMTU, /* notify before mtu change happened */ NETDEV_CHANGEINFODATA, NETDEV_BONDING_INFO, NETDEV_PRECHANGEUPPER, NETDEV_CHANGELOWERSTATE, NETDEV_UDP_TUNNEL_PUSH_INFO, NETDEV_UDP_TUNNEL_DROP_INFO, NETDEV_CHANGE_TX_QUEUE_LEN, NETDEV_CVLAN_FILTER_PUSH_INFO, NETDEV_CVLAN_FILTER_DROP_INFO, NETDEV_SVLAN_FILTER_PUSH_INFO, NETDEV_SVLAN_FILTER_DROP_INFO, NETDEV_OFFLOAD_XSTATS_ENABLE, NETDEV_OFFLOAD_XSTATS_DISABLE, NETDEV_OFFLOAD_XSTATS_REPORT_USED, NETDEV_OFFLOAD_XSTATS_REPORT_DELTA, NETDEV_XDP_FEAT_CHANGE, }; const char *netdev_cmd_to_name(enum netdev_cmd cmd); int register_netdevice_notifier(struct notifier_block *nb); int unregister_netdevice_notifier(struct notifier_block *nb); int register_netdevice_notifier_net(struct net *net, struct notifier_block *nb); int unregister_netdevice_notifier_net(struct net *net, struct notifier_block *nb); int register_netdevice_notifier_dev_net(struct net_device *dev, struct notifier_block *nb, struct netdev_net_notifier *nn); int unregister_netdevice_notifier_dev_net(struct net_device *dev, struct notifier_block *nb, struct netdev_net_notifier *nn); struct netdev_notifier_info { struct net_device *dev; struct netlink_ext_ack *extack; }; struct netdev_notifier_info_ext { struct netdev_notifier_info info; /* must be first */ union { u32 mtu; } ext; }; struct netdev_notifier_change_info { struct netdev_notifier_info info; /* must be first */ unsigned int flags_changed; }; struct netdev_notifier_changeupper_info { struct netdev_notifier_info info; /* must be first */ struct net_device *upper_dev; /* new upper dev */ bool master; /* is upper dev master */ bool linking; /* is the notification for link or unlink */ void *upper_info; /* upper dev info */ }; struct netdev_notifier_changelowerstate_info { struct netdev_notifier_info info; /* must be first */ void *lower_state_info; /* is lower dev state */ }; struct netdev_notifier_pre_changeaddr_info { struct netdev_notifier_info info; /* must be first */ const unsigned char *dev_addr; }; enum netdev_offload_xstats_type { NETDEV_OFFLOAD_XSTATS_TYPE_L3 = 1, }; struct netdev_notifier_offload_xstats_info { struct netdev_notifier_info info; /* must be first */ enum netdev_offload_xstats_type type; union { /* NETDEV_OFFLOAD_XSTATS_REPORT_DELTA */ struct netdev_notifier_offload_xstats_rd *report_delta; /* NETDEV_OFFLOAD_XSTATS_REPORT_USED */ struct netdev_notifier_offload_xstats_ru *report_used; }; }; int netdev_offload_xstats_enable(struct net_device *dev, enum netdev_offload_xstats_type type, struct netlink_ext_ack *extack); int netdev_offload_xstats_disable(struct net_device *dev, enum netdev_offload_xstats_type type); bool netdev_offload_xstats_enabled(const struct net_device *dev, enum netdev_offload_xstats_type type); int netdev_offload_xstats_get(struct net_device *dev, enum netdev_offload_xstats_type type, struct rtnl_hw_stats64 *stats, bool *used, struct netlink_ext_ack *extack); void netdev_offload_xstats_report_delta(struct netdev_notifier_offload_xstats_rd *rd, const struct rtnl_hw_stats64 *stats); void netdev_offload_xstats_report_used(struct netdev_notifier_offload_xstats_ru *ru); void netdev_offload_xstats_push_delta(struct net_device *dev, enum netdev_offload_xstats_type type, const struct rtnl_hw_stats64 *stats); static inline void netdev_notifier_info_init(struct netdev_notifier_info *info, struct net_device *dev) { info->dev = dev; info->extack = NULL; } static inline struct net_device * netdev_notifier_info_to_dev(const struct netdev_notifier_info *info) { return info->dev; } static inline struct netlink_ext_ack * netdev_notifier_info_to_extack(const struct netdev_notifier_info *info) { return info->extack; } int call_netdevice_notifiers(unsigned long val, struct net_device *dev); int call_netdevice_notifiers_info(unsigned long val, struct netdev_notifier_info *info); #define for_each_netdev(net, d) \ list_for_each_entry(d, &(net)->dev_base_head, dev_list) #define for_each_netdev_reverse(net, d) \ list_for_each_entry_reverse(d, &(net)->dev_base_head, dev_list) #define for_each_netdev_rcu(net, d) \ list_for_each_entry_rcu(d, &(net)->dev_base_head, dev_list) #define for_each_netdev_safe(net, d, n) \ list_for_each_entry_safe(d, n, &(net)->dev_base_head, dev_list) #define for_each_netdev_continue(net, d) \ list_for_each_entry_continue(d, &(net)->dev_base_head, dev_list) #define for_each_netdev_continue_reverse(net, d) \ list_for_each_entry_continue_reverse(d, &(net)->dev_base_head, \ dev_list) #define for_each_netdev_continue_rcu(net, d) \ list_for_each_entry_continue_rcu(d, &(net)->dev_base_head, dev_list) #define for_each_netdev_in_bond_rcu(bond, slave) \ for_each_netdev_rcu(&init_net, slave) \ if (netdev_master_upper_dev_get_rcu(slave) == (bond)) #define net_device_entry(lh) list_entry(lh, struct net_device, dev_list) #define for_each_netdev_dump(net, d, ifindex) \ xa_for_each_start(&(net)->dev_by_index, (ifindex), (d), (ifindex)) static inline struct net_device *next_net_device(struct net_device *dev) { struct list_head *lh; struct net *net; net = dev_net(dev); lh = dev->dev_list.next; return lh == &net->dev_base_head ? NULL : net_device_entry(lh); } static inline struct net_device *next_net_device_rcu(struct net_device *dev) { struct list_head *lh; struct net *net; net = dev_net(dev); lh = rcu_dereference(list_next_rcu(&dev->dev_list)); return lh == &net->dev_base_head ? NULL : net_device_entry(lh); } static inline struct net_device *first_net_device(struct net *net) { return list_empty(&net->dev_base_head) ? NULL : net_device_entry(net->dev_base_head.next); } static inline struct net_device *first_net_device_rcu(struct net *net) { struct list_head *lh = rcu_dereference(list_next_rcu(&net->dev_base_head)); return lh == &net->dev_base_head ? NULL : net_device_entry(lh); } int netdev_boot_setup_check(struct net_device *dev); struct net_device *dev_getbyhwaddr_rcu(struct net *net, unsigned short type, const char *hwaddr); struct net_device *dev_getfirstbyhwtype(struct net *net, unsigned short type); void dev_add_pack(struct packet_type *pt); void dev_remove_pack(struct packet_type *pt); void __dev_remove_pack(struct packet_type *pt); void dev_add_offload(struct packet_offload *po); void dev_remove_offload(struct packet_offload *po); int dev_get_iflink(const struct net_device *dev); int dev_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb); int dev_fill_forward_path(const struct net_device *dev, const u8 *daddr, struct net_device_path_stack *stack); struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags, unsigned short mask); struct net_device *dev_get_by_name(struct net *net, const char *name); struct net_device *dev_get_by_name_rcu(struct net *net, const char *name); struct net_device *__dev_get_by_name(struct net *net, const char *name); bool netdev_name_in_use(struct net *net, const char *name); int dev_alloc_name(struct net_device *dev, const char *name); int dev_open(struct net_device *dev, struct netlink_ext_ack *extack); void dev_close(struct net_device *dev); void dev_close_many(struct list_head *head, bool unlink); void dev_disable_lro(struct net_device *dev); int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb); u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); u16 dev_pick_tx_cpu_id(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev); int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id); static inline int dev_queue_xmit(struct sk_buff *skb) { return __dev_queue_xmit(skb, NULL); } static inline int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev) { return __dev_queue_xmit(skb, sb_dev); } static inline int dev_direct_xmit(struct sk_buff *skb, u16 queue_id) { int ret; ret = __dev_direct_xmit(skb, queue_id); if (!dev_xmit_complete(ret)) kfree_skb(skb); return ret; } int register_netdevice(struct net_device *dev); void unregister_netdevice_queue(struct net_device *dev, struct list_head *head); void unregister_netdevice_many(struct list_head *head); static inline void unregister_netdevice(struct net_device *dev) { unregister_netdevice_queue(dev, NULL); } int netdev_refcnt_read(const struct net_device *dev); void free_netdev(struct net_device *dev); void netdev_freemem(struct net_device *dev); void init_dummy_netdev(struct net_device *dev); struct net_device *netdev_get_xmit_slave(struct net_device *dev, struct sk_buff *skb, bool all_slaves); struct net_device *netdev_sk_get_lowest_dev(struct net_device *dev, struct sock *sk); struct net_device *dev_get_by_index(struct net *net, int ifindex); struct net_device *__dev_get_by_index(struct net *net, int ifindex); struct net_device *netdev_get_by_index(struct net *net, int ifindex, netdevice_tracker *tracker, gfp_t gfp); struct net_device *netdev_get_by_name(struct net *net, const char *name, netdevice_tracker *tracker, gfp_t gfp); struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex); struct net_device *dev_get_by_napi_id(unsigned int napi_id); static inline int dev_hard_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, const void *daddr, const void *saddr, unsigned int len) { if (!dev->header_ops || !dev->header_ops->create) return 0; return dev->header_ops->create(skb, dev, type, daddr, saddr, len); } static inline int dev_parse_header(const struct sk_buff *skb, unsigned char *haddr) { const struct net_device *dev = skb->dev; if (!dev->header_ops || !dev->header_ops->parse) return 0; return dev->header_ops->parse(skb, haddr); } static inline __be16 dev_parse_header_protocol(const struct sk_buff *skb) { const struct net_device *dev = skb->dev; if (!dev->header_ops || !dev->header_ops->parse_protocol) return 0; return dev->header_ops->parse_protocol(skb); } /* ll_header must have at least hard_header_len allocated */ static inline bool dev_validate_header(const struct net_device *dev, char *ll_header, int len) { if (likely(len >= dev->hard_header_len)) return true; if (len < dev->min_header_len) return false; if (capable(CAP_SYS_RAWIO)) { memset(ll_header + len, 0, dev->hard_header_len - len); return true; } if (dev->header_ops && dev->header_ops->validate) return dev->header_ops->validate(ll_header, len); return false; } static inline bool dev_has_header(const struct net_device *dev) { return dev->header_ops && dev->header_ops->create; } /* * Incoming packets are placed on per-CPU queues */ struct softnet_data { struct list_head poll_list; struct sk_buff_head process_queue; /* stats */ unsigned int processed; unsigned int time_squeeze; #ifdef CONFIG_RPS struct softnet_data *rps_ipi_list; #endif bool in_net_rx_action; bool in_napi_threaded_poll; #ifdef CONFIG_NET_FLOW_LIMIT struct sd_flow_limit __rcu *flow_limit; #endif struct Qdisc *output_queue; struct Qdisc **output_queue_tailp; struct sk_buff *completion_queue; #ifdef CONFIG_XFRM_OFFLOAD struct sk_buff_head xfrm_backlog; #endif /* written and read only by owning cpu: */ struct { u16 recursion; u8 more; #ifdef CONFIG_NET_EGRESS u8 skip_txqueue; #endif } xmit; #ifdef CONFIG_RPS /* input_queue_head should be written by cpu owning this struct, * and only read by other cpus. Worth using a cache line. */ unsigned int input_queue_head ____cacheline_aligned_in_smp; /* Elements below can be accessed between CPUs for RPS/RFS */ call_single_data_t csd ____cacheline_aligned_in_smp; struct softnet_data *rps_ipi_next; unsigned int cpu; unsigned int input_queue_tail; #endif unsigned int received_rps; unsigned int dropped; struct sk_buff_head input_pkt_queue; struct napi_struct backlog; /* Another possibly contended cache line */ spinlock_t defer_lock ____cacheline_aligned_in_smp; int defer_count; int defer_ipi_scheduled; struct sk_buff *defer_list; call_single_data_t defer_csd; }; static inline void input_queue_head_incr(struct softnet_data *sd) { #ifdef CONFIG_RPS sd->input_queue_head++; #endif } static inline void input_queue_tail_incr_save(struct softnet_data *sd, unsigned int *qtail) { #ifdef CONFIG_RPS *qtail = ++sd->input_queue_tail; #endif } DECLARE_PER_CPU_ALIGNED(struct softnet_data, softnet_data); static inline int dev_recursion_level(void) { return this_cpu_read(softnet_data.xmit.recursion); } #define XMIT_RECURSION_LIMIT 8 static inline bool dev_xmit_recursion(void) { return unlikely(__this_cpu_read(softnet_data.xmit.recursion) > XMIT_RECURSION_LIMIT); } static inline void dev_xmit_recursion_inc(void) { __this_cpu_inc(softnet_data.xmit.recursion); } static inline void dev_xmit_recursion_dec(void) { __this_cpu_dec(softnet_data.xmit.recursion); } void __netif_schedule(struct Qdisc *q); void netif_schedule_queue(struct netdev_queue *txq); static inline void netif_tx_schedule_all(struct net_device *dev) { unsigned int i; for (i = 0; i < dev->num_tx_queues; i++) netif_schedule_queue(netdev_get_tx_queue(dev, i)); } static __always_inline void netif_tx_start_queue(struct netdev_queue *dev_queue) { clear_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state); } /** * netif_start_queue - allow transmit * @dev: network device * * Allow upper layers to call the device hard_start_xmit routine. */ static inline void netif_start_queue(struct net_device *dev) { netif_tx_start_queue(netdev_get_tx_queue(dev, 0)); } static inline void netif_tx_start_all_queues(struct net_device *dev) { unsigned int i; for (i = 0; i < dev->num_tx_queues; i++) { struct netdev_queue *txq = netdev_get_tx_queue(dev, i); netif_tx_start_queue(txq); } } void netif_tx_wake_queue(struct netdev_queue *dev_queue); /** * netif_wake_queue - restart transmit * @dev: network device * * Allow upper layers to call the device hard_start_xmit routine. * Used for flow control when transmit resources are available. */ static inline void netif_wake_queue(struct net_device *dev) { netif_tx_wake_queue(netdev_get_tx_queue(dev, 0)); } static inline void netif_tx_wake_all_queues(struct net_device *dev) { unsigned int i; for (i = 0; i < dev->num_tx_queues; i++) { struct netdev_queue *txq = netdev_get_tx_queue(dev, i); netif_tx_wake_queue(txq); } } static __always_inline void netif_tx_stop_queue(struct netdev_queue *dev_queue) { /* Must be an atomic op see netif_txq_try_stop() */ set_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state); } /** * netif_stop_queue - stop transmitted packets * @dev: network device * * Stop upper layers calling the device hard_start_xmit routine. * Used for flow control when transmit resources are unavailable. */ static inline void netif_stop_queue(struct net_device *dev) { netif_tx_stop_queue(netdev_get_tx_queue(dev, 0)); } void netif_tx_stop_all_queues(struct net_device *dev); static inline bool netif_tx_queue_stopped(const struct netdev_queue *dev_queue) { return test_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state); } /** * netif_queue_stopped - test if transmit queue is flowblocked * @dev: network device * * Test if transmit queue on device is currently unable to send. */ static inline bool netif_queue_stopped(const struct net_device *dev) { return netif_tx_queue_stopped(netdev_get_tx_queue(dev, 0)); } static inline bool netif_xmit_stopped(const struct netdev_queue *dev_queue) { return dev_queue->state & QUEUE_STATE_ANY_XOFF; } static inline bool netif_xmit_frozen_or_stopped(const struct netdev_queue *dev_queue) { return dev_queue->state & QUEUE_STATE_ANY_XOFF_OR_FROZEN; } static inline bool netif_xmit_frozen_or_drv_stopped(const struct netdev_queue *dev_queue) { return dev_queue->state & QUEUE_STATE_DRV_XOFF_OR_FROZEN; } /** * netdev_queue_set_dql_min_limit - set dql minimum limit * @dev_queue: pointer to transmit queue * @min_limit: dql minimum limit * * Forces xmit_more() to return true until the minimum threshold * defined by @min_limit is reached (or until the tx queue is * empty). Warning: to be use with care, misuse will impact the * latency. */ static inline void netdev_queue_set_dql_min_limit(struct netdev_queue *dev_queue, unsigned int min_limit) { #ifdef CONFIG_BQL dev_queue->dql.min_limit = min_limit; #endif } static inline int netdev_queue_dql_avail(const struct netdev_queue *txq) { #ifdef CONFIG_BQL /* Non-BQL migrated drivers will return 0, too. */ return dql_avail(&txq->dql); #else return 0; #endif } /** * netdev_txq_bql_enqueue_prefetchw - prefetch bql data for write * @dev_queue: pointer to transmit queue * * BQL enabled drivers might use this helper in their ndo_start_xmit(), * to give appropriate hint to the CPU. */ static inline void netdev_txq_bql_enqueue_prefetchw(struct netdev_queue *dev_queue) { #ifdef CONFIG_BQL prefetchw(&dev_queue->dql.num_queued); #endif } /** * netdev_txq_bql_complete_prefetchw - prefetch bql data for write * @dev_queue: pointer to transmit queue * * BQL enabled drivers might use this helper in their TX completion path, * to give appropriate hint to the CPU. */ static inline void netdev_txq_bql_complete_prefetchw(struct netdev_queue *dev_queue) { #ifdef CONFIG_BQL prefetchw(&dev_queue->dql.limit); #endif } /** * netdev_tx_sent_queue - report the number of bytes queued to a given tx queue * @dev_queue: network device queue * @bytes: number of bytes queued to the device queue * * Report the number of bytes queued for sending/completion to the network * device hardware queue. @bytes should be a good approximation and should * exactly match netdev_completed_queue() @bytes. * This is typically called once per packet, from ndo_start_xmit(). */ static inline void netdev_tx_sent_queue(struct netdev_queue *dev_queue, unsigned int bytes) { #ifdef CONFIG_BQL dql_queued(&dev_queue->dql, bytes); if (likely(dql_avail(&dev_queue->dql) >= 0)) return; set_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state); /* * The XOFF flag must be set before checking the dql_avail below, * because in netdev_tx_completed_queue we update the dql_completed * before checking the XOFF flag. */ smp_mb(); /* check again in case another CPU has just made room avail */ if (unlikely(dql_avail(&dev_queue->dql) >= 0)) clear_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state); #endif } /* Variant of netdev_tx_sent_queue() for drivers that are aware * that they should not test BQL status themselves. * We do want to change __QUEUE_STATE_STACK_XOFF only for the last * skb of a batch. * Returns true if the doorbell must be used to kick the NIC. */ static inline bool __netdev_tx_sent_queue(struct netdev_queue *dev_queue, unsigned int bytes, bool xmit_more) { if (xmit_more) { #ifdef CONFIG_BQL dql_queued(&dev_queue->dql, bytes); #endif return netif_tx_queue_stopped(dev_queue); } netdev_tx_sent_queue(dev_queue, bytes); return true; } /** * netdev_sent_queue - report the number of bytes queued to hardware * @dev: network device * @bytes: number of bytes queued to the hardware device queue * * Report the number of bytes queued for sending/completion to the network * device hardware queue#0. @bytes should be a good approximation and should * exactly match netdev_completed_queue() @bytes. * This is typically called once per packet, from ndo_start_xmit(). */ static inline void netdev_sent_queue(struct net_device *dev, unsigned int bytes) { netdev_tx_sent_queue(netdev_get_tx_queue(dev, 0), bytes); } static inline bool __netdev_sent_queue(struct net_device *dev, unsigned int bytes, bool xmit_more) { return __netdev_tx_sent_queue(netdev_get_tx_queue(dev, 0), bytes, xmit_more); } /** * netdev_tx_completed_queue - report number of packets/bytes at TX completion. * @dev_queue: network device queue * @pkts: number of packets (currently ignored) * @bytes: number of bytes dequeued from the device queue * * Must be called at most once per TX completion round (and not per * individual packet), so that BQL can adjust its limits appropriately. */ static inline void netdev_tx_completed_queue(struct netdev_queue *dev_queue, unsigned int pkts, unsigned int bytes) { #ifdef CONFIG_BQL if (unlikely(!bytes)) return; dql_completed(&dev_queue->dql, bytes); /* * Without the memory barrier there is a small possiblity that * netdev_tx_sent_queue will miss the update and cause the queue to * be stopped forever */ smp_mb(); /* NOTE: netdev_txq_completed_mb() assumes this exists */ if (unlikely(dql_avail(&dev_queue->dql) < 0)) return; if (test_and_clear_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state)) netif_schedule_queue(dev_queue); #endif } /** * netdev_completed_queue - report bytes and packets completed by device * @dev: network device * @pkts: actual number of packets sent over the medium * @bytes: actual number of bytes sent over the medium * * Report the number of bytes and packets transmitted by the network device * hardware queue over the physical medium, @bytes must exactly match the * @bytes amount passed to netdev_sent_queue() */ static inline void netdev_completed_queue(struct net_device *dev, unsigned int pkts, unsigned int bytes) { netdev_tx_completed_queue(netdev_get_tx_queue(dev, 0), pkts, bytes); } static inline void netdev_tx_reset_queue(struct netdev_queue *q) { #ifdef CONFIG_BQL clear_bit(__QUEUE_STATE_STACK_XOFF, &q->state); dql_reset(&q->dql); #endif } /** * netdev_reset_queue - reset the packets and bytes count of a network device * @dev_queue: network device * * Reset the bytes and packet count of a network device and clear the * software flow control OFF bit for this network device */ static inline void netdev_reset_queue(struct net_device *dev_queue) { netdev_tx_reset_queue(netdev_get_tx_queue(dev_queue, 0)); } /** * netdev_cap_txqueue - check if selected tx queue exceeds device queues * @dev: network device * @queue_index: given tx queue index * * Returns 0 if given tx queue index >= number of device tx queues, * otherwise returns the originally passed tx queue index. */ static inline u16 netdev_cap_txqueue(struct net_device *dev, u16 queue_index) { if (unlikely(queue_index >= dev->real_num_tx_queues)) { net_warn_ratelimited("%s selects TX queue %d, but real number of TX queues is %d\n", dev->name, queue_index, dev->real_num_tx_queues); return 0; } return queue_index; } /** * netif_running - test if up * @dev: network device * * Test if the device has been brought up. */ static inline bool netif_running(const struct net_device *dev) { return test_bit(__LINK_STATE_START, &dev->state); } /* * Routines to manage the subqueues on a device. We only need start, * stop, and a check if it's stopped. All other device management is * done at the overall netdevice level. * Also test the device if we're multiqueue. */ /** * netif_start_subqueue - allow sending packets on subqueue * @dev: network device * @queue_index: sub queue index * * Start individual transmit queue of a device with multiple transmit queues. */ static inline void netif_start_subqueue(struct net_device *dev, u16 queue_index) { struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index); netif_tx_start_queue(txq); } /** * netif_stop_subqueue - stop sending packets on subqueue * @dev: network device * @queue_index: sub queue index * * Stop individual transmit queue of a device with multiple transmit queues. */ static inline void netif_stop_subqueue(struct net_device *dev, u16 queue_index) { struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index); netif_tx_stop_queue(txq); } /** * __netif_subqueue_stopped - test status of subqueue * @dev: network device * @queue_index: sub queue index * * Check individual transmit queue of a device with multiple transmit queues. */ static inline bool __netif_subqueue_stopped(const struct net_device *dev, u16 queue_index) { struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index); return netif_tx_queue_stopped(txq); } /** * netif_subqueue_stopped - test status of subqueue * @dev: network device * @skb: sub queue buffer pointer * * Check individual transmit queue of a device with multiple transmit queues. */ static inline bool netif_subqueue_stopped(const struct net_device *dev, struct sk_buff *skb) { return __netif_subqueue_stopped(dev, skb_get_queue_mapping(skb)); } /** * netif_wake_subqueue - allow sending packets on subqueue * @dev: network device * @queue_index: sub queue index * * Resume individual transmit queue of a device with multiple transmit queues. */ static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index) { struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index); netif_tx_wake_queue(txq); } #ifdef CONFIG_XPS int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask, u16 index); int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask, u16 index, enum xps_map_type type); /** * netif_attr_test_mask - Test a CPU or Rx queue set in a mask * @j: CPU/Rx queue index * @mask: bitmask of all cpus/rx queues * @nr_bits: number of bits in the bitmask * * Test if a CPU or Rx queue index is set in a mask of all CPU/Rx queues. */ static inline bool netif_attr_test_mask(unsigned long j, const unsigned long *mask, unsigned int nr_bits) { cpu_max_bits_warn(j, nr_bits); return test_bit(j, mask); } /** * netif_attr_test_online - Test for online CPU/Rx queue * @j: CPU/Rx queue index * @online_mask: bitmask for CPUs/Rx queues that are online * @nr_bits: number of bits in the bitmask * * Returns true if a CPU/Rx queue is online. */ static inline bool netif_attr_test_online(unsigned long j, const unsigned long *online_mask, unsigned int nr_bits) { cpu_max_bits_warn(j, nr_bits); if (online_mask) return test_bit(j, online_mask); return (j < nr_bits); } /** * netif_attrmask_next - get the next CPU/Rx queue in a cpu/Rx queues mask * @n: CPU/Rx queue index * @srcp: the cpumask/Rx queue mask pointer * @nr_bits: number of bits in the bitmask * * Returns >= nr_bits if no further CPUs/Rx queues set. */ static inline unsigned int netif_attrmask_next(int n, const unsigned long *srcp, unsigned int nr_bits) { /* -1 is a legal arg here. */ if (n != -1) cpu_max_bits_warn(n, nr_bits); if (srcp) return find_next_bit(srcp, nr_bits, n + 1); return n + 1; } /** * netif_attrmask_next_and - get the next CPU/Rx queue in \*src1p & \*src2p * @n: CPU/Rx queue index * @src1p: the first CPUs/Rx queues mask pointer * @src2p: the second CPUs/Rx queues mask pointer * @nr_bits: number of bits in the bitmask * * Returns >= nr_bits if no further CPUs/Rx queues set in both. */ static inline int netif_attrmask_next_and(int n, const unsigned long *src1p, const unsigned long *src2p, unsigned int nr_bits) { /* -1 is a legal arg here. */ if (n != -1) cpu_max_bits_warn(n, nr_bits); if (src1p && src2p) return find_next_and_bit(src1p, src2p, nr_bits, n + 1); else if (src1p) return find_next_bit(src1p, nr_bits, n + 1); else if (src2p) return find_next_bit(src2p, nr_bits, n + 1); return n + 1; } #else static inline int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask, u16 index) { return 0; } static inline int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask, u16 index, enum xps_map_type type) { return 0; } #endif /** * netif_is_multiqueue - test if device has multiple transmit queues * @dev: network device * * Check if device has multiple transmit queues */ static inline bool netif_is_multiqueue(const struct net_device *dev) { return dev->num_tx_queues > 1; } int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq); #ifdef CONFIG_SYSFS int netif_set_real_num_rx_queues(struct net_device *dev, unsigned int rxq); #else static inline int netif_set_real_num_rx_queues(struct net_device *dev, unsigned int rxqs) { dev->real_num_rx_queues = rxqs; return 0; } #endif int netif_set_real_num_queues(struct net_device *dev, unsigned int txq, unsigned int rxq); int netif_get_num_default_rss_queues(void); void dev_kfree_skb_irq_reason(struct sk_buff *skb, enum skb_drop_reason reason); void dev_kfree_skb_any_reason(struct sk_buff *skb, enum skb_drop_reason reason); /* * It is not allowed to call kfree_skb() or consume_skb() from hardware * interrupt context or with hardware interrupts being disabled. * (in_hardirq() || irqs_disabled()) * * We provide four helpers that can be used in following contexts : * * dev_kfree_skb_irq(skb) when caller drops a packet from irq context, * replacing kfree_skb(skb) * * dev_consume_skb_irq(skb) when caller consumes a packet from irq context. * Typically used in place of consume_skb(skb) in TX completion path * * dev_kfree_skb_any(skb) when caller doesn't know its current irq context, * replacing kfree_skb(skb) * * dev_consume_skb_any(skb) when caller doesn't know its current irq context, * and consumed a packet. Used in place of consume_skb(skb) */ static inline void dev_kfree_skb_irq(struct sk_buff *skb) { dev_kfree_skb_irq_reason(skb, SKB_DROP_REASON_NOT_SPECIFIED); } static inline void dev_consume_skb_irq(struct sk_buff *skb) { dev_kfree_skb_irq_reason(skb, SKB_CONSUMED); } static inline void dev_kfree_skb_any(struct sk_buff *skb) { dev_kfree_skb_any_reason(skb, SKB_DROP_REASON_NOT_SPECIFIED); } static inline void dev_consume_skb_any(struct sk_buff *skb) { dev_kfree_skb_any_reason(skb, SKB_CONSUMED); } u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp, struct bpf_prog *xdp_prog); void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog); int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb); int netif_rx(struct sk_buff *skb); int __netif_rx(struct sk_buff *skb); int netif_receive_skb(struct sk_buff *skb); int netif_receive_skb_core(struct sk_buff *skb); void netif_receive_skb_list_internal(struct list_head *head); void netif_receive_skb_list(struct list_head *head); gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb); void napi_gro_flush(struct napi_struct *napi, bool flush_old); struct sk_buff *napi_get_frags(struct napi_struct *napi); void napi_get_frags_check(struct napi_struct *napi); gro_result_t napi_gro_frags(struct napi_struct *napi); static inline void napi_free_frags(struct napi_struct *napi) { kfree_skb(napi->skb); napi->skb = NULL; } bool netdev_is_rx_handler_busy(struct net_device *dev); int netdev_rx_handler_register(struct net_device *dev, rx_handler_func_t *rx_handler, void *rx_handler_data); void netdev_rx_handler_unregister(struct net_device *dev); bool dev_valid_name(const char *name); static inline bool is_socket_ioctl_cmd(unsigned int cmd) { return _IOC_TYPE(cmd) == SOCK_IOC_TYPE; } int get_user_ifreq(struct ifreq *ifr, void __user **ifrdata, void __user *arg); int put_user_ifreq(struct ifreq *ifr, void __user *arg); int dev_ioctl(struct net *net, unsigned int cmd, struct ifreq *ifr, void __user *data, bool *need_copyout); int dev_ifconf(struct net *net, struct ifconf __user *ifc); int generic_hwtstamp_get_lower(struct net_device *dev, struct kernel_hwtstamp_config *kernel_cfg); int generic_hwtstamp_set_lower(struct net_device *dev, struct kernel_hwtstamp_config *kernel_cfg, struct netlink_ext_ack *extack); int dev_set_hwtstamp_phylib(struct net_device *dev, struct kernel_hwtstamp_config *cfg, struct netlink_ext_ack *extack); int dev_ethtool(struct net *net, struct ifreq *ifr, void __user *userdata); unsigned int dev_get_flags(const struct net_device *); int __dev_change_flags(struct net_device *dev, unsigned int flags, struct netlink_ext_ack *extack); int dev_change_flags(struct net_device *dev, unsigned int flags, struct netlink_ext_ack *extack); int dev_set_alias(struct net_device *, const char *, size_t); int dev_get_alias(const struct net_device *, char *, size_t); int __dev_change_net_namespace(struct net_device *dev, struct net *net, const char *pat, int new_ifindex); static inline int dev_change_net_namespace(struct net_device *dev, struct net *net, const char *pat) { return __dev_change_net_namespace(dev, net, pat, 0); } int __dev_set_mtu(struct net_device *, int); int dev_set_mtu(struct net_device *, int); int dev_pre_changeaddr_notify(struct net_device *dev, const char *addr, struct netlink_ext_ack *extack); int dev_set_mac_address(struct net_device *dev, struct sockaddr *sa, struct netlink_ext_ack *extack); int dev_set_mac_address_user(struct net_device *dev, struct sockaddr *sa, struct netlink_ext_ack *extack); int dev_get_mac_address(struct sockaddr *sa, struct net *net, char *dev_name); int dev_get_port_parent_id(struct net_device *dev, struct netdev_phys_item_id *ppid, bool recurse); bool netdev_port_same_parent_id(struct net_device *a, struct net_device *b); struct sk_buff *validate_xmit_skb_list(struct sk_buff *skb, struct net_device *dev, bool *again); struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev, struct netdev_queue *txq, int *ret); int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog); u8 dev_xdp_prog_count(struct net_device *dev); u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode); int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb); int dev_forward_skb(struct net_device *dev, struct sk_buff *skb); int dev_forward_skb_nomtu(struct net_device *dev, struct sk_buff *skb); bool is_skb_forwardable(const struct net_device *dev, const struct sk_buff *skb); static __always_inline bool __is_skb_forwardable(const struct net_device *dev, const struct sk_buff *skb, const bool check_mtu) { const u32 vlan_hdr_len = 4; /* VLAN_HLEN */ unsigned int len; if (!(dev->flags & IFF_UP)) return false; if (!check_mtu) return true; len = dev->mtu + dev->hard_header_len + vlan_hdr_len; if (skb->len <= len) return true; /* if TSO is enabled, we don't care about the length as the packet * could be forwarded without being segmented before */ if (skb_is_gso(skb)) return true; return false; } void netdev_core_stats_inc(struct net_device *dev, u32 offset); #define DEV_CORE_STATS_INC(FIELD) \ static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev) \ { \ netdev_core_stats_inc(dev, \ offsetof(struct net_device_core_stats, FIELD)); \ } DEV_CORE_STATS_INC(rx_dropped) DEV_CORE_STATS_INC(tx_dropped) DEV_CORE_STATS_INC(rx_nohandler) DEV_CORE_STATS_INC(rx_otherhost_dropped) #undef DEV_CORE_STATS_INC static __always_inline int ____dev_forward_skb(struct net_device *dev, struct sk_buff *skb, const bool check_mtu) { if (skb_orphan_frags(skb, GFP_ATOMIC) || unlikely(!__is_skb_forwardable(dev, skb, check_mtu))) { dev_core_stats_rx_dropped_inc(dev); kfree_skb(skb); return NET_RX_DROP; } skb_scrub_packet(skb, !net_eq(dev_net(dev), dev_net(skb->dev))); skb->priority = 0; return 0; } bool dev_nit_active(struct net_device *dev); void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev); static inline void __dev_put(struct net_device *dev) { if (dev) { #ifdef CONFIG_PCPU_DEV_REFCNT this_cpu_dec(*dev->pcpu_refcnt); #else refcount_dec(&dev->dev_refcnt); #endif } } static inline void __dev_hold(struct net_device *dev) { if (dev) { #ifdef CONFIG_PCPU_DEV_REFCNT this_cpu_inc(*dev->pcpu_refcnt); #else refcount_inc(&dev->dev_refcnt); #endif } } static inline void __netdev_tracker_alloc(struct net_device *dev, netdevice_tracker *tracker, gfp_t gfp) { #ifdef CONFIG_NET_DEV_REFCNT_TRACKER ref_tracker_alloc(&dev->refcnt_tracker, tracker, gfp); #endif } /* netdev_tracker_alloc() can upgrade a prior untracked reference * taken by dev_get_by_name()/dev_get_by_index() to a tracked one. */ static inline void netdev_tracker_alloc(struct net_device *dev, netdevice_tracker *tracker, gfp_t gfp) { #ifdef CONFIG_NET_DEV_REFCNT_TRACKER refcount_dec(&dev->refcnt_tracker.no_tracker); __netdev_tracker_alloc(dev, tracker, gfp); #endif } static inline void netdev_tracker_free(struct net_device *dev, netdevice_tracker *tracker) { #ifdef CONFIG_NET_DEV_REFCNT_TRACKER ref_tracker_free(&dev->refcnt_tracker, tracker); #endif } static inline void netdev_hold(struct net_device *dev, netdevice_tracker *tracker, gfp_t gfp) { if (dev) { __dev_hold(dev); __netdev_tracker_alloc(dev, tracker, gfp); } } static inline void netdev_put(struct net_device *dev, netdevice_tracker *tracker) { if (dev) { netdev_tracker_free(dev, tracker); __dev_put(dev); } } /** * dev_hold - get reference to device * @dev: network device * * Hold reference to device to keep it from being freed. * Try using netdev_hold() instead. */ static inline void dev_hold(struct net_device *dev) { netdev_hold(dev, NULL, GFP_ATOMIC); } /** * dev_put - release reference to device * @dev: network device * * Release reference to device to allow it to be freed. * Try using netdev_put() instead. */ static inline void dev_put(struct net_device *dev) { netdev_put(dev, NULL); } static inline void netdev_ref_replace(struct net_device *odev, struct net_device *ndev, netdevice_tracker *tracker, gfp_t gfp) { if (odev) netdev_tracker_free(odev, tracker); __dev_hold(ndev); __dev_put(odev); if (ndev) __netdev_tracker_alloc(ndev, tracker, gfp); } /* Carrier loss detection, dial on demand. The functions netif_carrier_on * and _off may be called from IRQ context, but it is caller * who is responsible for serialization of these calls. * * The name carrier is inappropriate, these functions should really be * called netif_lowerlayer_*() because they represent the state of any * kind of lower layer not just hardware media. */ void linkwatch_fire_event(struct net_device *dev); /** * linkwatch_sync_dev - sync linkwatch for the given device * @dev: network device to sync linkwatch for * * Sync linkwatch for the given device, removing it from the * pending work list (if queued). */ void linkwatch_sync_dev(struct net_device *dev); /** * netif_carrier_ok - test if carrier present * @dev: network device * * Check if carrier is present on device */ static inline bool netif_carrier_ok(const struct net_device *dev) { return !test_bit(__LINK_STATE_NOCARRIER, &dev->state); } unsigned long dev_trans_start(struct net_device *dev); void __netdev_watchdog_up(struct net_device *dev); void netif_carrier_on(struct net_device *dev); void netif_carrier_off(struct net_device *dev); void netif_carrier_event(struct net_device *dev); /** * netif_dormant_on - mark device as dormant. * @dev: network device * * Mark device as dormant (as per RFC2863). * * The dormant state indicates that the relevant interface is not * actually in a condition to pass packets (i.e., it is not 'up') but is * in a "pending" state, waiting for some external event. For "on- * demand" interfaces, this new state identifies the situation where the * interface is waiting for events to place it in the up state. */ static inline void netif_dormant_on(struct net_device *dev) { if (!test_and_set_bit(__LINK_STATE_DORMANT, &dev->state)) linkwatch_fire_event(dev); } /** * netif_dormant_off - set device as not dormant. * @dev: network device * * Device is not in dormant state. */ static inline void netif_dormant_off(struct net_device *dev) { if (test_and_clear_bit(__LINK_STATE_DORMANT, &dev->state)) linkwatch_fire_event(dev); } /** * netif_dormant - test if device is dormant * @dev: network device * * Check if device is dormant. */ static inline bool netif_dormant(const struct net_device *dev) { return test_bit(__LINK_STATE_DORMANT, &dev->state); } /** * netif_testing_on - mark device as under test. * @dev: network device * * Mark device as under test (as per RFC2863). * * The testing state indicates that some test(s) must be performed on * the interface. After completion, of the test, the interface state * will change to up, dormant, or down, as appropriate. */ static inline void netif_testing_on(struct net_device *dev) { if (!test_and_set_bit(__LINK_STATE_TESTING, &dev->state)) linkwatch_fire_event(dev); } /** * netif_testing_off - set device as not under test. * @dev: network device * * Device is not in testing state. */ static inline void netif_testing_off(struct net_device *dev) { if (test_and_clear_bit(__LINK_STATE_TESTING, &dev->state)) linkwatch_fire_event(dev); } /** * netif_testing - test if device is under test * @dev: network device * * Check if device is under test */ static inline bool netif_testing(const struct net_device *dev) { return test_bit(__LINK_STATE_TESTING, &dev->state); } /** * netif_oper_up - test if device is operational * @dev: network device * * Check if carrier is operational */ static inline bool netif_oper_up(const struct net_device *dev) { unsigned int operstate = READ_ONCE(dev->operstate); return operstate == IF_OPER_UP || operstate == IF_OPER_UNKNOWN /* backward compat */; } /** * netif_device_present - is device available or removed * @dev: network device * * Check if device has not been removed from system. */ static inline bool netif_device_present(const struct net_device *dev) { return test_bit(__LINK_STATE_PRESENT, &dev->state); } void netif_device_detach(struct net_device *dev); void netif_device_attach(struct net_device *dev); /* * Network interface message level settings */ enum { NETIF_MSG_DRV_BIT, NETIF_MSG_PROBE_BIT, NETIF_MSG_LINK_BIT, NETIF_MSG_TIMER_BIT, NETIF_MSG_IFDOWN_BIT, NETIF_MSG_IFUP_BIT, NETIF_MSG_RX_ERR_BIT, NETIF_MSG_TX_ERR_BIT, NETIF_MSG_TX_QUEUED_BIT, NETIF_MSG_INTR_BIT, NETIF_MSG_TX_DONE_BIT, NETIF_MSG_RX_STATUS_BIT, NETIF_MSG_PKTDATA_BIT, NETIF_MSG_HW_BIT, NETIF_MSG_WOL_BIT, /* When you add a new bit above, update netif_msg_class_names array * in net/ethtool/common.c */ NETIF_MSG_CLASS_COUNT, }; /* Both ethtool_ops interface and internal driver implementation use u32 */ static_assert(NETIF_MSG_CLASS_COUNT <= 32); #define __NETIF_MSG_BIT(bit) ((u32)1 << (bit)) #define __NETIF_MSG(name) __NETIF_MSG_BIT(NETIF_MSG_ ## name ## _BIT) #define NETIF_MSG_DRV __NETIF_MSG(DRV) #define NETIF_MSG_PROBE __NETIF_MSG(PROBE) #define NETIF_MSG_LINK __NETIF_MSG(LINK) #define NETIF_MSG_TIMER __NETIF_MSG(TIMER) #define NETIF_MSG_IFDOWN __NETIF_MSG(IFDOWN) #define NETIF_MSG_IFUP __NETIF_MSG(IFUP) #define NETIF_MSG_RX_ERR __NETIF_MSG(RX_ERR) #define NETIF_MSG_TX_ERR __NETIF_MSG(TX_ERR) #define NETIF_MSG_TX_QUEUED __NETIF_MSG(TX_QUEUED) #define NETIF_MSG_INTR __NETIF_MSG(INTR) #define NETIF_MSG_TX_DONE __NETIF_MSG(TX_DONE) #define NETIF_MSG_RX_STATUS __NETIF_MSG(RX_STATUS) #define NETIF_MSG_PKTDATA __NETIF_MSG(PKTDATA) #define NETIF_MSG_HW __NETIF_MSG(HW) #define NETIF_MSG_WOL __NETIF_MSG(WOL) #define netif_msg_drv(p) ((p)->msg_enable & NETIF_MSG_DRV) #define netif_msg_probe(p) ((p)->msg_enable & NETIF_MSG_PROBE) #define netif_msg_link(p) ((p)->msg_enable & NETIF_MSG_LINK) #define netif_msg_timer(p) ((p)->msg_enable & NETIF_MSG_TIMER) #define netif_msg_ifdown(p) ((p)->msg_enable & NETIF_MSG_IFDOWN) #define netif_msg_ifup(p) ((p)->msg_enable & NETIF_MSG_IFUP) #define netif_msg_rx_err(p) ((p)->msg_enable & NETIF_MSG_RX_ERR) #define netif_msg_tx_err(p) ((p)->msg_enable & NETIF_MSG_TX_ERR) #define netif_msg_tx_queued(p) ((p)->msg_enable & NETIF_MSG_TX_QUEUED) #define netif_msg_intr(p) ((p)->msg_enable & NETIF_MSG_INTR) #define netif_msg_tx_done(p) ((p)->msg_enable & NETIF_MSG_TX_DONE) #define netif_msg_rx_status(p) ((p)->msg_enable & NETIF_MSG_RX_STATUS) #define netif_msg_pktdata(p) ((p)->msg_enable & NETIF_MSG_PKTDATA) #define netif_msg_hw(p) ((p)->msg_enable & NETIF_MSG_HW) #define netif_msg_wol(p) ((p)->msg_enable & NETIF_MSG_WOL) static inline u32 netif_msg_init(int debug_value, int default_msg_enable_bits) { /* use default */ if (debug_value < 0 || debug_value >= (sizeof(u32) * 8)) return default_msg_enable_bits; if (debug_value == 0) /* no output */ return 0; /* set low N bits */ return (1U << debug_value) - 1; } static inline void __netif_tx_lock(struct netdev_queue *txq, int cpu) { spin_lock(&txq->_xmit_lock); /* Pairs with READ_ONCE() in __dev_queue_xmit() */ WRITE_ONCE(txq->xmit_lock_owner, cpu); } static inline bool __netif_tx_acquire(struct netdev_queue *txq) { __acquire(&txq->_xmit_lock); return true; } static inline void __netif_tx_release(struct netdev_queue *txq) { __release(&txq->_xmit_lock); } static inline void __netif_tx_lock_bh(struct netdev_queue *txq) { spin_lock_bh(&txq->_xmit_lock); /* Pairs with READ_ONCE() in __dev_queue_xmit() */ WRITE_ONCE(txq->xmit_lock_owner, smp_processor_id()); } static inline bool __netif_tx_trylock(struct netdev_queue *txq) { bool ok = spin_trylock(&txq->_xmit_lock); if (likely(ok)) { /* Pairs with READ_ONCE() in __dev_queue_xmit() */ WRITE_ONCE(txq->xmit_lock_owner, smp_processor_id()); } return ok; } static inline void __netif_tx_unlock(struct netdev_queue *txq) { /* Pairs with READ_ONCE() in __dev_queue_xmit() */ WRITE_ONCE(txq->xmit_lock_owner, -1); spin_unlock(&txq->_xmit_lock); } static inline void __netif_tx_unlock_bh(struct netdev_queue *txq) { /* Pairs with READ_ONCE() in __dev_queue_xmit() */ WRITE_ONCE(txq->xmit_lock_owner, -1); spin_unlock_bh(&txq->_xmit_lock); } /* * txq->trans_start can be read locklessly from dev_watchdog() */ static inline void txq_trans_update(struct netdev_queue *txq) { if (txq->xmit_lock_owner != -1) WRITE_ONCE(txq->trans_start, jiffies); } static inline void txq_trans_cond_update(struct netdev_queue *txq) { unsigned long now = jiffies; if (READ_ONCE(txq->trans_start) != now) WRITE_ONCE(txq->trans_start, now); } /* legacy drivers only, netdev_start_xmit() sets txq->trans_start */ static inline void netif_trans_update(struct net_device *dev) { struct netdev_queue *txq = netdev_get_tx_queue(dev, 0); txq_trans_cond_update(txq); } /** * netif_tx_lock - grab network device transmit lock * @dev: network device * * Get network device transmit lock */ void netif_tx_lock(struct net_device *dev); static inline void netif_tx_lock_bh(struct net_device *dev) { local_bh_disable(); netif_tx_lock(dev); } void netif_tx_unlock(struct net_device *dev); static inline void netif_tx_unlock_bh(struct net_device *dev) { netif_tx_unlock(dev); local_bh_enable(); } #define HARD_TX_LOCK(dev, txq, cpu) { \ if ((dev->features & NETIF_F_LLTX) == 0) { \ __netif_tx_lock(txq, cpu); \ } else { \ __netif_tx_acquire(txq); \ } \ } #define HARD_TX_TRYLOCK(dev, txq) \ (((dev->features & NETIF_F_LLTX) == 0) ? \ __netif_tx_trylock(txq) : \ __netif_tx_acquire(txq)) #define HARD_TX_UNLOCK(dev, txq) { \ if ((dev->features & NETIF_F_LLTX) == 0) { \ __netif_tx_unlock(txq); \ } else { \ __netif_tx_release(txq); \ } \ } static inline void netif_tx_disable(struct net_device *dev) { unsigned int i; int cpu; local_bh_disable(); cpu = smp_processor_id(); spin_lock(&dev->tx_global_lock); for (i = 0; i < dev->num_tx_queues; i++) { struct netdev_queue *txq = netdev_get_tx_queue(dev, i); __netif_tx_lock(txq, cpu); netif_tx_stop_queue(txq); __netif_tx_unlock(txq); } spin_unlock(&dev->tx_global_lock); local_bh_enable(); } static inline void netif_addr_lock(struct net_device *dev) { unsigned char nest_level = 0; #ifdef CONFIG_LOCKDEP nest_level = dev->nested_level; #endif spin_lock_nested(&dev->addr_list_lock, nest_level); } static inline void netif_addr_lock_bh(struct net_device *dev) { unsigned char nest_level = 0; #ifdef CONFIG_LOCKDEP nest_level = dev->nested_level; #endif local_bh_disable(); spin_lock_nested(&dev->addr_list_lock, nest_level); } static inline void netif_addr_unlock(struct net_device *dev) { spin_unlock(&dev->addr_list_lock); } static inline void netif_addr_unlock_bh(struct net_device *dev) { spin_unlock_bh(&dev->addr_list_lock); } /* * dev_addrs walker. Should be used only for read access. Call with * rcu_read_lock held. */ #define for_each_dev_addr(dev, ha) \ list_for_each_entry_rcu(ha, &dev->dev_addrs.list, list) /* These functions live elsewhere (drivers/net/net_init.c, but related) */ void ether_setup(struct net_device *dev); /* Support for loadable net-drivers */ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, unsigned char name_assign_type, void (*setup)(struct net_device *), unsigned int txqs, unsigned int rxqs); #define alloc_netdev(sizeof_priv, name, name_assign_type, setup) \ alloc_netdev_mqs(sizeof_priv, name, name_assign_type, setup, 1, 1) #define alloc_netdev_mq(sizeof_priv, name, name_assign_type, setup, count) \ alloc_netdev_mqs(sizeof_priv, name, name_assign_type, setup, count, \ count) int register_netdev(struct net_device *dev); void unregister_netdev(struct net_device *dev); int devm_register_netdev(struct device *dev, struct net_device *ndev); /* General hardware address lists handling functions */ int __hw_addr_sync(struct netdev_hw_addr_list *to_list, struct netdev_hw_addr_list *from_list, int addr_len); void __hw_addr_unsync(struct netdev_hw_addr_list *to_list, struct netdev_hw_addr_list *from_list, int addr_len); int __hw_addr_sync_dev(struct netdev_hw_addr_list *list, struct net_device *dev, int (*sync)(struct net_device *, const unsigned char *), int (*unsync)(struct net_device *, const unsigned char *)); int __hw_addr_ref_sync_dev(struct netdev_hw_addr_list *list, struct net_device *dev, int (*sync)(struct net_device *, const unsigned char *, int), int (*unsync)(struct net_device *, const unsigned char *, int)); void __hw_addr_ref_unsync_dev(struct netdev_hw_addr_list *list, struct net_device *dev, int (*unsync)(struct net_device *, const unsigned char *, int)); void __hw_addr_unsync_dev(struct netdev_hw_addr_list *list, struct net_device *dev, int (*unsync)(struct net_device *, const unsigned char *)); void __hw_addr_init(struct netdev_hw_addr_list *list); /* Functions used for device addresses handling */ void dev_addr_mod(struct net_device *dev, unsigned int offset, const void *addr, size_t len); static inline void __dev_addr_set(struct net_device *dev, const void *addr, size_t len) { dev_addr_mod(dev, 0, addr, len); } static inline void dev_addr_set(struct net_device *dev, const u8 *addr) { __dev_addr_set(dev, addr, dev->addr_len); } int dev_addr_add(struct net_device *dev, const unsigned char *addr, unsigned char addr_type); int dev_addr_del(struct net_device *dev, const unsigned char *addr, unsigned char addr_type); /* Functions used for unicast addresses handling */ int dev_uc_add(struct net_device *dev, const unsigned char *addr); int dev_uc_add_excl(struct net_device *dev, const unsigned char *addr); int dev_uc_del(struct net_device *dev, const unsigned char *addr); int dev_uc_sync(struct net_device *to, struct net_device *from); int dev_uc_sync_multiple(struct net_device *to, struct net_device *from); void dev_uc_unsync(struct net_device *to, struct net_device *from); void dev_uc_flush(struct net_device *dev); void dev_uc_init(struct net_device *dev); /** * __dev_uc_sync - Synchonize device's unicast list * @dev: device to sync * @sync: function to call if address should be added * @unsync: function to call if address should be removed * * Add newly added addresses to the interface, and release * addresses that have been deleted. */ static inline int __dev_uc_sync(struct net_device *dev, int (*sync)(struct net_device *, const unsigned char *), int (*unsync)(struct net_device *, const unsigned char *)) { return __hw_addr_sync_dev(&dev->uc, dev, sync, unsync); } /** * __dev_uc_unsync - Remove synchronized addresses from device * @dev: device to sync * @unsync: function to call if address should be removed * * Remove all addresses that were added to the device by dev_uc_sync(). */ static inline void __dev_uc_unsync(struct net_device *dev, int (*unsync)(struct net_device *, const unsigned char *)) { __hw_addr_unsync_dev(&dev->uc, dev, unsync); } /* Functions used for multicast addresses handling */ int dev_mc_add(struct net_device *dev, const unsigned char *addr); int dev_mc_add_global(struct net_device *dev, const unsigned char *addr); int dev_mc_add_excl(struct net_device *dev, const unsigned char *addr); int dev_mc_del(struct net_device *dev, const unsigned char *addr); int dev_mc_del_global(struct net_device *dev, const unsigned char *addr); int dev_mc_sync(struct net_device *to, struct net_device *from); int dev_mc_sync_multiple(struct net_device *to, struct net_device *from); void dev_mc_unsync(struct net_device *to, struct net_device *from); void dev_mc_flush(struct net_device *dev); void dev_mc_init(struct net_device *dev); /** * __dev_mc_sync - Synchonize device's multicast list * @dev: device to sync * @sync: function to call if address should be added * @unsync: function to call if address should be removed * * Add newly added addresses to the interface, and release * addresses that have been deleted. */ static inline int __dev_mc_sync(struct net_device *dev, int (*sync)(struct net_device *, const unsigned char *), int (*unsync)(struct net_device *, const unsigned char *)) { return __hw_addr_sync_dev(&dev->mc, dev, sync, unsync); } /** * __dev_mc_unsync - Remove synchronized addresses from device * @dev: device to sync * @unsync: function to call if address should be removed * * Remove all addresses that were added to the device by dev_mc_sync(). */ static inline void __dev_mc_unsync(struct net_device *dev, int (*unsync)(struct net_device *, const unsigned char *)) { __hw_addr_unsync_dev(&dev->mc, dev, unsync); } /* Functions used for secondary unicast and multicast support */ void dev_set_rx_mode(struct net_device *dev); int dev_set_promiscuity(struct net_device *dev, int inc); int dev_set_allmulti(struct net_device *dev, int inc); void netdev_state_change(struct net_device *dev); void __netdev_notify_peers(struct net_device *dev); void netdev_notify_peers(struct net_device *dev); void netdev_features_change(struct net_device *dev); /* Load a device via the kmod */ void dev_load(struct net *net, const char *name); struct rtnl_link_stats64 *dev_get_stats(struct net_device *dev, struct rtnl_link_stats64 *storage); void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64, const struct net_device_stats *netdev_stats); void dev_fetch_sw_netstats(struct rtnl_link_stats64 *s, const struct pcpu_sw_netstats __percpu *netstats); void dev_get_tstats64(struct net_device *dev, struct rtnl_link_stats64 *s); enum { NESTED_SYNC_IMM_BIT, NESTED_SYNC_TODO_BIT, }; #define __NESTED_SYNC_BIT(bit) ((u32)1 << (bit)) #define __NESTED_SYNC(name) __NESTED_SYNC_BIT(NESTED_SYNC_ ## name ## _BIT) #define NESTED_SYNC_IMM __NESTED_SYNC(IMM) #define NESTED_SYNC_TODO __NESTED_SYNC(TODO) struct netdev_nested_priv { unsigned char flags; void *data; }; bool netdev_has_upper_dev(struct net_device *dev, struct net_device *upper_dev); struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev, struct list_head **iter); /* iterate through upper list, must be called under RCU read lock */ #define netdev_for_each_upper_dev_rcu(dev, updev, iter) \ for (iter = &(dev)->adj_list.upper, \ updev = netdev_upper_get_next_dev_rcu(dev, &(iter)); \ updev; \ updev = netdev_upper_get_next_dev_rcu(dev, &(iter))) int netdev_walk_all_upper_dev_rcu(struct net_device *dev, int (*fn)(struct net_device *upper_dev, struct netdev_nested_priv *priv), struct netdev_nested_priv *priv); bool netdev_has_upper_dev_all_rcu(struct net_device *dev, struct net_device *upper_dev); bool netdev_has_any_upper_dev(struct net_device *dev); void *netdev_lower_get_next_private(struct net_device *dev, struct list_head **iter); void *netdev_lower_get_next_private_rcu(struct net_device *dev, struct list_head **iter); #define netdev_for_each_lower_private(dev, priv, iter) \ for (iter = (dev)->adj_list.lower.next, \ priv = netdev_lower_get_next_private(dev, &(iter)); \ priv; \ priv = netdev_lower_get_next_private(dev, &(iter))) #define netdev_for_each_lower_private_rcu(dev, priv, iter) \ for (iter = &(dev)->adj_list.lower, \ priv = netdev_lower_get_next_private_rcu(dev, &(iter)); \ priv; \ priv = netdev_lower_get_next_private_rcu(dev, &(iter))) void *netdev_lower_get_next(struct net_device *dev, struct list_head **iter); #define netdev_for_each_lower_dev(dev, ldev, iter) \ for (iter = (dev)->adj_list.lower.next, \ ldev = netdev_lower_get_next(dev, &(iter)); \ ldev; \ ldev = netdev_lower_get_next(dev, &(iter))) struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev, struct list_head **iter); int netdev_walk_all_lower_dev(struct net_device *dev, int (*fn)(struct net_device *lower_dev, struct netdev_nested_priv *priv), struct netdev_nested_priv *priv); int netdev_walk_all_lower_dev_rcu(struct net_device *dev, int (*fn)(struct net_device *lower_dev, struct netdev_nested_priv *priv), struct netdev_nested_priv *priv); void *netdev_adjacent_get_private(struct list_head *adj_list); void *netdev_lower_get_first_private_rcu(struct net_device *dev); struct net_device *netdev_master_upper_dev_get(struct net_device *dev); struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev); int netdev_upper_dev_link(struct net_device *dev, struct net_device *upper_dev, struct netlink_ext_ack *extack); int netdev_master_upper_dev_link(struct net_device *dev, struct net_device *upper_dev, void *upper_priv, void *upper_info, struct netlink_ext_ack *extack); void netdev_upper_dev_unlink(struct net_device *dev, struct net_device *upper_dev); int netdev_adjacent_change_prepare(struct net_device *old_dev, struct net_device *new_dev, struct net_device *dev, struct netlink_ext_ack *extack); void netdev_adjacent_change_commit(struct net_device *old_dev, struct net_device *new_dev, struct net_device *dev); void netdev_adjacent_change_abort(struct net_device *old_dev, struct net_device *new_dev, struct net_device *dev); void netdev_adjacent_rename_links(struct net_device *dev, char *oldname); void *netdev_lower_dev_get_private(struct net_device *dev, struct net_device *lower_dev); void netdev_lower_state_changed(struct net_device *lower_dev, void *lower_state_info); /* RSS keys are 40 or 52 bytes long */ #define NETDEV_RSS_KEY_LEN 52 extern u8 netdev_rss_key[NETDEV_RSS_KEY_LEN] __read_mostly; void netdev_rss_key_fill(void *buffer, size_t len); int skb_checksum_help(struct sk_buff *skb); int skb_crc32c_csum_help(struct sk_buff *skb); int skb_csum_hwoffload_help(struct sk_buff *skb, const netdev_features_t features); struct netdev_bonding_info { ifslave slave; ifbond master; }; struct netdev_notifier_bonding_info { struct netdev_notifier_info info; /* must be first */ struct netdev_bonding_info bonding_info; }; void netdev_bonding_info_change(struct net_device *dev, struct netdev_bonding_info *bonding_info); #if IS_ENABLED(CONFIG_ETHTOOL_NETLINK) void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data); #else static inline void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data) { } #endif __be16 skb_network_protocol(struct sk_buff *skb, int *depth); static inline bool can_checksum_protocol(netdev_features_t features, __be16 protocol) { if (protocol == htons(ETH_P_FCOE)) return !!(features & NETIF_F_FCOE_CRC); /* Assume this is an IP checksum (not SCTP CRC) */ if (features & NETIF_F_HW_CSUM) { /* Can checksum everything */ return true; } switch (protocol) { case htons(ETH_P_IP): return !!(features & NETIF_F_IP_CSUM); case htons(ETH_P_IPV6): return !!(features & NETIF_F_IPV6_CSUM); default: return false; } } #ifdef CONFIG_BUG void netdev_rx_csum_fault(struct net_device *dev, struct sk_buff *skb); #else static inline void netdev_rx_csum_fault(struct net_device *dev, struct sk_buff *skb) { } #endif /* rx skb timestamps */ void net_enable_timestamp(void); void net_disable_timestamp(void); static inline ktime_t netdev_get_tstamp(struct net_device *dev, const struct skb_shared_hwtstamps *hwtstamps, bool cycles) { const struct net_device_ops *ops = dev->netdev_ops; if (ops->ndo_get_tstamp) return ops->ndo_get_tstamp(dev, hwtstamps, cycles); return hwtstamps->hwtstamp; } static inline netdev_tx_t __netdev_start_xmit(const struct net_device_ops *ops, struct sk_buff *skb, struct net_device *dev, bool more) { __this_cpu_write(softnet_data.xmit.more, more); return ops->ndo_start_xmit(skb, dev); } static inline bool netdev_xmit_more(void) { return __this_cpu_read(softnet_data.xmit.more); } static inline netdev_tx_t netdev_start_xmit(struct sk_buff *skb, struct net_device *dev, struct netdev_queue *txq, bool more) { const struct net_device_ops *ops = dev->netdev_ops; netdev_tx_t rc; rc = __netdev_start_xmit(ops, skb, dev, more); if (rc == NETDEV_TX_OK) txq_trans_update(txq); return rc; } int netdev_class_create_file_ns(const struct class_attribute *class_attr, const void *ns); void netdev_class_remove_file_ns(const struct class_attribute *class_attr, const void *ns); extern const struct kobj_ns_type_operations net_ns_type_operations; const char *netdev_drivername(const struct net_device *dev); static inline netdev_features_t netdev_intersect_features(netdev_features_t f1, netdev_features_t f2) { if ((f1 ^ f2) & NETIF_F_HW_CSUM) { if (f1 & NETIF_F_HW_CSUM) f1 |= (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM); else f2 |= (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM); } return f1 & f2; } static inline netdev_features_t netdev_get_wanted_features( struct net_device *dev) { return (dev->features & ~dev->hw_features) | dev->wanted_features; } netdev_features_t netdev_increment_features(netdev_features_t all, netdev_features_t one, netdev_features_t mask); /* Allow TSO being used on stacked device : * Performing the GSO segmentation before last device * is a performance improvement. */ static inline netdev_features_t netdev_add_tso_features(netdev_features_t features, netdev_features_t mask) { return netdev_increment_features(features, NETIF_F_ALL_TSO, mask); } int __netdev_update_features(struct net_device *dev); void netdev_update_features(struct net_device *dev); void netdev_change_features(struct net_device *dev); void netif_stacked_transfer_operstate(const struct net_device *rootdev, struct net_device *dev); netdev_features_t passthru_features_check(struct sk_buff *skb, struct net_device *dev, netdev_features_t features); netdev_features_t netif_skb_features(struct sk_buff *skb); void skb_warn_bad_offload(const struct sk_buff *skb); static inline bool net_gso_ok(netdev_features_t features, int gso_type) { netdev_features_t feature = (netdev_features_t)gso_type << NETIF_F_GSO_SHIFT; /* check flags correspondence */ BUILD_BUG_ON(SKB_GSO_TCPV4 != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_DODGY != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_MANGLEID >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_TCPV6 != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_FCOE != (NETIF_F_FSO >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_GRE != (NETIF_F_GSO_GRE >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_GRE_CSUM != (NETIF_F_GSO_GRE_CSUM >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_IPXIP4 != (NETIF_F_GSO_IPXIP4 >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_IPXIP6 != (NETIF_F_GSO_IPXIP6 >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_UDP_TUNNEL != (NETIF_F_GSO_UDP_TUNNEL >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_UDP_TUNNEL_CSUM != (NETIF_F_GSO_UDP_TUNNEL_CSUM >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_PARTIAL != (NETIF_F_GSO_PARTIAL >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_TUNNEL_REMCSUM != (NETIF_F_GSO_TUNNEL_REMCSUM >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_SCTP != (NETIF_F_GSO_SCTP >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_ESP != (NETIF_F_GSO_ESP >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_UDP != (NETIF_F_GSO_UDP >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_UDP_L4 != (NETIF_F_GSO_UDP_L4 >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_FRAGLIST != (NETIF_F_GSO_FRAGLIST >> NETIF_F_GSO_SHIFT)); return (features & feature) == feature; } static inline bool skb_gso_ok(struct sk_buff *skb, netdev_features_t features) { return net_gso_ok(features, skb_shinfo(skb)->gso_type) && (!skb_has_frag_list(skb) || (features & NETIF_F_FRAGLIST)); } static inline bool netif_needs_gso(struct sk_buff *skb, netdev_features_t features) { return skb_is_gso(skb) && (!skb_gso_ok(skb, features) || unlikely((skb->ip_summed != CHECKSUM_PARTIAL) && (skb->ip_summed != CHECKSUM_UNNECESSARY))); } void netif_set_tso_max_size(struct net_device *dev, unsigned int size); void netif_set_tso_max_segs(struct net_device *dev, unsigned int segs); void netif_inherit_tso_max(struct net_device *to, const struct net_device *from); static inline bool netif_is_macsec(const struct net_device *dev) { return dev->priv_flags & IFF_MACSEC; } static inline bool netif_is_macvlan(const struct net_device *dev) { return dev->priv_flags & IFF_MACVLAN; } static inline bool netif_is_macvlan_port(const struct net_device *dev) { return dev->priv_flags & IFF_MACVLAN_PORT; } static inline bool netif_is_bond_master(const struct net_device *dev) { return dev->flags & IFF_MASTER && dev->priv_flags & IFF_BONDING; } static inline bool netif_is_bond_slave(const struct net_device *dev) { return dev->flags & IFF_SLAVE && dev->priv_flags & IFF_BONDING; } static inline bool netif_supports_nofcs(struct net_device *dev) { return dev->priv_flags & IFF_SUPP_NOFCS; } static inline bool netif_has_l3_rx_handler(const struct net_device *dev) { return dev->priv_flags & IFF_L3MDEV_RX_HANDLER; } static inline bool netif_is_l3_master(const struct net_device *dev) { return dev->priv_flags & IFF_L3MDEV_MASTER; } static inline bool netif_is_l3_slave(const struct net_device *dev) { return dev->priv_flags & IFF_L3MDEV_SLAVE; } static inline int dev_sdif(const struct net_device *dev) { #ifdef CONFIG_NET_L3_MASTER_DEV if (netif_is_l3_slave(dev)) return dev->ifindex; #endif return 0; } static inline bool netif_is_bridge_master(const struct net_device *dev) { return dev->priv_flags & IFF_EBRIDGE; } static inline bool netif_is_bridge_port(const struct net_device *dev) { return dev->priv_flags & IFF_BRIDGE_PORT; } static inline bool netif_is_ovs_master(const struct net_device *dev) { return dev->priv_flags & IFF_OPENVSWITCH; } static inline bool netif_is_ovs_port(const struct net_device *dev) { return dev->priv_flags & IFF_OVS_DATAPATH; } static inline bool netif_is_any_bridge_master(const struct net_device *dev) { return netif_is_bridge_master(dev) || netif_is_ovs_master(dev); } static inline bool netif_is_any_bridge_port(const struct net_device *dev) { return netif_is_bridge_port(dev) || netif_is_ovs_port(dev); } static inline bool netif_is_team_master(const struct net_device *dev) { return dev->priv_flags & IFF_TEAM; } static inline bool netif_is_team_port(const struct net_device *dev) { return dev->priv_flags & IFF_TEAM_PORT; } static inline bool netif_is_lag_master(const struct net_device *dev) { return netif_is_bond_master(dev) || netif_is_team_master(dev); } static inline bool netif_is_lag_port(const struct net_device *dev) { return netif_is_bond_slave(dev) || netif_is_team_port(dev); } static inline bool netif_is_rxfh_configured(const struct net_device *dev) { return dev->priv_flags & IFF_RXFH_CONFIGURED; } static inline bool netif_is_failover(const struct net_device *dev) { return dev->priv_flags & IFF_FAILOVER; } static inline bool netif_is_failover_slave(const struct net_device *dev) { return dev->priv_flags & IFF_FAILOVER_SLAVE; } /* This device needs to keep skb dst for qdisc enqueue or ndo_start_xmit() */ static inline void netif_keep_dst(struct net_device *dev) { dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM); } /* return true if dev can't cope with mtu frames that need vlan tag insertion */ static inline bool netif_reduces_vlan_mtu(struct net_device *dev) { /* TODO: reserve and use an additional IFF bit, if we get more users */ return netif_is_macsec(dev); } extern struct pernet_operations __net_initdata loopback_net_ops; /* Logging, debugging and troubleshooting/diagnostic helpers. */ /* netdev_printk helpers, similar to dev_printk */ static inline const char *netdev_name(const struct net_device *dev) { if (!dev->name[0] || strchr(dev->name, '%')) return "(unnamed net_device)"; return dev->name; } static inline const char *netdev_reg_state(const struct net_device *dev) { u8 reg_state = READ_ONCE(dev->reg_state); switch (reg_state) { case NETREG_UNINITIALIZED: return " (uninitialized)"; case NETREG_REGISTERED: return ""; case NETREG_UNREGISTERING: return " (unregistering)"; case NETREG_UNREGISTERED: return " (unregistered)"; case NETREG_RELEASED: return " (released)"; case NETREG_DUMMY: return " (dummy)"; } WARN_ONCE(1, "%s: unknown reg_state %d\n", dev->name, reg_state); return " (unknown)"; } #define MODULE_ALIAS_NETDEV(device) \ MODULE_ALIAS("netdev-" device) /* * netdev_WARN() acts like dev_printk(), but with the key difference * of using a WARN/WARN_ON to get the message out, including the * file/line information and a backtrace. */ #define netdev_WARN(dev, format, args...) \ WARN(1, "netdevice: %s%s: " format, netdev_name(dev), \ netdev_reg_state(dev), ##args) #define netdev_WARN_ONCE(dev, format, args...) \ WARN_ONCE(1, "netdevice: %s%s: " format, netdev_name(dev), \ netdev_reg_state(dev), ##args) /* * The list of packet types we will receive (as opposed to discard) * and the routines to invoke. * * Why 16. Because with 16 the only overlap we get on a hash of the * low nibble of the protocol value is RARP/SNAP/X.25. * * 0800 IP * 0001 802.3 * 0002 AX.25 * 0004 802.2 * 8035 RARP * 0005 SNAP * 0805 X.25 * 0806 ARP * 8137 IPX * 0009 Localtalk * 86DD IPv6 */ #define PTYPE_HASH_SIZE (16) #define PTYPE_HASH_MASK (PTYPE_HASH_SIZE - 1) extern struct list_head ptype_base[PTYPE_HASH_SIZE] __read_mostly; extern struct net_device *blackhole_netdev; /* Note: Avoid these macros in fast path, prefer per-cpu or per-queue counters. */ #define DEV_STATS_INC(DEV, FIELD) atomic_long_inc(&(DEV)->stats.__##FIELD) #define DEV_STATS_ADD(DEV, FIELD, VAL) \ atomic_long_add((VAL), &(DEV)->stats.__##FIELD) #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD) #endif /* _LINUX_NETDEVICE_H */ |
1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 | // SPDX-License-Identifier: GPL-2.0-only /* * Driver for the ov7660 sensor * * Copyright (C) 2009 Erik Andrén * Copyright (C) 2007 Ilyes Gouta. Based on the m5603x Linux Driver Project. * Copyright (C) 2005 m5603x Linux Driver Project <m5602@x3ng.com.br> * * Portions of code to USB interface and ALi driver software, * Copyright (c) 2006 Willem Duinker * v4l2 interface modeled after the V4L2 driver * for SN9C10x PC Camera Controllers */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include "m5602_ov7660.h" static int ov7660_s_ctrl(struct v4l2_ctrl *ctrl); static void ov7660_dump_registers(struct sd *sd); static const unsigned char preinit_ov7660[][4] = { {BRIDGE, M5602_XB_MCU_CLK_DIV, 0x02}, {BRIDGE, M5602_XB_MCU_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0d}, {BRIDGE, M5602_XB_SENSOR_CTRL, 0x00}, {BRIDGE, M5602_XB_GPIO_DIR, 0x03}, {BRIDGE, M5602_XB_GPIO_DIR, 0x03}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0c}, {SENSOR, OV7660_OFON, 0x0c}, {SENSOR, OV7660_COM2, 0x11}, {SENSOR, OV7660_COM7, 0x05}, {BRIDGE, M5602_XB_GPIO_DIR, 0x01}, {BRIDGE, M5602_XB_GPIO_DAT, 0x04}, {BRIDGE, M5602_XB_GPIO_EN_H, 0x06}, {BRIDGE, M5602_XB_GPIO_DIR_H, 0x06}, {BRIDGE, M5602_XB_GPIO_DAT_H, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x08}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0c}, {BRIDGE, M5602_XB_GPIO_DIR, 0x05}, {BRIDGE, M5602_XB_GPIO_DAT, 0x00}, {BRIDGE, M5602_XB_GPIO_EN_H, 0x06}, {BRIDGE, M5602_XB_GPIO_EN_L, 0x00} }; static const unsigned char init_ov7660[][4] = { {BRIDGE, M5602_XB_MCU_CLK_DIV, 0x02}, {BRIDGE, M5602_XB_MCU_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0d}, {BRIDGE, M5602_XB_SENSOR_CTRL, 0x00}, {BRIDGE, M5602_XB_GPIO_DIR, 0x01}, {BRIDGE, M5602_XB_GPIO_DIR, 0x01}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0c}, {BRIDGE, M5602_XB_GPIO_DIR, 0x05}, {BRIDGE, M5602_XB_GPIO_DAT, 0x00}, {BRIDGE, M5602_XB_GPIO_EN_H, 0x06}, {BRIDGE, M5602_XB_GPIO_EN_L, 0x00}, {SENSOR, OV7660_COM7, 0x80}, {SENSOR, OV7660_CLKRC, 0x80}, {SENSOR, OV7660_COM9, 0x4c}, {SENSOR, OV7660_OFON, 0x43}, {SENSOR, OV7660_COM12, 0x28}, {SENSOR, OV7660_COM8, 0x00}, {SENSOR, OV7660_COM10, 0x40}, {SENSOR, OV7660_HSTART, 0x0c}, {SENSOR, OV7660_HSTOP, 0x61}, {SENSOR, OV7660_HREF, 0xa4}, {SENSOR, OV7660_PSHFT, 0x0b}, {SENSOR, OV7660_VSTART, 0x01}, {SENSOR, OV7660_VSTOP, 0x7a}, {SENSOR, OV7660_VSTOP, 0x00}, {SENSOR, OV7660_COM7, 0x05}, {SENSOR, OV7660_COM6, 0x42}, {SENSOR, OV7660_BBIAS, 0x94}, {SENSOR, OV7660_GbBIAS, 0x94}, {SENSOR, OV7660_RSVD29, 0x94}, {SENSOR, OV7660_RBIAS, 0x94}, {SENSOR, OV7660_COM1, 0x00}, {SENSOR, OV7660_AECH, 0x00}, {SENSOR, OV7660_AECHH, 0x00}, {SENSOR, OV7660_ADC, 0x05}, {SENSOR, OV7660_COM13, 0x00}, {SENSOR, OV7660_RSVDA1, 0x23}, {SENSOR, OV7660_TSLB, 0x0d}, {SENSOR, OV7660_HV, 0x80}, {SENSOR, OV7660_LCC1, 0x00}, {SENSOR, OV7660_LCC2, 0x00}, {SENSOR, OV7660_LCC3, 0x10}, {SENSOR, OV7660_LCC4, 0x40}, {SENSOR, OV7660_LCC5, 0x01}, {SENSOR, OV7660_AECH, 0x20}, {SENSOR, OV7660_COM1, 0x00}, {SENSOR, OV7660_OFON, 0x0c}, {SENSOR, OV7660_COM2, 0x11}, {SENSOR, OV7660_COM7, 0x05}, {BRIDGE, M5602_XB_GPIO_DIR, 0x01}, {BRIDGE, M5602_XB_GPIO_DAT, 0x04}, {BRIDGE, M5602_XB_GPIO_EN_H, 0x06}, {BRIDGE, M5602_XB_GPIO_DIR_H, 0x06}, {BRIDGE, M5602_XB_GPIO_DAT_H, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x08}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0c}, {BRIDGE, M5602_XB_GPIO_DIR, 0x05}, {BRIDGE, M5602_XB_GPIO_DAT, 0x00}, {BRIDGE, M5602_XB_GPIO_EN_H, 0x06}, {BRIDGE, M5602_XB_GPIO_EN_L, 0x00}, {SENSOR, OV7660_AECH, 0x5f}, {SENSOR, OV7660_COM1, 0x03}, {SENSOR, OV7660_OFON, 0x0c}, {SENSOR, OV7660_COM2, 0x11}, {SENSOR, OV7660_COM7, 0x05}, {BRIDGE, M5602_XB_GPIO_DIR, 0x01}, {BRIDGE, M5602_XB_GPIO_DAT, 0x04}, {BRIDGE, M5602_XB_GPIO_EN_H, 0x06}, {BRIDGE, M5602_XB_GPIO_DIR_H, 0x06}, {BRIDGE, M5602_XB_GPIO_DAT_H, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x08}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0c}, {BRIDGE, M5602_XB_GPIO_DIR, 0x05}, {BRIDGE, M5602_XB_GPIO_DAT, 0x00}, {BRIDGE, M5602_XB_GPIO_EN_H, 0x06}, {BRIDGE, M5602_XB_GPIO_EN_L, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x06}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, {BRIDGE, M5602_XB_ADC_CTRL, 0xc0}, {BRIDGE, M5602_XB_SENSOR_TYPE, 0x0c}, {BRIDGE, M5602_XB_LINE_OF_FRAME_H, 0x81}, {BRIDGE, M5602_XB_PIX_OF_LINE_H, 0x82}, {BRIDGE, M5602_XB_SIG_INI, 0x01}, {BRIDGE, M5602_XB_VSYNC_PARA, 0x00}, {BRIDGE, M5602_XB_VSYNC_PARA, 0x08}, {BRIDGE, M5602_XB_VSYNC_PARA, 0x00}, {BRIDGE, M5602_XB_VSYNC_PARA, 0x00}, {BRIDGE, M5602_XB_VSYNC_PARA, 0x01}, {BRIDGE, M5602_XB_VSYNC_PARA, 0xec}, {BRIDGE, M5602_XB_VSYNC_PARA, 0x00}, {BRIDGE, M5602_XB_VSYNC_PARA, 0x00}, {BRIDGE, M5602_XB_SIG_INI, 0x00}, {BRIDGE, M5602_XB_SIG_INI, 0x02}, {BRIDGE, M5602_XB_HSYNC_PARA, 0x00}, {BRIDGE, M5602_XB_HSYNC_PARA, 0x27}, {BRIDGE, M5602_XB_HSYNC_PARA, 0x02}, {BRIDGE, M5602_XB_HSYNC_PARA, 0xa7}, {BRIDGE, M5602_XB_SIG_INI, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_DIV, 0x00}, {BRIDGE, M5602_XB_SEN_CLK_CTRL, 0xb0}, }; static struct v4l2_pix_format ov7660_modes[] = { { 640, 480, V4L2_PIX_FMT_SBGGR8, V4L2_FIELD_NONE, .sizeimage = 640 * 480, .bytesperline = 640, .colorspace = V4L2_COLORSPACE_SRGB, .priv = 0 } }; static const struct v4l2_ctrl_ops ov7660_ctrl_ops = { .s_ctrl = ov7660_s_ctrl, }; int ov7660_probe(struct sd *sd) { int err = 0, i; u8 prod_id = 0, ver_id = 0; if (force_sensor) { if (force_sensor == OV7660_SENSOR) { pr_info("Forcing an %s sensor\n", ov7660.name); goto sensor_found; } /* If we want to force another sensor, don't try to probe this one */ return -ENODEV; } /* Do the preinit */ for (i = 0; i < ARRAY_SIZE(preinit_ov7660) && !err; i++) { u8 data[2]; if (preinit_ov7660[i][0] == BRIDGE) { err = m5602_write_bridge(sd, preinit_ov7660[i][1], preinit_ov7660[i][2]); } else { data[0] = preinit_ov7660[i][2]; err = m5602_write_sensor(sd, preinit_ov7660[i][1], data, 1); } } if (err < 0) return err; if (m5602_read_sensor(sd, OV7660_PID, &prod_id, 1)) return -ENODEV; if (m5602_read_sensor(sd, OV7660_VER, &ver_id, 1)) return -ENODEV; pr_info("Sensor reported 0x%x%x\n", prod_id, ver_id); if ((prod_id == 0x76) && (ver_id == 0x60)) { pr_info("Detected a ov7660 sensor\n"); goto sensor_found; } return -ENODEV; sensor_found: sd->gspca_dev.cam.cam_mode = ov7660_modes; sd->gspca_dev.cam.nmodes = ARRAY_SIZE(ov7660_modes); return 0; } int ov7660_init(struct sd *sd) { int i, err; /* Init the sensor */ for (i = 0; i < ARRAY_SIZE(init_ov7660); i++) { u8 data[2]; if (init_ov7660[i][0] == BRIDGE) { err = m5602_write_bridge(sd, init_ov7660[i][1], init_ov7660[i][2]); } else { data[0] = init_ov7660[i][2]; err = m5602_write_sensor(sd, init_ov7660[i][1], data, 1); } if (err < 0) return err; } if (dump_sensor) ov7660_dump_registers(sd); return 0; } int ov7660_init_controls(struct sd *sd) { struct v4l2_ctrl_handler *hdl = &sd->gspca_dev.ctrl_handler; sd->gspca_dev.vdev.ctrl_handler = hdl; v4l2_ctrl_handler_init(hdl, 6); v4l2_ctrl_new_std(hdl, &ov7660_ctrl_ops, V4L2_CID_AUTO_WHITE_BALANCE, 0, 1, 1, 1); v4l2_ctrl_new_std_menu(hdl, &ov7660_ctrl_ops, V4L2_CID_EXPOSURE_AUTO, 1, 0, V4L2_EXPOSURE_AUTO); sd->autogain = v4l2_ctrl_new_std(hdl, &ov7660_ctrl_ops, V4L2_CID_AUTOGAIN, 0, 1, 1, 1); sd->gain = v4l2_ctrl_new_std(hdl, &ov7660_ctrl_ops, V4L2_CID_GAIN, 0, 255, 1, OV7660_DEFAULT_GAIN); sd->hflip = v4l2_ctrl_new_std(hdl, &ov7660_ctrl_ops, V4L2_CID_HFLIP, 0, 1, 1, 0); sd->vflip = v4l2_ctrl_new_std(hdl, &ov7660_ctrl_ops, V4L2_CID_VFLIP, 0, 1, 1, 0); if (hdl->error) { pr_err("Could not initialize controls\n"); return hdl->error; } v4l2_ctrl_auto_cluster(2, &sd->autogain, 0, false); v4l2_ctrl_cluster(2, &sd->hflip); return 0; } int ov7660_start(struct sd *sd) { return 0; } int ov7660_stop(struct sd *sd) { return 0; } void ov7660_disconnect(struct sd *sd) { ov7660_stop(sd); sd->sensor = NULL; } static int ov7660_set_gain(struct gspca_dev *gspca_dev, __s32 val) { int err; u8 i2c_data = val; struct sd *sd = (struct sd *) gspca_dev; gspca_dbg(gspca_dev, D_CONF, "Setting gain to %d\n", val); err = m5602_write_sensor(sd, OV7660_GAIN, &i2c_data, 1); return err; } static int ov7660_set_auto_white_balance(struct gspca_dev *gspca_dev, __s32 val) { int err; u8 i2c_data; struct sd *sd = (struct sd *) gspca_dev; gspca_dbg(gspca_dev, D_CONF, "Set auto white balance to %d\n", val); err = m5602_read_sensor(sd, OV7660_COM8, &i2c_data, 1); if (err < 0) return err; i2c_data = ((i2c_data & 0xfd) | ((val & 0x01) << 1)); err = m5602_write_sensor(sd, OV7660_COM8, &i2c_data, 1); return err; } static int ov7660_set_auto_gain(struct gspca_dev *gspca_dev, __s32 val) { int err; u8 i2c_data; struct sd *sd = (struct sd *) gspca_dev; gspca_dbg(gspca_dev, D_CONF, "Set auto gain control to %d\n", val); err = m5602_read_sensor(sd, OV7660_COM8, &i2c_data, 1); if (err < 0) return err; i2c_data = ((i2c_data & 0xfb) | ((val & 0x01) << 2)); return m5602_write_sensor(sd, OV7660_COM8, &i2c_data, 1); } static int ov7660_set_auto_exposure(struct gspca_dev *gspca_dev, __s32 val) { int err; u8 i2c_data; struct sd *sd = (struct sd *) gspca_dev; gspca_dbg(gspca_dev, D_CONF, "Set auto exposure control to %d\n", val); err = m5602_read_sensor(sd, OV7660_COM8, &i2c_data, 1); if (err < 0) return err; val = (val == V4L2_EXPOSURE_AUTO); i2c_data = ((i2c_data & 0xfe) | ((val & 0x01) << 0)); return m5602_write_sensor(sd, OV7660_COM8, &i2c_data, 1); } static int ov7660_set_hvflip(struct gspca_dev *gspca_dev) { int err; u8 i2c_data; struct sd *sd = (struct sd *) gspca_dev; gspca_dbg(gspca_dev, D_CONF, "Set hvflip to %d, %d\n", sd->hflip->val, sd->vflip->val); i2c_data = (sd->hflip->val << 5) | (sd->vflip->val << 4); err = m5602_write_sensor(sd, OV7660_MVFP, &i2c_data, 1); return err; } static int ov7660_s_ctrl(struct v4l2_ctrl *ctrl) { struct gspca_dev *gspca_dev = container_of(ctrl->handler, struct gspca_dev, ctrl_handler); struct sd *sd = (struct sd *) gspca_dev; int err; if (!gspca_dev->streaming) return 0; switch (ctrl->id) { case V4L2_CID_AUTO_WHITE_BALANCE: err = ov7660_set_auto_white_balance(gspca_dev, ctrl->val); break; case V4L2_CID_EXPOSURE_AUTO: err = ov7660_set_auto_exposure(gspca_dev, ctrl->val); break; case V4L2_CID_AUTOGAIN: err = ov7660_set_auto_gain(gspca_dev, ctrl->val); if (err || ctrl->val) return err; err = ov7660_set_gain(gspca_dev, sd->gain->val); break; case V4L2_CID_HFLIP: err = ov7660_set_hvflip(gspca_dev); break; default: return -EINVAL; } return err; } static void ov7660_dump_registers(struct sd *sd) { int address; pr_info("Dumping the ov7660 register state\n"); for (address = 0; address < 0xa9; address++) { u8 value; m5602_read_sensor(sd, address, &value, 1); pr_info("register 0x%x contains 0x%x\n", address, value); } pr_info("ov7660 register state dump complete\n"); pr_info("Probing for which registers that are read/write\n"); for (address = 0; address < 0xff; address++) { u8 old_value, ctrl_value; u8 test_value[2] = {0xff, 0xff}; m5602_read_sensor(sd, address, &old_value, 1); m5602_write_sensor(sd, address, test_value, 1); m5602_read_sensor(sd, address, &ctrl_value, 1); if (ctrl_value == test_value[0]) pr_info("register 0x%x is writeable\n", address); else pr_info("register 0x%x is read only\n", address); /* Restore original value */ m5602_write_sensor(sd, address, &old_value, 1); } } |
4 4 4 4 4 4 4 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 | // SPDX-License-Identifier: GPL-2.0-or-later /* AFS cell alias detection * * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #include <linux/slab.h> #include <linux/sched.h> #include <linux/namei.h> #include <keys/rxrpc-type.h> #include "internal.h" /* * Sample a volume. */ static struct afs_volume *afs_sample_volume(struct afs_cell *cell, struct key *key, const char *name, unsigned int namelen) { struct afs_volume *volume; struct afs_fs_context fc = { .type = 0, /* Explicitly leave it to the VLDB */ .volnamesz = namelen, .volname = name, .net = cell->net, .cell = cell, .key = key, /* This might need to be something */ }; volume = afs_create_volume(&fc); _leave(" = %p", volume); return volume; } /* * Compare the address lists of a pair of fileservers. */ static int afs_compare_fs_alists(const struct afs_server *server_a, const struct afs_server *server_b) { const struct afs_addr_list *la, *lb; int a = 0, b = 0, addr_matches = 0; la = rcu_dereference(server_a->endpoint_state)->addresses; lb = rcu_dereference(server_b->endpoint_state)->addresses; while (a < la->nr_addrs && b < lb->nr_addrs) { unsigned long pa = (unsigned long)la->addrs[a].peer; unsigned long pb = (unsigned long)lb->addrs[b].peer; long diff = pa - pb; if (diff < 0) { a++; } else if (diff > 0) { b++; } else { addr_matches++; a++; b++; } } return addr_matches; } /* * Compare the fileserver lists of two volumes. The server lists are sorted in * order of ascending UUID. */ static int afs_compare_volume_slists(const struct afs_volume *vol_a, const struct afs_volume *vol_b) { const struct afs_server_list *la, *lb; int i, a = 0, b = 0, uuid_matches = 0, addr_matches = 0; la = rcu_dereference(vol_a->servers); lb = rcu_dereference(vol_b->servers); for (i = 0; i < AFS_MAXTYPES; i++) if (vol_a->vids[i] != vol_b->vids[i]) return 0; while (a < la->nr_servers && b < lb->nr_servers) { const struct afs_server *server_a = la->servers[a].server; const struct afs_server *server_b = lb->servers[b].server; int diff = memcmp(&server_a->uuid, &server_b->uuid, sizeof(uuid_t)); if (diff < 0) { a++; } else if (diff > 0) { b++; } else { uuid_matches++; addr_matches += afs_compare_fs_alists(server_a, server_b); a++; b++; } } _leave(" = %d [um %d]", addr_matches, uuid_matches); return addr_matches; } /* * Compare root.cell volumes. */ static int afs_compare_cell_roots(struct afs_cell *cell) { struct afs_cell *p; _enter(""); rcu_read_lock(); hlist_for_each_entry_rcu(p, &cell->net->proc_cells, proc_link) { if (p == cell || p->alias_of) continue; if (!p->root_volume) continue; /* Ignore cells that don't have a root.cell volume. */ if (afs_compare_volume_slists(cell->root_volume, p->root_volume) != 0) goto is_alias; } rcu_read_unlock(); _leave(" = 0"); return 0; is_alias: rcu_read_unlock(); cell->alias_of = afs_use_cell(p, afs_cell_trace_use_alias); return 1; } /* * Query the new cell for a volume from a cell we're already using. */ static int afs_query_for_alias_one(struct afs_cell *cell, struct key *key, struct afs_cell *p) { struct afs_volume *volume, *pvol = NULL; int ret; /* Arbitrarily pick a volume from the list. */ read_seqlock_excl(&p->volume_lock); if (!RB_EMPTY_ROOT(&p->volumes)) pvol = afs_get_volume(rb_entry(p->volumes.rb_node, struct afs_volume, cell_node), afs_volume_trace_get_query_alias); read_sequnlock_excl(&p->volume_lock); if (!pvol) return 0; _enter("%s:%s", cell->name, pvol->name); /* And see if it's in the new cell. */ volume = afs_sample_volume(cell, key, pvol->name, pvol->name_len); if (IS_ERR(volume)) { afs_put_volume(pvol, afs_volume_trace_put_query_alias); if (PTR_ERR(volume) != -ENOMEDIUM) return PTR_ERR(volume); /* That volume is not in the new cell, so not an alias */ return 0; } /* The new cell has a like-named volume also - compare volume ID, * server and address lists. */ ret = 0; if (pvol->vid == volume->vid) { rcu_read_lock(); if (afs_compare_volume_slists(volume, pvol)) ret = 1; rcu_read_unlock(); } afs_put_volume(volume, afs_volume_trace_put_query_alias); afs_put_volume(pvol, afs_volume_trace_put_query_alias); return ret; } /* * Query the new cell for volumes we know exist in cells we're already using. */ static int afs_query_for_alias(struct afs_cell *cell, struct key *key) { struct afs_cell *p; _enter("%s", cell->name); if (mutex_lock_interruptible(&cell->net->proc_cells_lock) < 0) return -ERESTARTSYS; hlist_for_each_entry(p, &cell->net->proc_cells, proc_link) { if (p == cell || p->alias_of) continue; if (RB_EMPTY_ROOT(&p->volumes)) continue; if (p->root_volume) continue; /* Ignore cells that have a root.cell volume. */ afs_use_cell(p, afs_cell_trace_use_check_alias); mutex_unlock(&cell->net->proc_cells_lock); if (afs_query_for_alias_one(cell, key, p) != 0) goto is_alias; if (mutex_lock_interruptible(&cell->net->proc_cells_lock) < 0) { afs_unuse_cell(cell->net, p, afs_cell_trace_unuse_check_alias); return -ERESTARTSYS; } afs_unuse_cell(cell->net, p, afs_cell_trace_unuse_check_alias); } mutex_unlock(&cell->net->proc_cells_lock); _leave(" = 0"); return 0; is_alias: cell->alias_of = p; /* Transfer our ref */ return 1; } /* * Look up a VLDB record for a volume. */ static char *afs_vl_get_cell_name(struct afs_cell *cell, struct key *key) { struct afs_vl_cursor vc; char *cell_name = ERR_PTR(-EDESTADDRREQ); bool skipped = false, not_skipped = false; int ret; if (!afs_begin_vlserver_operation(&vc, cell, key)) return ERR_PTR(-ERESTARTSYS); while (afs_select_vlserver(&vc)) { if (!test_bit(AFS_VLSERVER_FL_IS_YFS, &vc.server->flags)) { vc.call_error = -EOPNOTSUPP; skipped = true; continue; } not_skipped = true; cell_name = afs_yfsvl_get_cell_name(&vc); } ret = afs_end_vlserver_operation(&vc); if (skipped && !not_skipped) ret = -EOPNOTSUPP; return ret < 0 ? ERR_PTR(ret) : cell_name; } static int yfs_check_canonical_cell_name(struct afs_cell *cell, struct key *key) { struct afs_cell *master; char *cell_name; cell_name = afs_vl_get_cell_name(cell, key); if (IS_ERR(cell_name)) return PTR_ERR(cell_name); if (strcmp(cell_name, cell->name) == 0) { kfree(cell_name); return 0; } master = afs_lookup_cell(cell->net, cell_name, strlen(cell_name), NULL, false); kfree(cell_name); if (IS_ERR(master)) return PTR_ERR(master); cell->alias_of = master; /* Transfer our ref */ return 1; } static int afs_do_cell_detect_alias(struct afs_cell *cell, struct key *key) { struct afs_volume *root_volume; int ret; _enter("%s", cell->name); ret = yfs_check_canonical_cell_name(cell, key); if (ret != -EOPNOTSUPP) return ret; /* Try and get the root.cell volume for comparison with other cells */ root_volume = afs_sample_volume(cell, key, "root.cell", 9); if (!IS_ERR(root_volume)) { cell->root_volume = root_volume; return afs_compare_cell_roots(cell); } if (PTR_ERR(root_volume) != -ENOMEDIUM) return PTR_ERR(root_volume); /* Okay, this cell doesn't have an root.cell volume. We need to * locate some other random volume and use that to check. */ return afs_query_for_alias(cell, key); } /* * Check to see if a new cell is an alias of a cell we already have. At this * point we have the cell's volume server list. * * Returns 0 if we didn't detect an alias, 1 if we found an alias and an error * if we had problems gathering the data required. In the case the we did * detect an alias, cell->alias_of is set to point to the assumed master. */ int afs_cell_detect_alias(struct afs_cell *cell, struct key *key) { struct afs_net *net = cell->net; int ret; if (mutex_lock_interruptible(&net->cells_alias_lock) < 0) return -ERESTARTSYS; if (test_bit(AFS_CELL_FL_CHECK_ALIAS, &cell->flags)) { ret = afs_do_cell_detect_alias(cell, key); if (ret >= 0) clear_bit_unlock(AFS_CELL_FL_CHECK_ALIAS, &cell->flags); } else { ret = cell->alias_of ? 1 : 0; } mutex_unlock(&net->cells_alias_lock); if (ret == 1) pr_notice("kAFS: Cell %s is an alias of %s\n", cell->name, cell->alias_of->name); return ret; } |
1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 | // SPDX-License-Identifier: GPL-2.0 /* USB Driver for Sierra Wireless Copyright (C) 2006, 2007, 2008 Kevin Lloyd <klloyd@sierrawireless.com>, Copyright (C) 2008, 2009 Elina Pasheva, Matthew Safar, Rory Filer <linux@sierrawireless.com> IMPORTANT DISCLAIMER: This driver is not commercially supported by Sierra Wireless. Use at your own risk. Portions based on the option driver by Matthias Urlichs <smurf@smurf.noris.de> Whom based his on the Keyspan driver by Hugh Blemings <hugh@blemings.org> */ /* Uncomment to log function calls */ /* #define DEBUG */ #define DRIVER_AUTHOR "Kevin Lloyd, Elina Pasheva, Matthew Safar, Rory Filer" #define DRIVER_DESC "USB Driver for Sierra Wireless USB modems" #include <linux/kernel.h> #include <linux/jiffies.h> #include <linux/errno.h> #include <linux/tty.h> #include <linux/slab.h> #include <linux/tty_flip.h> #include <linux/module.h> #include <linux/usb.h> #include <linux/usb/serial.h> #define SWIMS_USB_REQUEST_SetPower 0x00 #define SWIMS_USB_REQUEST_SetNmea 0x07 #define N_IN_URB_HM 8 #define N_OUT_URB_HM 64 #define N_IN_URB 4 #define N_OUT_URB 4 #define IN_BUFLEN 4096 #define MAX_TRANSFER (PAGE_SIZE - 512) /* MAX_TRANSFER is chosen so that the VM is not stressed by allocations > PAGE_SIZE and the number of packets in a page is an integer 512 is the largest possible packet on EHCI */ static bool nmea; struct sierra_iface_list { const u8 *nums; /* array of interface numbers */ size_t count; /* number of elements in array */ }; struct sierra_intf_private { spinlock_t susp_lock; unsigned int suspended:1; int in_flight; unsigned int open_ports; }; static int sierra_set_power_state(struct usb_device *udev, __u16 swiState) { return usb_control_msg(udev, usb_sndctrlpipe(udev, 0), SWIMS_USB_REQUEST_SetPower, /* __u8 request */ USB_TYPE_VENDOR, /* __u8 request type */ swiState, /* __u16 value */ 0, /* __u16 index */ NULL, /* void *data */ 0, /* __u16 size */ USB_CTRL_SET_TIMEOUT); /* int timeout */ } static int sierra_vsc_set_nmea(struct usb_device *udev, __u16 enable) { return usb_control_msg(udev, usb_sndctrlpipe(udev, 0), SWIMS_USB_REQUEST_SetNmea, /* __u8 request */ USB_TYPE_VENDOR, /* __u8 request type */ enable, /* __u16 value */ 0x0000, /* __u16 index */ NULL, /* void *data */ 0, /* __u16 size */ USB_CTRL_SET_TIMEOUT); /* int timeout */ } static int sierra_calc_num_ports(struct usb_serial *serial, struct usb_serial_endpoints *epds) { int num_ports = 0; u8 ifnum, numendpoints; ifnum = serial->interface->cur_altsetting->desc.bInterfaceNumber; numendpoints = serial->interface->cur_altsetting->desc.bNumEndpoints; /* Dummy interface present on some SKUs should be ignored */ if (ifnum == 0x99) num_ports = 0; else if (numendpoints <= 3) num_ports = 1; else num_ports = (numendpoints-1)/2; return num_ports; } static bool is_listed(const u8 ifnum, const struct sierra_iface_list *list) { int i; if (!list) return false; for (i = 0; i < list->count; i++) { if (list->nums[i] == ifnum) return true; } return false; } static u8 sierra_interface_num(struct usb_serial *serial) { return serial->interface->cur_altsetting->desc.bInterfaceNumber; } static int sierra_probe(struct usb_serial *serial, const struct usb_device_id *id) { const struct sierra_iface_list *ignore_list; int result = 0; struct usb_device *udev; u8 ifnum; udev = serial->dev; ifnum = sierra_interface_num(serial); /* * If this interface supports more than 1 alternate * select the 2nd one */ if (serial->interface->num_altsetting == 2) { dev_dbg(&udev->dev, "Selecting alt setting for interface %d\n", ifnum); /* We know the alternate setting is 1 for the MC8785 */ usb_set_interface(udev, ifnum, 1); } ignore_list = (const struct sierra_iface_list *)id->driver_info; if (is_listed(ifnum, ignore_list)) { dev_dbg(&serial->dev->dev, "Ignoring interface #%d\n", ifnum); return -ENODEV; } return result; } /* interfaces with higher memory requirements */ static const u8 hi_memory_typeA_ifaces[] = { 0, 2 }; static const struct sierra_iface_list typeA_interface_list = { .nums = hi_memory_typeA_ifaces, .count = ARRAY_SIZE(hi_memory_typeA_ifaces), }; static const u8 hi_memory_typeB_ifaces[] = { 3, 4, 5, 6 }; static const struct sierra_iface_list typeB_interface_list = { .nums = hi_memory_typeB_ifaces, .count = ARRAY_SIZE(hi_memory_typeB_ifaces), }; /* 'ignorelist' of interfaces not served by this driver */ static const u8 direct_ip_non_serial_ifaces[] = { 7, 8, 9, 10, 11, 19, 20 }; static const struct sierra_iface_list direct_ip_interface_ignore = { .nums = direct_ip_non_serial_ifaces, .count = ARRAY_SIZE(direct_ip_non_serial_ifaces), }; static const struct usb_device_id id_table[] = { { USB_DEVICE(0x0F3D, 0x0112) }, /* Airprime/Sierra PC 5220 */ { USB_DEVICE(0x03F0, 0x1B1D) }, /* HP ev2200 a.k.a MC5720 */ { USB_DEVICE(0x03F0, 0x211D) }, /* HP ev2210 a.k.a MC5725 */ { USB_DEVICE(0x03F0, 0x1E1D) }, /* HP hs2300 a.k.a MC8775 */ { USB_DEVICE(0x1199, 0x0017) }, /* Sierra Wireless EM5625 */ { USB_DEVICE(0x1199, 0x0018) }, /* Sierra Wireless MC5720 */ { USB_DEVICE(0x1199, 0x0218) }, /* Sierra Wireless MC5720 */ { USB_DEVICE(0x1199, 0x0020) }, /* Sierra Wireless MC5725 */ { USB_DEVICE(0x1199, 0x0220) }, /* Sierra Wireless MC5725 */ { USB_DEVICE(0x1199, 0x0022) }, /* Sierra Wireless EM5725 */ { USB_DEVICE(0x1199, 0x0024) }, /* Sierra Wireless MC5727 */ { USB_DEVICE(0x1199, 0x0224) }, /* Sierra Wireless MC5727 */ { USB_DEVICE(0x1199, 0x0019) }, /* Sierra Wireless AirCard 595 */ { USB_DEVICE(0x1199, 0x0021) }, /* Sierra Wireless AirCard 597E */ { USB_DEVICE(0x1199, 0x0112) }, /* Sierra Wireless AirCard 580 */ { USB_DEVICE(0x1199, 0x0120) }, /* Sierra Wireless USB Dongle 595U */ { USB_DEVICE(0x1199, 0x0301) }, /* Sierra Wireless USB Dongle 250U */ /* Sierra Wireless C597 */ { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x0023, 0xFF, 0xFF, 0xFF) }, /* Sierra Wireless T598 */ { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x0025, 0xFF, 0xFF, 0xFF) }, { USB_DEVICE(0x1199, 0x0026) }, /* Sierra Wireless T11 */ { USB_DEVICE(0x1199, 0x0027) }, /* Sierra Wireless AC402 */ { USB_DEVICE(0x1199, 0x0028) }, /* Sierra Wireless MC5728 */ { USB_DEVICE(0x1199, 0x0029) }, /* Sierra Wireless Device */ { USB_DEVICE(0x1199, 0x6802) }, /* Sierra Wireless MC8755 */ { USB_DEVICE(0x1199, 0x6803) }, /* Sierra Wireless MC8765 */ { USB_DEVICE(0x1199, 0x6804) }, /* Sierra Wireless MC8755 */ { USB_DEVICE(0x1199, 0x6805) }, /* Sierra Wireless MC8765 */ { USB_DEVICE(0x1199, 0x6808) }, /* Sierra Wireless MC8755 */ { USB_DEVICE(0x1199, 0x6809) }, /* Sierra Wireless MC8765 */ { USB_DEVICE(0x1199, 0x6812) }, /* Sierra Wireless MC8775 & AC 875U */ { USB_DEVICE(0x1199, 0x6813) }, /* Sierra Wireless MC8775 */ { USB_DEVICE(0x1199, 0x6815) }, /* Sierra Wireless MC8775 */ { USB_DEVICE(0x1199, 0x6816) }, /* Sierra Wireless MC8775 */ { USB_DEVICE(0x1199, 0x6820) }, /* Sierra Wireless AirCard 875 */ { USB_DEVICE(0x1199, 0x6821) }, /* Sierra Wireless AirCard 875U */ { USB_DEVICE(0x1199, 0x6822) }, /* Sierra Wireless AirCard 875E */ { USB_DEVICE(0x1199, 0x6832) }, /* Sierra Wireless MC8780 */ { USB_DEVICE(0x1199, 0x6833) }, /* Sierra Wireless MC8781 */ { USB_DEVICE(0x1199, 0x6834) }, /* Sierra Wireless MC8780 */ { USB_DEVICE(0x1199, 0x6835) }, /* Sierra Wireless MC8781 */ { USB_DEVICE(0x1199, 0x6838) }, /* Sierra Wireless MC8780 */ { USB_DEVICE(0x1199, 0x6839) }, /* Sierra Wireless MC8781 */ { USB_DEVICE(0x1199, 0x683A) }, /* Sierra Wireless MC8785 */ { USB_DEVICE(0x1199, 0x683B) }, /* Sierra Wireless MC8785 Composite */ /* Sierra Wireless MC8790, MC8791, MC8792 Composite */ { USB_DEVICE(0x1199, 0x683C) }, { USB_DEVICE(0x1199, 0x683D) }, /* Sierra Wireless MC8791 Composite */ /* Sierra Wireless MC8790, MC8791, MC8792 */ { USB_DEVICE(0x1199, 0x683E) }, { USB_DEVICE(0x1199, 0x6850) }, /* Sierra Wireless AirCard 880 */ { USB_DEVICE(0x1199, 0x6851) }, /* Sierra Wireless AirCard 881 */ { USB_DEVICE(0x1199, 0x6852) }, /* Sierra Wireless AirCard 880 E */ { USB_DEVICE(0x1199, 0x6853) }, /* Sierra Wireless AirCard 881 E */ { USB_DEVICE(0x1199, 0x6855) }, /* Sierra Wireless AirCard 880 U */ { USB_DEVICE(0x1199, 0x6856) }, /* Sierra Wireless AirCard 881 U */ { USB_DEVICE(0x1199, 0x6859) }, /* Sierra Wireless AirCard 885 E */ { USB_DEVICE(0x1199, 0x685A) }, /* Sierra Wireless AirCard 885 E */ /* Sierra Wireless C885 */ { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x6880, 0xFF, 0xFF, 0xFF)}, /* Sierra Wireless C888, Air Card 501, USB 303, USB 304 */ { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x6890, 0xFF, 0xFF, 0xFF)}, /* Sierra Wireless C22/C33 */ { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x6891, 0xFF, 0xFF, 0xFF)}, /* Sierra Wireless HSPA Non-Composite Device */ { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x6892, 0xFF, 0xFF, 0xFF)}, { USB_DEVICE(0x1199, 0x6893) }, /* Sierra Wireless Device */ /* Sierra Wireless Direct IP modems */ { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x68A3, 0xFF, 0xFF, 0xFF), .driver_info = (kernel_ulong_t)&direct_ip_interface_ignore }, { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x68AA, 0xFF, 0xFF, 0xFF), .driver_info = (kernel_ulong_t)&direct_ip_interface_ignore }, { USB_DEVICE(0x1199, 0x68AB) }, /* Sierra Wireless AR8550 */ /* AT&T Direct IP LTE modems */ { USB_DEVICE_AND_INTERFACE_INFO(0x0F3D, 0x68AA, 0xFF, 0xFF, 0xFF), .driver_info = (kernel_ulong_t)&direct_ip_interface_ignore }, /* Airprime/Sierra Wireless Direct IP modems */ { USB_DEVICE_AND_INTERFACE_INFO(0x0F3D, 0x68A3, 0xFF, 0xFF, 0xFF), .driver_info = (kernel_ulong_t)&direct_ip_interface_ignore }, { } }; MODULE_DEVICE_TABLE(usb, id_table); struct sierra_port_private { spinlock_t lock; /* lock the structure */ int outstanding_urbs; /* number of out urbs in flight */ struct usb_anchor active; struct usb_anchor delayed; int num_out_urbs; int num_in_urbs; /* Input endpoints and buffers for this port */ struct urb *in_urbs[N_IN_URB_HM]; /* Settings for the port */ int rts_state; /* Handshaking pins (outputs) */ int dtr_state; int cts_state; /* Handshaking pins (inputs) */ int dsr_state; int dcd_state; int ri_state; }; static int sierra_send_setup(struct usb_serial_port *port) { struct usb_serial *serial = port->serial; struct sierra_port_private *portdata; __u16 interface = 0; int val = 0; int do_send = 0; int retval; portdata = usb_get_serial_port_data(port); if (portdata->dtr_state) val |= 0x01; if (portdata->rts_state) val |= 0x02; /* If composite device then properly report interface */ if (serial->num_ports == 1) { interface = sierra_interface_num(serial); /* Control message is sent only to interfaces with * interrupt_in endpoints */ if (port->interrupt_in_urb) { /* send control message */ do_send = 1; } } /* Otherwise the need to do non-composite mapping */ else { if (port->bulk_out_endpointAddress == 2) interface = 0; else if (port->bulk_out_endpointAddress == 4) interface = 1; else if (port->bulk_out_endpointAddress == 5) interface = 2; do_send = 1; } if (!do_send) return 0; retval = usb_autopm_get_interface(serial->interface); if (retval < 0) return retval; retval = usb_control_msg(serial->dev, usb_sndctrlpipe(serial->dev, 0), 0x22, 0x21, val, interface, NULL, 0, USB_CTRL_SET_TIMEOUT); usb_autopm_put_interface(serial->interface); return retval; } static int sierra_tiocmget(struct tty_struct *tty) { struct usb_serial_port *port = tty->driver_data; unsigned int value; struct sierra_port_private *portdata; portdata = usb_get_serial_port_data(port); value = ((portdata->rts_state) ? TIOCM_RTS : 0) | ((portdata->dtr_state) ? TIOCM_DTR : 0) | ((portdata->cts_state) ? TIOCM_CTS : 0) | ((portdata->dsr_state) ? TIOCM_DSR : 0) | ((portdata->dcd_state) ? TIOCM_CAR : 0) | ((portdata->ri_state) ? TIOCM_RNG : 0); return value; } static int sierra_tiocmset(struct tty_struct *tty, unsigned int set, unsigned int clear) { struct usb_serial_port *port = tty->driver_data; struct sierra_port_private *portdata; portdata = usb_get_serial_port_data(port); if (set & TIOCM_RTS) portdata->rts_state = 1; if (set & TIOCM_DTR) portdata->dtr_state = 1; if (clear & TIOCM_RTS) portdata->rts_state = 0; if (clear & TIOCM_DTR) portdata->dtr_state = 0; return sierra_send_setup(port); } static void sierra_release_urb(struct urb *urb) { if (urb) { kfree(urb->transfer_buffer); usb_free_urb(urb); } } static void sierra_outdat_callback(struct urb *urb) { struct usb_serial_port *port = urb->context; struct sierra_port_private *portdata = usb_get_serial_port_data(port); struct sierra_intf_private *intfdata; int status = urb->status; unsigned long flags; intfdata = usb_get_serial_data(port->serial); /* free up the transfer buffer, as usb_free_urb() does not do this */ kfree(urb->transfer_buffer); usb_autopm_put_interface_async(port->serial->interface); if (status) dev_dbg(&port->dev, "%s - nonzero write bulk status " "received: %d\n", __func__, status); spin_lock_irqsave(&portdata->lock, flags); --portdata->outstanding_urbs; spin_unlock_irqrestore(&portdata->lock, flags); spin_lock_irqsave(&intfdata->susp_lock, flags); --intfdata->in_flight; spin_unlock_irqrestore(&intfdata->susp_lock, flags); usb_serial_port_softint(port); } /* Write */ static int sierra_write(struct tty_struct *tty, struct usb_serial_port *port, const unsigned char *buf, int count) { struct sierra_port_private *portdata; struct sierra_intf_private *intfdata; struct usb_serial *serial = port->serial; unsigned long flags; unsigned char *buffer; struct urb *urb; size_t writesize = min((size_t)count, (size_t)MAX_TRANSFER); int retval = 0; /* verify that we actually have some data to write */ if (count == 0) return 0; portdata = usb_get_serial_port_data(port); intfdata = usb_get_serial_data(serial); dev_dbg(&port->dev, "%s: write (%zd bytes)\n", __func__, writesize); spin_lock_irqsave(&portdata->lock, flags); dev_dbg(&port->dev, "%s - outstanding_urbs: %d\n", __func__, portdata->outstanding_urbs); if (portdata->outstanding_urbs > portdata->num_out_urbs) { spin_unlock_irqrestore(&portdata->lock, flags); dev_dbg(&port->dev, "%s - write limit hit\n", __func__); return 0; } portdata->outstanding_urbs++; dev_dbg(&port->dev, "%s - 1, outstanding_urbs: %d\n", __func__, portdata->outstanding_urbs); spin_unlock_irqrestore(&portdata->lock, flags); retval = usb_autopm_get_interface_async(serial->interface); if (retval < 0) { spin_lock_irqsave(&portdata->lock, flags); portdata->outstanding_urbs--; spin_unlock_irqrestore(&portdata->lock, flags); goto error_simple; } buffer = kmemdup(buf, writesize, GFP_ATOMIC); if (!buffer) { retval = -ENOMEM; goto error_no_buffer; } urb = usb_alloc_urb(0, GFP_ATOMIC); if (!urb) { retval = -ENOMEM; goto error_no_urb; } usb_serial_debug_data(&port->dev, __func__, writesize, buffer); usb_fill_bulk_urb(urb, serial->dev, usb_sndbulkpipe(serial->dev, port->bulk_out_endpointAddress), buffer, writesize, sierra_outdat_callback, port); /* Handle the need to send a zero length packet */ urb->transfer_flags |= URB_ZERO_PACKET; spin_lock_irqsave(&intfdata->susp_lock, flags); if (intfdata->suspended) { usb_anchor_urb(urb, &portdata->delayed); spin_unlock_irqrestore(&intfdata->susp_lock, flags); goto skip_power; } else { usb_anchor_urb(urb, &portdata->active); } /* send it down the pipe */ retval = usb_submit_urb(urb, GFP_ATOMIC); if (retval) { usb_unanchor_urb(urb); spin_unlock_irqrestore(&intfdata->susp_lock, flags); dev_err(&port->dev, "%s - usb_submit_urb(write bulk) failed " "with status = %d\n", __func__, retval); goto error; } else { intfdata->in_flight++; spin_unlock_irqrestore(&intfdata->susp_lock, flags); } skip_power: /* we are done with this urb, so let the host driver * really free it when it is finished with it */ usb_free_urb(urb); return writesize; error: usb_free_urb(urb); error_no_urb: kfree(buffer); error_no_buffer: spin_lock_irqsave(&portdata->lock, flags); --portdata->outstanding_urbs; dev_dbg(&port->dev, "%s - 2. outstanding_urbs: %d\n", __func__, portdata->outstanding_urbs); spin_unlock_irqrestore(&portdata->lock, flags); usb_autopm_put_interface_async(serial->interface); error_simple: return retval; } static void sierra_indat_callback(struct urb *urb) { int err; int endpoint; struct usb_serial_port *port; unsigned char *data = urb->transfer_buffer; int status = urb->status; endpoint = usb_pipeendpoint(urb->pipe); port = urb->context; if (status) { dev_dbg(&port->dev, "%s: nonzero status: %d on" " endpoint %02x\n", __func__, status, endpoint); } else { if (urb->actual_length) { tty_insert_flip_string(&port->port, data, urb->actual_length); tty_flip_buffer_push(&port->port); usb_serial_debug_data(&port->dev, __func__, urb->actual_length, data); } else { dev_dbg(&port->dev, "%s: empty read urb" " received\n", __func__); } } /* Resubmit urb so we continue receiving */ if (status != -ESHUTDOWN && status != -EPERM) { usb_mark_last_busy(port->serial->dev); err = usb_submit_urb(urb, GFP_ATOMIC); if (err && err != -EPERM) dev_err(&port->dev, "resubmit read urb failed." "(%d)\n", err); } } static void sierra_instat_callback(struct urb *urb) { int err; int status = urb->status; struct usb_serial_port *port = urb->context; struct sierra_port_private *portdata = usb_get_serial_port_data(port); struct usb_serial *serial = port->serial; dev_dbg(&port->dev, "%s: urb %p port %p has data %p\n", __func__, urb, port, portdata); if (status == 0) { struct usb_ctrlrequest *req_pkt = urb->transfer_buffer; if (!req_pkt) { dev_dbg(&port->dev, "%s: NULL req_pkt\n", __func__); return; } if ((req_pkt->bRequestType == 0xA1) && (req_pkt->bRequest == 0x20)) { int old_dcd_state; unsigned char signals = *((unsigned char *) urb->transfer_buffer + sizeof(struct usb_ctrlrequest)); dev_dbg(&port->dev, "%s: signal x%x\n", __func__, signals); old_dcd_state = portdata->dcd_state; portdata->cts_state = 1; portdata->dcd_state = ((signals & 0x01) ? 1 : 0); portdata->dsr_state = ((signals & 0x02) ? 1 : 0); portdata->ri_state = ((signals & 0x08) ? 1 : 0); if (old_dcd_state && !portdata->dcd_state) tty_port_tty_hangup(&port->port, true); } else { dev_dbg(&port->dev, "%s: type %x req %x\n", __func__, req_pkt->bRequestType, req_pkt->bRequest); } } else dev_dbg(&port->dev, "%s: error %d\n", __func__, status); /* Resubmit urb so we continue receiving IRQ data */ if (status != -ESHUTDOWN && status != -ENOENT) { usb_mark_last_busy(serial->dev); err = usb_submit_urb(urb, GFP_ATOMIC); if (err && err != -EPERM) dev_err(&port->dev, "%s: resubmit intr urb " "failed. (%d)\n", __func__, err); } } static unsigned int sierra_write_room(struct tty_struct *tty) { struct usb_serial_port *port = tty->driver_data; struct sierra_port_private *portdata = usb_get_serial_port_data(port); unsigned long flags; /* try to give a good number back based on if we have any free urbs at * this point in time */ spin_lock_irqsave(&portdata->lock, flags); if (portdata->outstanding_urbs > (portdata->num_out_urbs * 2) / 3) { spin_unlock_irqrestore(&portdata->lock, flags); dev_dbg(&port->dev, "%s - write limit hit\n", __func__); return 0; } spin_unlock_irqrestore(&portdata->lock, flags); return 2048; } static unsigned int sierra_chars_in_buffer(struct tty_struct *tty) { struct usb_serial_port *port = tty->driver_data; struct sierra_port_private *portdata = usb_get_serial_port_data(port); unsigned long flags; unsigned int chars; /* NOTE: This overcounts somewhat. */ spin_lock_irqsave(&portdata->lock, flags); chars = portdata->outstanding_urbs * MAX_TRANSFER; spin_unlock_irqrestore(&portdata->lock, flags); dev_dbg(&port->dev, "%s - %u\n", __func__, chars); return chars; } static void sierra_stop_rx_urbs(struct usb_serial_port *port) { int i; struct sierra_port_private *portdata = usb_get_serial_port_data(port); for (i = 0; i < portdata->num_in_urbs; i++) usb_kill_urb(portdata->in_urbs[i]); usb_kill_urb(port->interrupt_in_urb); } static int sierra_submit_rx_urbs(struct usb_serial_port *port, gfp_t mem_flags) { int ok_cnt; int err = -EINVAL; int i; struct urb *urb; struct sierra_port_private *portdata = usb_get_serial_port_data(port); ok_cnt = 0; for (i = 0; i < portdata->num_in_urbs; i++) { urb = portdata->in_urbs[i]; if (!urb) continue; err = usb_submit_urb(urb, mem_flags); if (err) { dev_err(&port->dev, "%s: submit urb failed: %d\n", __func__, err); } else { ok_cnt++; } } if (ok_cnt && port->interrupt_in_urb) { err = usb_submit_urb(port->interrupt_in_urb, mem_flags); if (err) { dev_err(&port->dev, "%s: submit intr urb failed: %d\n", __func__, err); } } if (ok_cnt > 0) /* at least one rx urb submitted */ return 0; else return err; } static struct urb *sierra_setup_urb(struct usb_serial *serial, int endpoint, int dir, void *ctx, int len, gfp_t mem_flags, usb_complete_t callback) { struct urb *urb; u8 *buf; urb = usb_alloc_urb(0, mem_flags); if (!urb) return NULL; buf = kmallo |